The aim of these exercises is to manipulate ByteBuffer. As we have not seen any networking primitives, we will use files to read/write bytes.
In these exercises, you can write all your code in the main method. We will do better in the future but it is not the goal of these exercises.
We want to write a program StoreWithByteOrder that reads long integers from the keyboard and writes them into a file. This program will take two arguments on the command line: the byte order (LE for little-endian and BE for big-endian), followed by the name of the file to write.
For example, calling java StoreWithByteOrder LE foo.bin will write the longs in the file foo.bin in little-endian.
Starting from the template StoreWithByteOrder.java, write the program StoreWithByteOrder.
If you run the following command
% java StoreWithByteOrder BE long-be.bin
1
^D
you should obtain a file long-be.bin containing 7 bytes of value 0 followed by 1 byte of value 1. To see the value of the bytes contained in a file, we provide a tool File2Hex.jar that gives the value of each byte of the file in hexadecimal. Using this tool, you should obtain:
% java -jar File2Hex.jar long-be.bin
00 00 00 00 00 00 00 01
If you run the following command
% java StoreWithByteOrder LE long-le.bin
1
^D
you should obtain a file long-le.bin containing 1 byte of value 1 followed by 7 bytes of value 0. Using File2Hex.jar, you should obtain:
% java -jar File2Hex.jar long-le.bin
01 00 00 00 00 00 00 00
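The two expected dumps above can be reproduced in memory, without any file. The following is only a minimal sketch: the class ByteOrderDemo and its helpers encode and toHex are hypothetical names introduced for illustration and are not part of the provided template. It relies on ByteBuffer.order to select the byte order before calling putLong.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of the exercise template.
class ByteOrderDemo {
    // Encodes one long into 8 bytes using the given byte order.
    static byte[] encode(long value, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
        buffer.order(order); // a freshly allocated ByteBuffer is big-endian by default
        buffer.putLong(value);
        return buffer.array();
    }

    // Formats bytes as space-separated two-digit hexadecimal values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints 00 00 00 00 00 00 00 01
        System.out.println(toHex(encode(1L, ByteOrder.BIG_ENDIAN)));
        // prints 01 00 00 00 00 00 00 00
        System.out.println(toHex(encode(1L, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

The byte order only affects multi-byte operations such as putLong; single bytes written with put are stored unchanged.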
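We want to write a program ReadFileWithEncoding that takes two parameters: the name of a charset and the name of a file. It displays the content of the file decoded with this charset.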
The Java API offers the method String Files.readString(Path path, Charset cs) that does exactly that.
However, in this exercise, we ask you to access the file via a FileChannel, which only gives the ability to read/write the raw bytes from/to the file, and to use the method charset.decode to construct the string.
In the template ReadFileWithEncoding.java, write the method stringFromFile using a FileChannel.
You can obtain the size in bytes of a file using the method fileChannel.size().
Warning: Even if the buffer buff has the same size as the file, the method fileChannel.read(buff) does not guarantee to fill buff in one call. You have to call it until the buffer is full (i.e., !buff.hasRemaining()).
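The read loop described in the warning can be sketched as follows. This is one possible shape under stated assumptions, not the expected solution: the class ReadFullyDemo and the method readAll are hypothetical names, the file is assumed to be small enough for its size to fit in an int, and the main method writes its own temporary file so the example does not depend on test.txt.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the exercise template.
class ReadFullyDemo {
    static String readAll(Path path, Charset cs) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // Assumption: the file size fits in an int (toIntExact throws otherwise).
            ByteBuffer buff = ByteBuffer.allocate(Math.toIntExact(channel.size()));
            // One call to read may fill only part of the buffer, so we loop.
            while (buff.hasRemaining()) {
                if (channel.read(buff) == -1) {
                    break; // end of file reached earlier than expected
                }
            }
            buff.flip(); // switch the buffer from writing mode to reading mode
            return cs.decode(buff).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("read-fully-demo", ".txt");
        try {
            Files.write(tmp, new byte[] {0x61, (byte) 0xE2, (byte) 0x82, (byte) 0xAC, 0x0A});
            System.out.print(readAll(tmp, StandardCharsets.UTF_8));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The call to flip matters: after the reads, the position is at the end of the data, and decode only consumes the bytes between the position and the limit.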
You can test your program with the file test.txt. You need to download the file using "Save as" and not copy-pasting it. If your code is correct, you should obtain:
% java fr.uge.net.buffers.ReadFileWithEncoding utf8 test.txt
a€
and
% java fr.uge.net.buffers.ReadFileWithEncoding iso-8859-1 test.txt
aâ ¬
There is no magic behind this. The file test.txt contains the 5 bytes 61 E2 82 AC 0A. The file itself has no preferred encoding.
If we decode it using the UTF-8 charset, these 5 bytes are interpreted as follows:
61 -> a
E2 82 AC -> €
0A -> line return
The characters in UTF-8 are represented by a variable number of bytes. You can learn here how this feature is achieved.
In the iso-8859-1 charset, each character is coded by one byte and the 5 bytes are interpreted as follows:
61 -> a
E2 -> â
82 -> control character that cannot be printed
AC -> ¬
0A -> line return
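The two interpretations above can be checked directly on the 5 bytes, without going through a file. In this small sketch, the class DecodeDemo and the helper decode are hypothetical names used only for illustration; the decoding itself uses Charset.decode on a ByteBuffer wrapping the bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the exercise template.
class DecodeDemo {
    // Decodes all the given bytes with the given charset.
    static String decode(byte[] bytes, Charset cs) {
        return cs.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        // The 5 bytes of test.txt as listed in the text.
        byte[] bytes = {0x61, (byte) 0xE2, (byte) 0x82, (byte) 0xAC, 0x0A};

        String utf8 = decode(bytes, StandardCharsets.UTF_8);
        String latin1 = decode(bytes, StandardCharsets.ISO_8859_1);

        System.out.println(utf8.length());   // prints 3: 'a', the euro sign, the line return
        System.out.println(latin1.length()); // prints 5: one character per byte
    }
}
```

The same bytes give strings of different lengths because UTF-8 groups E2 82 AC into a single character while iso-8859-1 maps each byte to its own character.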
In this exercise, we want to write a program ReadStandardInputWithEncoding which reads bytes from the standard input and decodes them in a given charset. This program takes the name of the charset as its command-line argument.
The program is meant to be used as follows:
$ cat test.txt | java ReadStandardInputWithEncoding utf8
Usually you access the standard input using a Scanner, but in this exercise, we ask you to access it as a stream of bytes. We can obtain a ReadableByteChannel corresponding to the standard input with:
ReadableByteChannel in = Channels.newChannel(System.in);
A ReadableByteChannel behaves like a FileChannel when reading, except that we do not know in advance the total number of bytes. You know that you have read all of the bytes when the method readableByteChannel.read returns -1. You will need to read into a fixed-size buffer and extend this buffer when it is full.
In the template ReadStandardInputWithEncoding.java, write the method stringFromStandardInput which reads all the bytes from the standard input and returns the corresponding string.
To increase the size of the buffer, we will create a buffer twice as large and copy the data from the old buffer into the new one.
The capacity of a ByteBuffer can be obtained using the method byteBuffer.capacity().
To copy the work-zone of a ByteBuffer src at the beginning of the work-zone of a ByteBuffer dst, we will use dst.put(src).
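The doubling strategy described above can be sketched as follows. This is a sketch under assumptions rather than the expected solution: the class GrowingBufferDemo, the method readAll and the constant INITIAL_SIZE are hypothetical names, the initial size is deliberately tiny so that the buffer has to grow, and the main method reads from an in-memory stream instead of System.in so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the exercise template.
class GrowingBufferDemo {
    static final int INITIAL_SIZE = 4; // deliberately tiny to force several resizes

    static String readAll(ReadableByteChannel in, Charset cs) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(INITIAL_SIZE);
        while (in.read(buffer) != -1) {
            if (!buffer.hasRemaining()) {
                // Buffer is full: allocate one twice as large and copy the data.
                ByteBuffer bigger = ByteBuffer.allocate(buffer.capacity() * 2);
                buffer.flip();      // make the bytes read so far the work-zone of the old buffer
                bigger.put(buffer); // copy them at the beginning of the new buffer
                buffer = bigger;
            }
        }
        buffer.flip(); // prepare the accumulated bytes for decoding
        return cs.decode(buffer).toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, byte buffers".getBytes(StandardCharsets.UTF_8);
        ReadableByteChannel in = Channels.newChannel(new ByteArrayInputStream(data));
        System.out.println(readAll(in, StandardCharsets.UTF_8)); // prints hello, byte buffers
    }
}
```

Decoding only once, after all the bytes have been read, avoids cutting a multi-byte UTF-8 character in two at a buffer boundary. Growing the buffer as soon as it is full also guarantees that read is never called on a buffer with no remaining space, in which case it would return 0 forever.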
To test your code, you can use:
% cat test.txt | java fr.uge.net.buffers.ReadStandardInputWithEncoding utf8
To test that you correctly increase the size of your buffer, you can use the file test2.txt.
% cat test2.txt | java fr.uge.net.buffers.ReadStandardInputWithEncoding utf8
% cat test2.txt | wc
10000 40000 2208890
% cat test2.txt | java fr.uge.net.buffers.ReadStandardInputWithEncoding utf8 | wc
10000 40000 2208890