The aim of these exercises is to manipulate ByteBuffer. As we have not seen any networking primitives, we will use files to read/write bytes.
In these exercises, you can write all your code in the main method. We will do better in the future but it is not the goal of these exercises.
We want to write a program StoreWithByteOrder that reads long integers from the keyboard and writes them into a file. This program will take two arguments on the command line:
the byte order, LE for little-endian and BE for big-endian;
the name of the file to write.
For example, calling java StoreWithByteOrder LE foo.bin will write the longs in the file foo.bin in little-endian.
Starting from the template StoreWithByteOrder.java, write the program StoreWithByteOrder.
If you run the following command
% java StoreWithByteOrder BE long-be.bin
1
^D
you should obtain a file long-be.bin containing 7 bytes of value 0 followed by 1 byte of value 1. To see the value of the bytes contained in a file, we provide a tool File2Hex.jar that gives the value of each byte of the file in hexadecimal. Using this tool, you should obtain:
% java -jar File2Hex.jar long-be.bin
00 00 00 00 00 00 00 01
If you run the following command
% java StoreWithByteOrder LE long-le.bin
1
^D
you should obtain a file long-le.bin containing 1 byte of value 1 followed by 7 bytes of value 0. Using File2Hex.jar, you should obtain:
% java -jar File2Hex.jar long-le.bin
01 00 00 00 00 00 00 00
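The effect of the byte order can be observed in memory, without any file. The following is a minimal sketch, not the expected solution: the class name ByteOrderSketch and the helpers encode and toHex are hypothetical names introduced for illustration. It only relies on ByteBuffer.order and ByteBuffer.putLong.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderSketch {
    // Encodes one long into 8 bytes using the requested byte order.
    static byte[] encode(long value, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
        buffer.order(order); // a new ByteBuffer is BIG_ENDIAN by default
        buffer.putLong(value);
        return buffer.array();
    }

    // Formats bytes as two-digit hexadecimal values separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints 00 00 00 00 00 00 00 01
        System.out.println(toHex(encode(1L, ByteOrder.BIG_ENDIAN)));
        // prints 01 00 00 00 00 00 00 00
        System.out.println(toHex(encode(1L, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

In the actual program, the same buffer would be flipped and written to the file through a channel instead of being printed.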
We want to write a program ReadFileWithEncoding that takes two parameters:
the name of a charset,
the path of a file,
and prints the content of the file decoded with this charset.
The Java API offers the method String Files.readString(Path path, Charset cs) that does exactly that.
However in this exercise, we ask you to access the file via a FileChannel which only gives the ability to read/write the raw bytes from/to the file and use the method charset.decode to construct the string.
In the template ReadFileWithEncoding.java, write the method stringFromFile using a FileChannel.
You can obtain the size in bytes of a file using the method fileChannel.size().
Warning: Even if the buffer buff has the same size as the file, the method fileChannel.read(buff) does not guarantee to fill buff in one call. You have to call it until the buffer is full (i.e., !buff.hasRemaining()).
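The read loop described in the warning could look like the sketch below. This is one possible shape under stated assumptions, not the reference solution: the class ReadFullySketch and the methods readAll and demo are hypothetical, and the temporary file only exists to make the example self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadFullySketch {
    static String readAll(Path path, Charset cs) throws IOException {
        try (FileChannel fc = FileChannel.open(path, StandardOpenOption.READ)) {
            // Assumes the file is small enough to fit in an int-sized buffer.
            ByteBuffer buff = ByteBuffer.allocate(Math.toIntExact(fc.size()));
            // One call to read may fill only part of the buffer, so loop.
            while (buff.hasRemaining()) {
                if (fc.read(buff) == -1) {
                    break; // end of file reached early (the file shrank)
                }
            }
            buff.flip(); // switch to read mode before decoding
            return cs.decode(buff).toString();
        }
    }

    // Writes the 5 bytes of the test file to a temporary file and reads them back.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("read-fully-sketch", ".txt");
            try {
                Files.write(tmp, new byte[] {0x61, (byte) 0xE2, (byte) 0x82, (byte) 0xAC, 0x0A});
                return readAll(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Forgetting the call to flip() before decode is a common mistake: the work-zone would then be empty and the resulting string would be empty too.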
You can test your program with the file test.txt. You need to download the file using "Save as" rather than copy-pasting its content. If your code is correct, you should obtain:
% java fr.uge.net.buffers.ReadFileWithEncoding utf8 test.txt
a€
and
% java fr.uge.net.buffers.ReadFileWithEncoding iso-8859-1 test.txt
aâ ¬
There is no magic behind this. The file test.txt contains the 5 bytes 61 E2 82 AC 0A. The file itself has no preferred encoding.
If we decode it using the UTF-8 charset, these 5 bytes are interpreted as follows:
61 -> a
E2 82 AC -> €
0A -> line return
The characters in UTF-8 are represented by a variable number of bytes. You can learn here how this feature is achieved.
In the iso-8859-1 charset, each character is coded by one byte and the 5 bytes are interpreted as follows:
61 -> a
E2 -> â
82 -> control character that cannot be printed
AC -> ¬
0A -> line return
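The two interpretations above can be reproduced directly on the 5 bytes, without going through a file. This is a small illustrative sketch; the class name DecodeSketch and its members are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeSketch {
    // The 5 bytes contained in test.txt.
    static final byte[] BYTES = {0x61, (byte) 0xE2, (byte) 0x82, (byte) 0xAC, 0x0A};

    // Decodes the same bytes with the given charset.
    static String decode(Charset cs) {
        return cs.decode(ByteBuffer.wrap(BYTES)).toString();
    }

    public static void main(String[] args) {
        String utf8 = decode(StandardCharsets.UTF_8);
        String latin1 = decode(StandardCharsets.ISO_8859_1);
        System.out.println(utf8.length());   // prints 3: 'a', the euro sign, line return
        System.out.println(latin1.length()); // prints 5: one character per byte
    }
}
```

The same bytes yield 3 characters in UTF-8 and 5 in iso-8859-1, which is why the charset has to be supplied by the caller.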
In this exercise, we want to write a program ReadStandardInputWithEncoding which reads bytes from the standard input and decodes them in a given charset. This program takes the name of the charset as input.
The program is meant to be used as follows:
$ cat test.txt | java ReadStandardInputWithEncoding utf8
Usually you access the standard input using a Scanner but in this exercise, we ask you to access it as a stream of bytes. We can obtain a ReadableByteChannel corresponding to the standard input with:
ReadableByteChannel in = Channels.newChannel(System.in);
A ReadableByteChannel behaves like a FileChannel when reading, except that we do not know in advance the total number of bytes. You know that you have read all of the bytes when the method readableByteChannel.read returns -1. You will need to read into a fixed-size buffer and extend this buffer when it is full.
In the template ReadStandardInputWithEncoding.java, write the method stringFromStandardInput which reads all the bytes from the standard input and returns the corresponding string.
To increase the size of the buffer, we will create a buffer twice as large and copy the data from the old buffer into the new one.
The capacity of a ByteBuffer can be obtained using the method byteBuffer.capacity().
To copy the work-zone of a ByteBuffer src at the beginning of the work-zone of a ByteBuffer dst, we will use dst.put(src).
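The doubling strategy can be sketched as follows. This is a hedged illustration rather than the expected solution: the class GrowingBufferSketch, the constant INITIAL_SIZE and the methods readAll and demo are hypothetical, and a ByteArrayInputStream stands in for System.in so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GrowingBufferSketch {
    static final int INITIAL_SIZE = 4; // deliberately tiny to force several resizes

    static String readAll(ReadableByteChannel in, Charset cs) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(INITIAL_SIZE);
        while (in.read(buffer) != -1) {
            if (!buffer.hasRemaining()) {
                // Buffer full: allocate one twice as large and copy the data.
                ByteBuffer bigger = ByteBuffer.allocate(buffer.capacity() * 2);
                buffer.flip();      // old buffer in read mode: work-zone = data read so far
                bigger.put(buffer); // copy that work-zone at the start of the new buffer
                buffer = bigger;    // the new buffer stays in write mode
            }
        }
        buffer.flip();
        // Decode only once at the end, so a multi-byte character is never split.
        return cs.decode(buffer).toString();
    }

    // Reads a fixed text through a channel, as a stand-in for the standard input.
    static String demo(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try (ReadableByteChannel in = Channels.newChannel(new ByteArrayInputStream(bytes))) {
            return readAll(in, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo("growing buffer \u20AC\n"));
    }
}
```

Growing the buffer right after the read that fills it ensures read is never called on a full buffer, in which case it would return 0 and the loop would not make progress.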
To test your code, you can use:
% cat test.txt | java fr.uge.net.buffers.ReadStandardInputWithEncoding utf8
To test that you correctly increase the size of your buffer, you can use the file test2.txt.
% cat test2.txt | java fr.uge.net.buffers.ReadStandardInputWithEncoding utf8
% cat test2.txt | wc
10000 40000 2208890
% cat test2.txt | java fr.uge.net.buffers.ReadStandardInputWithEncoding utf8 | wc
10000 40000 2208890