I am reading data from a file that has, unfortunately, two types of character encoding.
There is a header and a body. The header is always in ASCII and defines the char
I suggest rereading the stream from the start with a new InputStreamReader. Perhaps assume that InputStream.mark is supported.
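A minimal sketch of the mark/reset idea, under some assumptions that are not in the question: the header is a single ASCII line that holds only the charset name and ends with '\n', and the header plus whatever the first Reader reads ahead fits inside READ_LIMIT. The class name, the method names and the limit are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MarkResetExample {
    // Assumed bound: must cover the header plus the first Reader's read-ahead.
    static final int READ_LIMIT = 64 * 1024;

    /** Returns a Reader positioned at the first character of the body. */
    static Reader openBody(InputStream raw) throws IOException {
        // BufferedInputStream supports mark/reset even when the raw stream does not.
        InputStream in = raw.markSupported() ? raw : new BufferedInputStream(raw);
        in.mark(READ_LIMIT);

        // This Reader may consume more bytes than the header; it is discarded
        // afterwards and deliberately not closed (closing would close the stream).
        Reader headerReader = new InputStreamReader(in, StandardCharsets.US_ASCII);
        StringBuilder header = new StringBuilder();
        int c;
        while ((c = headerReader.read()) != -1 && c != '\n') {
            header.append((char) c);
        }
        if (c == -1) {
            throw new EOFException("stream ended before the header terminator");
        }

        // ASCII is one byte per character, so the header occupies
        // header.length() bytes plus one byte for '\n'.
        in.reset();
        skipFully(in, header.length() + 1L);
        return new InputStreamReader(in, Charset.forName(header.toString().trim()));
    }

    // InputStream.skip may skip fewer bytes than requested, so loop until done.
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped > 0) {
                n -= skipped;
            } else if (in.read() == -1) {
                throw new EOFException("stream ended inside the header");
            } else {
                n--;
            }
        }
    }
}
```

Skipping bytes after the reset, rather than skipping characters through the new Reader, keeps this working when the body charset is not ASCII-compatible (UTF-16, for example).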
My first thought is to close the stream and reopen it, using InputStream#skip to skip past the header before giving the stream to the new InputStreamReader.
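A rough sketch of the close-and-reopen approach. It assumes a hypothetical header format (one ASCII line containing only the charset name, terminated by '\n') and that the data lives in a regular file; the class and method names are made up for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReopenAndSkipExample {
    /** Returns a Reader positioned at the first character of the body. */
    static Reader openBody(Path file) throws IOException {
        // First pass: read the header as ASCII, then close this stream.
        // Any read-ahead done by the Reader is thrown away with it.
        StringBuilder header = new StringBuilder();
        try (Reader r = new InputStreamReader(Files.newInputStream(file),
                                              StandardCharsets.US_ASCII)) {
            int c;
            while ((c = r.read()) != -1 && c != '\n') {
                header.append((char) c);
            }
            if (c == -1) {
                throw new EOFException("file ended before the header terminator");
            }
        }

        // Second pass: reopen and skip the header bytes (one byte per ASCII
        // character, plus one for '\n') before decoding with the body charset.
        long toSkip = header.length() + 1L;
        InputStream body = Files.newInputStream(file);
        try {
            while (toSkip > 0) {
                long skipped = body.skip(toSkip); // may skip less than requested
                if (skipped > 0) {
                    toSkip -= skipped;
                } else if (body.read() == -1) {
                    throw new EOFException("file ended inside the header");
                } else {
                    toSkip--;
                }
            }
            return new InputStreamReader(body, Charset.forName(header.toString().trim()));
        } catch (IOException | RuntimeException e) {
            body.close(); // do not leak the second stream on failure
            throw e;
        }
    }
}
```

The caller owns the returned Reader and should close it, for example with try-with-resources.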
If you really, really don't want to reopen the file, you could use file descriptors to get more than one stream to the file, although you may have to use channels to have multiple positions within the file (since you can't assume you can reset the position with reset, it may not be supported).
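One way the channel idea could look, as a sketch only: keep a single FileChannel open and move its position explicitly instead of relying on reset. The header format (a single ASCII line with just the charset name, ending in '\n') and the class and method names are assumptions made for illustration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ChannelPositionExample {
    /** Returns a Reader over the body; the header starts at the channel's current position. */
    static Reader openBody(FileChannel channel) throws IOException {
        long start = channel.position();

        // Streams created by Channels.newInputStream read at the channel's
        // position. This Reader may read ahead; it is dropped without closing,
        // because closing it would close the channel as well.
        Reader headerReader = new InputStreamReader(
                Channels.newInputStream(channel), StandardCharsets.US_ASCII);
        StringBuilder header = new StringBuilder();
        int c;
        while ((c = headerReader.read()) != -1 && c != '\n') {
            header.append((char) c);
        }
        if (c == -1) {
            throw new EOFException("channel ended before the header terminator");
        }

        // Undo the read-ahead: put the channel right after the header
        // (one byte per ASCII character, plus one for '\n').
        channel.position(start + header.length() + 1L);
        return new InputStreamReader(
                Channels.newInputStream(channel),
                Charset.forName(header.toString().trim()));
    }
}
```

Closing the returned Reader also closes the channel, so the file is opened and closed exactly once.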
It's even easier:
As you said, your header is always in ASCII. So read the header directly from the InputStream, and when you're done with it, create the Reader with the correct encoding and read from it:
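import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

private Reader reader;
private InputStream stream;

public void read() throws IOException {
    String encoding = null;          // charset name, filled in while parsing the header
    boolean headerFullyRead = false; // set to true once the end of the header is seen
    int c;
    // The header is ASCII, so every byte read here is exactly one character.
    while ((c = stream.read()) != -1) {
        // Read encoding (update encoding and headerFullyRead here)
        if (headerFullyRead) {
            reader = new InputStreamReader(stream, encoding);
            break;
        }
    }
    if (reader == null) {
        return; // the stream ended before the header was complete
    }
    while ((c = reader.read()) != -1) {
        // Handle rest of file
    }
}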
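For a concrete, self-contained instance of the same idea, here is a sketch that assumes a hypothetical header format: one ASCII line holding only the charset name, terminated by '\n'. The class and method names are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class HeaderThenBody {
    /** Consumes the header from the stream and returns a Reader for the body. */
    static Reader openBody(InputStream stream) throws IOException {
        // Read raw bytes, never through a Reader, so nothing past the
        // header terminator is consumed.
        ByteArrayOutputStream header = new ByteArrayOutputStream();
        int b;
        while ((b = stream.read()) != -1 && b != '\n') {
            header.write(b);
        }
        if (b == -1) {
            throw new EOFException("stream ended before the header terminator");
        }

        // The header is ASCII; here it is assumed to contain only the charset name.
        String encoding = new String(header.toByteArray(), StandardCharsets.US_ASCII).trim();

        // The stream now sits on the first body byte, so it can be wrapped directly.
        return new InputStreamReader(stream, Charset.forName(encoding));
    }
}
```

Byte-at-a-time reads on an unbuffered file stream are slow. Wrapping the file stream in a BufferedInputStream first is fine, as long as that same buffered stream is the one passed on to the Reader.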
Here is the pseudo code.
1. Use InputStream, but do not wrap a Reader around it.
2. Read the header bytes and store them in a ByteArrayOutputStream.
3. Create a ByteArrayInputStream from the ByteArrayOutputStream and decode the header, this time wrapping the ByteArrayInputStream in a Reader with the ASCII charset.
4. Read the body bytes and store them in a second ByteArrayOutputStream.
5. Create another ByteArrayInputStream from the second ByteArrayOutputStream and wrap it in a Reader with the charset from the header.

If you wrap the InputStream and limit all reads to just 1 byte at a time, it seems to disable the buffering inside of InputStreamReader.
This way we don't have to rewrite the InputStreamReader logic.
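import java.io.IOException;
import java.io.InputStream;

public class OneByteReadInputStream extends InputStream
{
    private final InputStream inputStream;

    public OneByteReadInputStream(InputStream inputStream)
    {
        this.inputStream = inputStream;
    }

    @Override
    public int read() throws IOException
    {
        return inputStream.read();
    }

    // Hand back at most one byte per call, which appears to keep a wrapping
    // InputStreamReader from reading ahead. A zero-length request reads nothing.
    @Override
    public int read(byte[] b, int off, int len) throws IOException
    {
        return super.read(b, off, Math.min(len, 1));
    }
}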
To construct:
new InputStreamReader(new OneByteReadInputStream(inputStream));
Why don't you use 2 InputStreams? One for reading the header and another for the body.
The second InputStream should skip the header bytes.