We are working to reduce the latency and increase the performance of a process written in Java that consumes data (xml strings) from a socket via the readLine() method of the Bu
Will the inputReader.readLine() method return as soon as it hits the \n character or will it wait till the buffer is full?
Is there a faster way of picking up data from the socket than using a BufferedReader?
BufferedReader entails some copying of the data. You could try the NIO APIs, which can avoid copying, but profile first to see whether the I/O really is the bottleneck before spending any time on this. A simpler quick fix is to add a BufferedInputStream around the socket, so that each read does not hit the socket (it's not clear whether InputStreamReader does any buffering itself), e.g.
new BufferedReader(new InputStreamReader(new BufferedInputStream(in)))
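Spelled out a bit more, the one-liner above might look like the following. This is only a sketch: the class name LineSource and its helper methods are made up for illustration, UTF-8 is an assumed encoding, and a ByteArrayInputStream stands in for socket.getInputStream() so it can run without a network.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class LineSource {
    // Wraps a raw stream (for example socket.getInputStream()) as suggested above.
    // The charset is an assumption; use whatever the sender actually uses.
    static BufferedReader open(InputStream in) {
        return new BufferedReader(
                new InputStreamReader(new BufferedInputStream(in), StandardCharsets.UTF_8));
    }

    // Reads every line until end of stream. IOException is wrapped only to keep the sketch short.
    static List<String> readAll(InputStream in) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = open(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "<a>1</a>\n<b>2</b>\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(data)));
    }
}
```

Passing the charset explicitly avoids depending on the platform default, which is worth doing regardless of any performance concern.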
What happens when the size of the input string is smaller than the size of the Socket's receive buffer?
What happens when the size of the input string is bigger than the size of the Socket's receive buffer?
To sum up, BufferedReader blocks only when absolutely necessary.
If you know the character encoding of the incoming data, you may want to write your own class that reads the binary data directly and looks for your specific end-of-line terminator. This may remove a lot of unnecessary encoding/decoding and copying. Make sure you implement it with reusable buffers (NIO's CharBuffer or ByteBuffer classes come to mind, or a correctly initialized StringBuilder if you need String instances). Make sure you've got enough space in the buffer; 32 KiB to 64 KiB is nothing for current computers.
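As a rough sketch of that idea (not a tuned implementation): the class below scans raw bytes for '\n' in a single reusable buffer and only decodes once a complete line is available. ByteLineReader and readAll are made-up names, and it assumes an ASCII-compatible encoding such as UTF-8, where the byte 0x0A can only ever be a newline. It does not strip a trailing '\r'.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a byte stream on '\n' using one reusable buffer.
final class ByteLineReader {
    private final InputStream in;
    private byte[] buf;
    private int start = 0; // first byte not yet returned to the caller
    private int end = 0;   // one past the last valid byte in buf

    ByteLineReader(InputStream in, int capacity) {
        this.in = in;
        this.buf = new byte[capacity];
    }

    // Returns the next line without its '\n', or null at end of stream.
    String readLine() throws IOException {
        int scanned = start; // bytes before this index are known not to be '\n'
        while (true) {
            for (; scanned < end; scanned++) {
                if (buf[scanned] == '\n') {
                    String line = new String(buf, start, scanned - start, StandardCharsets.UTF_8);
                    start = scanned + 1;
                    return line;
                }
            }
            if (end == buf.length) {
                if (start > 0) {
                    // Slide the partial line to the front to free space at the end.
                    System.arraycopy(buf, start, buf, 0, end - start);
                    end -= start;
                    scanned -= start;
                    start = 0;
                } else {
                    // A single line is longer than the buffer: grow it.
                    buf = Arrays.copyOf(buf, buf.length * 2);
                }
            }
            int n = in.read(buf, end, buf.length - end); // returns as soon as some bytes arrive
            if (n < 0) {
                if (start == end) {
                    return null;
                }
                // Stream ended without a final terminator: return what is left.
                String tail = new String(buf, start, end - start, StandardCharsets.UTF_8);
                start = end;
                return tail;
            }
            end += n;
        }
    }

    // Convenience wrapper for the example; IOException is wrapped to keep it short.
    static List<String> readAll(InputStream in, int capacity) {
        ByteLineReader reader = new ByteLineReader(in, capacity);
        List<String> lines = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "<a/>\n<bb/>\nabc".getBytes(StandardCharsets.UTF_8);
        // In real use a capacity of 64 * 1024 would be more sensible; 4 just exercises the resize path.
        System.out.println(readAll(new ByteArrayInputStream(data), 4));
    }
}
```

Because decoding only happens on complete lines, a multi-byte character split across two socket reads is not a problem here. Whether this actually beats BufferedReader is something only a profiler can tell you.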
Once you've gotten the data into a usable container you can use any trick in the book (multiple threads, executors etc.) to handle the data efficiently. Remember, the only ways to slow down a current CPU are cache misses (large/dynamic data sets, spurious copying), branches (unnecessary loops, if statements and the like) and, of course, kernel calls and I/O.
The answer to your first question is yes and no. If the buffer already contains the line terminator it will return immediately, however if it does not contain the terminator then it will try to fill the buffer, but not necessarily fully. It will only read until there is some new data (at least one char) or EOF is reached.
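If you want to see this behaviour without a real socket, here is a small experiment. ChunkedStream is a made-up stand-in for a socket stream: each read call hands over at most one "packet", and available() is left at its default of 0, so the reader has no reason to think more data is waiting. The trace method records how many packets were still unread when each readLine() call returned. This reflects the JDK sources I have looked at; treat it as an experiment rather than a guarantee from the API documentation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a socket: one non-empty "packet" per read call, at most.
final class ChunkedStream extends InputStream {
    private final Deque<byte[]> chunks = new ArrayDeque<>();
    private byte[] current = new byte[0];
    private int pos = 0;

    ChunkedStream(String... packets) {
        for (String p : packets) {
            chunks.add(p.getBytes(StandardCharsets.US_ASCII));
        }
    }

    // Number of packets that no read call has touched yet.
    int pendingChunks() {
        return chunks.size();
    }

    @Override
    public int read() {
        byte[] one = new byte[1];
        return read(one, 0, 1) < 0 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) {
        if (len == 0) {
            return 0;
        }
        if (pos == current.length) {
            if (chunks.isEmpty()) {
                return -1; // end of stream
            }
            current = chunks.poll();
            pos = 0;
        }
        int n = Math.min(len, current.length - pos);
        System.arraycopy(current, pos, b, off, n);
        pos += n;
        return n;
    }
}

final class ReadLineDemo {
    // For each line, records the line and how many packets were still unread when it was returned.
    static List<String> trace(String... packets) {
        ChunkedStream in = new ChunkedStream(packets);
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.US_ASCII));
        List<String> out = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                out.add(line + " | unread packets: " + in.pendingChunks());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // The second line is split across two packets on purpose.
        trace("<a/>\n<b", "/>\n", "<c/>\n").forEach(System.out::println);
    }
}
```

The first line comes back while two packets are still unread, so readLine() did not wait for its buffer to fill. The second line starts in one packet and ends in the next, and the reader stitches it together after exactly one more read.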
One of the nice things about Java is that the libraries are open source, so if you have a full copy of the JDK you can look at the source yourself to answer these types of questions. I use Eclipse as my IDE, and by default if you place the cursor over a class name and press F3 it will take you to the source (this is how I obtained the answer above). The caveat is that, with the standard distribution, the source for some of the internal classes and native code is not available.
For your second question, I would say generally no, as the logic used by BufferedReader is generally the same logic any code would need to recreate to achieve the same task. The only thing that might slow BufferedReader down is that internally it uses a StringBuffer, which is synchronized, instead of the unsynchronized StringBuilder.
One of the advantages of the BufferedReader is that it provides a layer of separation (the buffer) between the input methods (read, readLine, etc.) you use and the actual socket reads, so you don't have to worry about all the cases like "most of the line is in the buffer, but you need to read another buffer to get the \n" etc.
Have you done performance measurements that indicate that using a BufferedReader is a performance issue for your application? If not, I would suggest that you start by choosing an input method which provides the functionality you want (line-based input terminated by \n's, from the sound of it), and worry about whether there's a "faster" way to do it only if you find that the input method is a bottleneck.
If line-based input is really what you're after, you're going to end up using some kind of buffer like BufferedReader does, so why re-invent this wheel?