Java: Efficiency of the readLine method of the BufferedReader and possible alternatives

We are working to reduce the latency and increase the performance of a process written in Java that consumes data (xml strings) from a socket via the readLine() method of the Bu

4 Answers

孤城傲影

2021-02-13 22:36
Will the inputReader.readLine() method return as soon as it hits the \n character or will it wait till the buffer is full?
- It will return as soon as it gets a newline.
Is there a faster way of picking up data from the socket than using a BufferedReader?
- BufferedReader entails some copying of the data. You could try the NIO APIs, which can avoid copying, but profile first to see whether the I/O really is the bottleneck before spending any time on this. A simpler quick fix is to add a BufferedInputStream around the socket's input stream, so that each read does not hit the socket. (It's not clear if InputStreamReader does any buffering itself.) For example:
  
  new BufferedReader(new InputStreamReader(new BufferedInputStream(in)))
What happens when the size of the input string is smaller than the size of the Socket's receive buffer?
- The BufferedReader will fetch all the data available, up to the capacity of its own buffer. It will then scan this data for the newline. As a result, the data for subsequent reads may already be in the BufferedReader.
What happens when the size of the input string is bigger than the size of the Socket's receive buffer?
- The BufferedReader will read what is in the receive buffer and, if it has found neither a newline nor the end of the stream, it will continue to read from the socket until it finds a newline or EOF. Those further reads may block until more data becomes available.
To sum up, BufferedReader blocks only when absolutely necessary.
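
The buffering described above can be observed with a small experiment. This is only a sketch: `CountingInputStream` is a hypothetical stand-in for the socket's input stream (not a JDK class), and it delivers two short XML lines in one go, as if both had arrived in the same receive buffer. The second `readLine()` should then be served from the reader's buffer without touching the underlying stream again.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a socket stream that counts how often it is read. */
class CountingInputStream extends InputStream {
    private final byte[] data;
    private int pos = 0;
    int readCalls = 0;

    CountingInputStream(String text) {
        this.data = text.getBytes(StandardCharsets.US_ASCII);
    }

    @Override
    public int read(byte[] dst, int off, int len) {
        if (len == 0) return 0;
        readCalls++;
        if (pos == data.length) return -1;           // end of stream
        int n = Math.min(len, data.length - pos);    // hand over whatever is available
        System.arraycopy(data, pos, dst, off, n);
        pos += n;
        return n;
    }

    @Override
    public int read() {
        readCalls++;
        return pos == data.length ? -1 : (data[pos++] & 0xFF);
    }
}

class BufferedLinesDemo {
    /** Reads two lines; returns the read-call count after line 1 and after line 2. */
    static int[] readCallsPerLine(CountingInputStream in) {
        try {
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.US_ASCII));
            reader.readLine();
            int afterFirst = in.readCalls;
            reader.readLine();
            int afterSecond = in.readCalls;
            return new int[] {afterFirst, afterSecond};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] calls = readCallsPerLine(new CountingInputStream("<a/>\n<b/>\n"));
        System.out.println("reads after first line:  " + calls[0]);
        System.out.println("reads after second line: " + calls[1]);
    }
}
```

If the two printed counts are equal, the second line never reached the underlying stream, which is the "subsequent reads may already have the data" case.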

情歌与酒

2021-02-13 22:41

If you know the character encoding of the incoming data, you may want to write your own class that reads the binary data and looks for your specific end-of-line terminator. This may remove a lot of unnecessary encoding/decoding and copying. Make sure you implement it with reusable buffers (NIO's CharBuffer or ByteBuffer classes come to mind, or a correctly initialized StringBuilder if you need String instances). Make sure you've got enough space in the buffer; 32 KiB to 64 KiB is nothing for current computers.
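
A minimal sketch of such a class follows. `ByteLineReader` and `readAll` are hypothetical names, and the sketch assumes the terminator is a single `\n` byte in an ASCII-compatible encoding such as UTF-8 (a `\r\n` terminator would leave the `\r` on the line). It scans raw bytes in one reusable array and decodes each line exactly once.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical line reader: scans raw bytes for '\n' and decodes each line once. */
class ByteLineReader {
    private final InputStream in;
    private final Charset charset;
    private byte[] buf;      // reused across calls, grown only for oversized lines
    private int start = 0;   // first unread byte
    private int end = 0;     // one past the last valid byte

    ByteLineReader(InputStream in, Charset charset, int capacity) {
        this.in = in;
        this.charset = charset;
        this.buf = new byte[Math.max(1, capacity)];
    }

    /** Returns the next line without its terminator, or null at end of stream. */
    String readLine() throws IOException {
        int scanFrom = start;
        while (true) {
            for (int i = scanFrom; i < end; i++) {
                if (buf[i] == '\n') {
                    String line = new String(buf, start, i - start, charset);
                    start = i + 1;
                    return line;
                }
            }
            if (start > 0) {                      // move the partial line to the front
                System.arraycopy(buf, start, buf, 0, end - start);
                end -= start;
                start = 0;
            }
            scanFrom = end;                       // bytes before this were already scanned
            if (end == buf.length) {              // line is longer than the buffer
                buf = Arrays.copyOf(buf, buf.length * 2);
            }
            int n = in.read(buf, end, buf.length - end);
            if (n < 0) {                          // EOF: return any unterminated tail
                if (end == start) return null;
                String line = new String(buf, start, end - start, charset);
                start = end;
                return line;
            }
            end += n;
        }
    }

    /** Convenience for experiments: reads every line from an in-memory byte array. */
    static List<String> readAll(byte[] data, int capacity) {
        try {
            ByteLineReader reader = new ByteLineReader(
                    new ByteArrayInputStream(data), StandardCharsets.UTF_8, capacity);
            List<String> lines = new ArrayList<>();
            for (String line; (line = reader.readLine()) != null; ) {
                lines.add(line);
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a real socket you would pass `socket.getInputStream()` and a capacity in the 32 KiB to 64 KiB range suggested above. Whether this beats BufferedReader for a given workload is something only profiling can tell.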

Once you've gotten the data into a usable container, you can use any trick in the book (multiple threads, executors, etc.) to handle the data efficiently. Remember, the only ways to slow down a current CPU are cache misses (large or dynamic data sets, spurious copying), branches (unnecessary loops and if statements), and of course kernel calls and I/O.
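
As one illustration of the executor approach, the sketch below lets a single thread read lines and hands each one to a thread pool. `LineDispatcher`, `handle`, and `process` are hypothetical names, and `handle` merely returns the message length as a placeholder for real XML processing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LineDispatcher {
    /** Placeholder for per-message work; a real handler would parse the XML. */
    static int handle(String xml) {
        return xml.length();
    }

    /** Reads lines on the calling thread and processes them on a worker pool. */
    static List<Integer> process(Reader source, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try (BufferedReader reader = new BufferedReader(source)) {
            List<Future<Integer>> pending = new ArrayList<>();
            String line;
            while ((line = reader.readLine()) != null) {
                final String message = line;
                Callable<Integer> task = () -> handle(message);
                pending.add(pool.submit(task));   // the reader never waits on processing
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : pending) {
                results.add(future.get());        // collected in arrival order
            }
            return results;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(process(new StringReader("<a/>\n<bb/>\n"), 2));
    }
}
```

This keeps the socket-reading thread free to call `readLine()` again immediately, so slow message handling does not add to read latency.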


悲哀的现实

2021-02-13 22:43

The answer to your first question is yes and no. If the buffer already contains the line terminator, it will return immediately. If it does not, it will try to fill the buffer, but not necessarily fully: each fill only reads until there is some new data (at least one char) or EOF is reached.
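
The fill behaviour can be checked with a source that hands out data in pieces. In this sketch, `ChunkedReader` is a hypothetical stand-in for the decoded socket stream: each `read` call returns at most one queued chunk, the way a socket returns whatever has arrived so far.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical source that returns at most one queued chunk per read call. */
class ChunkedReader extends Reader {
    private final Deque<String> chunks = new ArrayDeque<>();
    int readCalls = 0;

    ChunkedReader(String... parts) {
        chunks.addAll(Arrays.asList(parts));
    }

    @Override
    public int read(char[] dst, int off, int len) {
        if (len == 0) return 0;
        readCalls++;
        String chunk = chunks.poll();
        if (chunk == null) return -1;                  // end of stream
        int n = Math.min(len, chunk.length());
        chunk.getChars(0, n, dst, off);
        if (n < chunk.length()) {
            chunks.addFirst(chunk.substring(n));       // keep what did not fit
        }
        return n;
    }

    @Override
    public void close() {
        chunks.clear();
    }

    int chunksLeft() {
        return chunks.size();
    }
}

class PartialFillDemo {
    /** Reads one line and wraps the checked exception for easier experimenting. */
    static String firstLine(ChunkedReader source) {
        try {
            return new BufferedReader(source).readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ChunkedReader source = new ChunkedReader("<msg>par", "tial</msg>\n", "<next/>\n");
        System.out.println(firstLine(source));
        System.out.println("reads: " + source.readCalls + ", chunks left: " + source.chunksLeft());
    }
}
```

The line is split across the first two chunks, so `readLine()` has to fill twice; once the terminator has arrived it returns without asking for the third chunk.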

One of the nice things about Java is that the libraries are open source, so if you have a full copy of the JDK you can look at the source yourself to answer these types of questions. I use Eclipse as my IDE; by default, if you place the cursor over a class name and press F3, it takes you to the source (this is how I obtained the answer above). The caveat is that with the standard distribution the source for some of the internal classes and native code is not available.

For your second question, I would say generally no, as the logic used by BufferedReader is generally the same logic any code would need to recreate to achieve the same task. The only thing that might slow BufferedReader down is that internally it uses a StringBuffer, which is synchronized, instead of the unsynchronized StringBuilder.


你的背包

2021-02-13 22:53

One of the advantages of the BufferedReader is that it provides a layer of separation (the buffer) between the input methods (read, readLine, etc.) you use and the actual socket reads, so you don't have to worry about all the cases like "most of the line is in the buffer, but you need to read another buffer to get the \n" etc.

Have you done performance measurements that indicate that using a BufferedReader is a performance issue for your application? If not, I would suggest that you start by choosing an input method which provides the functionality you want (line-based input terminated by \n, from the sound of it), and worry about whether there's a "faster" way to do it only if you find the input method is a bottleneck.

If line-based input is really what you're after, you're going to end up using some kind of buffer like BufferedReader does, so why re-invent this wheel?
