What is the optimal number of threads for performing I/O operations in Java?

陌清茗 2021-01-01 17:54

In Goetz's "Java Concurrency in Practice", in a footnote on page 101, he writes "For computational problems like this that do no I/O and access no shared data, Ncpu or

7 Answers
  • 2021-01-01 18:02

    Yes, 20 threads can definitely write to disk faster than 4 threads on a 4 CPU machine. Many real programs are more I/O bound than CPU bound. However, it depends heavily on your disks and on how much CPU work your other threads are doing before they, too, end up waiting on those disks.

    If all of your threads are solely writing to disk and doing nothing else, then it may well be that 1 thread on a 4 CPU machine is actually the fastest way to write to disk. It depends entirely on how many disks you have, how much data you're writing, and how good your OS is at I/O scheduling. Your specific question suggests you want 4 threads all writing to the same file. That doesn't make much sense, and in any practical scenario I can't think how that'd be faster. (You'd have to allocate the file ahead of time, then each thread would seek() to a different position, and you'd end up just thrashing the write head as each thread tried to write some blocks.)
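    For what it's worth, the allocate-then-seek scheme described above could be sketched in Java roughly like this. It only illustrates the idea and is not a recommendation. The names (PartitionedFileWriter, writeInRegions, runDemo) and the region size are made up for the example, and it uses FileChannel's positional write instead of seek() so that all threads can share one channel.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: each thread fills its own region of one shared file.
final class PartitionedFileWriter {

    /** Writes one region per thread and returns the resulting file size in bytes. */
    static long writeInRegions(Path file, int threads, int regionBytes) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            List<Future<Void>> pending = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                final long offset = (long) i * regionBytes;
                final byte fill = (byte) ('A' + i);
                Callable<Void> task = () -> {
                    byte[] data = new byte[regionBytes];
                    Arrays.fill(data, fill);
                    ByteBuffer buf = ByteBuffer.wrap(data);
                    long pos = offset;
                    // Positional writes do not move a shared file pointer;
                    // writing past the current end grows the file.
                    while (buf.hasRemaining()) {
                        pos += channel.write(buf, pos);
                    }
                    return null;
                };
                pending.add(pool.submit(task));
            }
            for (Future<Void> f : pending) {
                f.get(); // wait for the region and surface any failure
            }
            return channel.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Runs the sketch against a temporary file with 4 threads of 1 KiB each. */
    static long runDemo() {
        try {
            Path file = Files.createTempFile("regions", ".bin");
            try {
                return writeInRegions(file, 4, 1024);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes written: " + runDemo());
    }
}
```

    On a single spinning disk the regions still compete for the same head, so this would need to be measured before assuming it beats one sequential writer.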

    The advantage of multithreading is much more straightforward when you're network bound, i.e., waiting on a database server, a web browser, or the like. There you're waiting on multiple external resources.

  • 2021-01-01 18:03

    If you are using synchronous I/O, then you should have one thread for every simultaneous I/O request your machine can handle. In the case of a single-spindle hard disk, that's 1 (you can either read or write, but not both simultaneously). For a disk that can handle many I/O requests simultaneously, that would be however many requests it can handle at once.

    In other words, this is not bounded by the CPU count, as I/O does not really hit the CPU beyond submitting requests and waiting. See here for a better explanation.
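    To put a number on "not bounded by the CPU count": the same Goetz book gives a sizing heuristic, threads = Ncpu * Ucpu * (1 + W/C), where W/C is the ratio of wait time to compute time. A small sketch of that arithmetic follows. The class name, method name and the 90 ms / 10 ms figures are made up for illustration, and the result still has to be capped by what the device can service at once.

```java
// Hypothetical helper applying the heuristic
// threads = cpus * targetUtilization * (1 + waitTime / computeTime).
final class PoolSizing {

    static int suggestedThreads(int cpus, double targetUtilization,
                                double waitMillis, double computeMillis) {
        double raw = cpus * targetUtilization * (1.0 + waitMillis / computeMillis);
        return Math.max(1, (int) Math.round(raw));
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Assumed measurements: 90 ms blocked on I/O for every 10 ms of CPU work.
        System.out.println(suggestedThreads(cpus, 1.0, 90, 10));
    }
}
```

    With 4 CPUs and those assumed timings this suggests 40 threads, far more than the CPU count, which is the point being made above.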

    There's a whole other can of worms with how many I/O requests you should have in flight at any given time.

  • 2021-01-01 18:10

    Like all performance-related things, it depends.

    If you're I/O bound, then adding threads won't help you at all. (OK, as Steven Sudit points out, you might get an increase in performance, but it'll be small.) If you're not I/O bound, then adding threads may help.

    Not trying to be smart, but the best way to find out is to profile it and see what works for your particular circumstances.
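    A crude way to start measuring is to time the same batch of writes with different thread counts, something like the sketch below. Everything in it (class name, file count, file size, the thread counts tried) is an arbitrary assumption, and it is no substitute for a real profiler or benchmark harness.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical timing sketch: same workload, varying number of writer threads.
final class WriteTimer {

    /** Writes `files` files of `bytesPerFile` bytes each; returns elapsed nanoseconds. */
    static long timeWrites(Path dir, int threads, int files, int bytesPerFile) {
        byte[] payload = new byte[bytesPerFile];
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Path>> tasks = new ArrayList<>();
            for (int i = 0; i < files; i++) {
                Path target = dir.resolve("t" + threads + "-f" + i + ".bin");
                tasks.add(() -> Files.write(target, payload));
            }
            long start = System.nanoTime();
            for (Future<Path> f : pool.invokeAll(tasks)) {
                f.get(); // rethrows a failed write as ExecutionException
            }
            return System.nanoTime() - start;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** Times 8 files of 64 KiB in a fresh temporary directory (cleanup omitted). */
    static long runDemo(int threads) {
        try {
            Path dir = Files.createTempDirectory("write-timer");
            return timeWrites(dir, threads, 8, 64 * 1024);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (int threads : new int[] {1, 2, 4, 8}) {
            System.out.printf("%d thread(s): %d ms%n",
                    threads, runDemo(threads) / 1_000_000);
        }
    }
}
```

    Small writes like these mostly land in the OS page cache, so for meaningful numbers you'd want much larger files, repeated runs, and your real hardware.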

    Edit: Updated based on comments

  • 2021-01-01 18:12

    If the only thing you do with those threads is write to the disk, then the performance gain will be negligible, and the change may even be harmful: drivers are usually optimized for sequential access on hard drives, and you're turning one sequential write to a file into several "random" writes.

    In terms of raw performance, multithreading can only help with I/O-bound problems if the I/O is performed against different disks, different network cards or different database servers. Nonetheless, in terms of observed performance the difference can be much greater.

    For example, imagine you're sending several files to a lot of different receivers over a network. You're still network bound, so your maximum speed won't be higher than, say, 100 Mb/s, but if you use 20 threads then the process will be much fairer to the individual receivers.

  • 2021-01-01 18:17

    See also: "Will using multiple threads with a RandomAccessFile help performance?"

    UPDATE: I added a benchmark there.

  • 2021-01-01 18:19

    In practice, I/O-bound applications can still benefit substantially from multithreading because it can be much faster to read or write a few files in parallel than sequentially. This is particularly the case where overall throughput is compromised by network latency. But it's also the case that one thread can be processing the last thing that it read while another thread is busy reading, allowing higher CPU utilization.
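    The "one thread processes while another reads" arrangement could look roughly like the following sketch, where a reader thread feeds lines into a queue and the calling thread does the (here trivial) processing. The class and method names are invented for the example, and the queue is left unbounded for simplicity; a real pipeline would probably bound it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: overlap blocking reads with processing via a queue.
final class ReadProcessPipeline {

    // End-of-input marker, compared by identity so it cannot collide with real lines.
    private static final String END = new String("END");

    /** Reads lines on one thread and sums their lengths on the calling thread. */
    static long countCharacters(Reader source) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicReference<IOException> failure = new AtomicReference<>();

        Thread reader = new Thread(() -> {
            try (BufferedReader in = new BufferedReader(source)) {
                String line;
                while ((line = in.readLine()) != null) {
                    queue.add(line); // this thread may block on I/O in readLine()
                }
            } catch (IOException e) {
                failure.set(e);
            } finally {
                queue.add(END); // always release the consumer
            }
        }, "reader");
        reader.start();

        long total = 0;
        try {
            for (String line = queue.take(); line != END; line = queue.take()) {
                total += line.length(); // stand-in for real CPU work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        }
        if (failure.get() != null) {
            throw new UncheckedIOException(failure.get());
        }
        return total;
    }

    public static void main(String[] args) {
        // prints 14
        System.out.println(countCharacters(new StringReader("alpha\nbeta\ngamma")));
    }
}
```

    While the consumer is busy with one line, the reader can already be blocked on the next read, which is where the extra CPU utilization comes from.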

    We can talk theory all day, but the right answer is to make the number of threads configurable. I think you'll find that increasing it past 1 will boost your speed, but there will also come a point of diminishing returns.
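    One simple way to make the thread count configurable is a system property with a safe default, as in this sketch. The property name io.threads and the class IoPoolConfig are made up here; a config file or environment variable would do just as well.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: I/O pool size taken from a system property.
final class IoPoolConfig {

    static final String PROPERTY = "io.threads"; // assumed name, not a standard property

    /** Parses a positive thread count, falling back when the value is missing or invalid. */
    static int parseThreadCount(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        try {
            int n = Integer.parseInt(raw.trim());
            return n >= 1 ? n : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** Creates a fixed pool sized by -Dio.threads=N, defaulting to a single thread. */
    static ExecutorService newIoPool() {
        int threads = parseThreadCount(System.getProperty(PROPERTY), 1);
        return Executors.newFixedThreadPool(threads);
    }

    public static void main(String[] args) {
        ExecutorService pool = newIoPool();
        try {
            pool.execute(() ->
                    System.out.println("I/O task on " + Thread.currentThread().getName()));
        } finally {
            pool.shutdown();
        }
    }
}
```

    Running it as java -Dio.threads=8 IoPoolConfig lets you try different values without recompiling, which makes it easy to find where the diminishing returns start.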
