How many threads does it take to make them a bad choice?

猫巷女王i 2021-02-07 21:46

I have to write a not-so-large program in C++, using boost::thread.

The problem at hand is to process a large (maybe thousands or tens of thousands. Hundreds and millon

15 answers
  •  臣服心动
    2021-02-07 22:05

    To elaborate, it really depends on:

    IO boundedness of the problem
        how big are the files
        how contiguous are the files
        in what order must they be processed
        can you determine the disk placement
    how much concurrency you can get in the "global structure insert"
        can you "silo" the data structure with a consolidation wrapper
    the actual CPU cost of the "global structure insert" 
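One way to read the "silo with a consolidation wrapper" idea is sketched below: each worker fills a private map, and the private maps are merged once at the end, so inserts need no locking. This is only an illustration under assumptions. The `Counts` map is a hypothetical stand-in for the "global structure", `count_siloed` is an invented name, and `std::thread` is used in place of `boost::thread` to keep the sketch self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the "global structure": a key -> count map.
using Counts = std::unordered_map<std::string, long>;

// Each worker fills its own private map (its "silo"), so inserts need no lock.
// The silos are merged once, after all workers have joined.
Counts count_siloed(const std::vector<std::string>& items, std::size_t n_threads) {
    assert(n_threads > 0);
    std::vector<Counts> silos(n_threads);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n_threads; ++t) {
        workers.emplace_back([&items, &silos, t, n_threads] {
            // Strided partition: worker t handles items t, t + n, t + 2n, ...
            for (std::size_t i = t; i < items.size(); i += n_threads)
                ++silos[t][items[i]];
        });
    }
    for (auto& w : workers) w.join();

    // Consolidation step: a single-threaded merge of the silos.
    Counts total;
    for (const auto& silo : silos)
        for (const auto& kv : silo) total[kv.first] += kv.second;
    return total;
}
```

Whether this pays off depends on the points in the list above: if the merge is expensive relative to the inserts, or the structure cannot be merged at all, a shared structure with locking may be the only option.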
    

    For example, if your files reside on a 3-terabyte flash memory array, the solution is different than if they reside on a single disk. On a single disk, if the "global structure insert" takes less time than the read, the problem is I/O bound, and you might just as well use a two-stage pipeline with 2 threads: the read stage feeding the insert stage.

    In both cases, though, the architecture would probably be a vertical pipeline of 2 stages: n reading threads and m writing threads (the insert stage), with n and m determined by the "natural concurrency" of each stage.
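A rough sketch of such a two-stage pipeline follows, with n readers feeding m inserters through a bounded queue. Everything named here is an assumption made for illustration: `BoundedQueue`, `read_file` (which only returns the length of the file name instead of touching the disk), `run_pipeline`, the queue capacity of 16, and a running sum standing in for the global structure. It uses `std::thread` rather than `boost::thread` so that it compiles without extra libraries.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Minimal bounded blocking queue connecting the two stages.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }
    // Blocks while the queue is full, which throttles the readers.
    void push(T value) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push_back(std::move(value));
        not_empty_.notify_one();
    }
    // Returns false once the queue has been closed and drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        not_full_.notify_one();
        return true;
    }
    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
        not_empty_.notify_all();
    }
private:
    std::size_t capacity_;
    std::deque<T> items_;
    bool closed_ = false;
    std::mutex mutex_;
    std::condition_variable not_full_, not_empty_;
};

// Hypothetical stand-in for reading a file: returns the length of its name.
std::size_t read_file(const std::string& name) { return name.size(); }

std::size_t run_pipeline(const std::vector<std::string>& files,
                         std::size_t n_readers, std::size_t m_inserters) {
    assert(n_readers > 0 && m_inserters > 0);
    BoundedQueue<std::size_t> queue(16);
    std::atomic<std::size_t> next{0};  // index of the next unclaimed file
    std::size_t total = 0;             // stand-in for the global structure
    std::mutex total_mutex;

    std::vector<std::thread> readers, inserters;
    for (std::size_t r = 0; r < n_readers; ++r)
        readers.emplace_back([&] {
            for (std::size_t i = next.fetch_add(1); i < files.size();
                 i = next.fetch_add(1))
                queue.push(read_file(files[i]));
        });
    for (std::size_t w = 0; w < m_inserters; ++w)
        inserters.emplace_back([&] {
            std::size_t value;
            while (queue.pop(value)) {
                std::lock_guard<std::mutex> lock(total_mutex);
                total += value;  // the "global structure insert"
            }
        });

    for (auto& t : readers) t.join();
    queue.close();  // no more input: lets the inserters finish
    for (auto& t : inserters) t.join();
    return total;
}
```

The bounded queue is the design choice that matters here: when the insert stage is the slower one, the readers block on `push` instead of piling up file contents in memory.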

    Creating a thread per file will probably lead to disk thrashing. Just as you tailor the number of threads in a CPU-bound process to the naturally achievable CPU concurrency (going above that creates context-switching overhead, also known as thrashing), the same is true on the I/O side: in a sense, you can think of disk thrashing as "context switching on the disk".
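To make the alternative to a thread per file concrete, here is a small sketch of a fixed pool whose size is capped by a caller-supplied limit. The names `pick_worker_count` and `run_on_pool` are invented for this example, the per-file work is only a placeholder counter, and `std::thread` is used in place of `boost::thread`. What the limit should be is an assumption the caller has to make: for instance `std::thread::hardware_concurrency()` for CPU-bound work, or something much smaller for a single disk.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Cap the worker count at `limit` instead of spawning one thread per task.
std::size_t pick_worker_count(std::size_t n_tasks, std::size_t limit) {
    if (limit == 0) limit = 1;  // hardware_concurrency() may return 0 if unknown
    return std::max<std::size_t>(1, std::min(n_tasks, limit));
}

// Runs n_tasks units of placeholder work on a fixed pool and returns how many
// were completed. Workers claim task indices from a shared atomic counter.
std::size_t run_on_pool(std::size_t n_tasks, std::size_t n_workers) {
    assert(n_workers > 0);
    std::atomic<std::size_t> next{0};
    std::atomic<std::size_t> done{0};
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < n_workers; ++w) {
        pool.emplace_back([&] {
            while (next.fetch_add(1) < n_tasks)
                done.fetch_add(1);  // placeholder for processing one file
        });
    }
    for (auto& t : pool) t.join();
    return done.load();
}
```

With tens of thousands of files and a limit of 4, this starts 4 threads rather than tens of thousands, and every file is still processed exactly once.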
