Read multiple text files in parallel using CUDA

庸人自扰 2021-01-26 12:29

I would like to search for a given string in multiple files in parallel using CUDA. I have planned to use the PFAC library to search for the given string. The problem with this is h

2 Answers
  • 2021-01-26 13:01

    Yes, it's probably possible to get a speed-up with CUDA if you can reduce the impact of read latency/bandwidth. One way would be to perform multiple searches concurrently, i.e. if you can search for [needle1], .. [needle1000] in your large haystack, then each thread could search haystack pieces and store the hits. Some analysis of the throughput needed per comparison is required to determine whether your search is likely to be improved by employing CUDA. This may be useful: http://dl.acm.org/citation.cfm?id=1855600
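As a very rough illustration of that idea (naive matching, not PFAC), here is a minimal CUDA sketch in which each thread scans one overlapping chunk of an in-memory haystack for a single needle. The names (`search_chunks`, `CHUNK_SIZE`) and the synthetic test buffer are assumptions for the example, not anything from the question; in practice the buffer would be the file contents loaded into memory.

```cuda
// Minimal sketch (naive matching, not PFAC): each thread scans one chunk of
// the haystack; chunks overlap by needle_len-1 bytes so matches that straddle
// a chunk boundary are not lost. Hits are only counted here for brevity.
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

#define CHUNK_SIZE 4096

__global__ void search_chunks(const char *haystack, size_t haystack_len,
                              const char *needle, int needle_len,
                              unsigned int *hit_count)
{
    size_t chunk = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    size_t begin = chunk * CHUNK_SIZE;
    if (begin >= haystack_len) return;

    size_t end = begin + CHUNK_SIZE + needle_len - 1;   // overlap into next chunk
    if (end > haystack_len) end = haystack_len;

    for (size_t i = begin; i + needle_len <= end; ++i) {
        int j = 0;
        while (j < needle_len && haystack[i + j] == needle[j]) ++j;
        if (j == needle_len)
            atomicAdd(hit_count, 1u);                   // a match starts at i
    }
}

int main()
{
    const char *h_needle = "needle";
    int needle_len = (int)strlen(h_needle);

    // Stand-in haystack; in the real use case this would be the file
    // contents read into one host buffer.
    size_t haystack_len = 1 << 20;
    char *h_haystack = (char *)malloc(haystack_len);
    memset(h_haystack, 'a', haystack_len);
    memcpy(h_haystack + 12345, h_needle, (size_t)needle_len);

    char *d_haystack, *d_needle;
    unsigned int *d_hits, h_hits = 0;
    cudaMalloc((void **)&d_haystack, haystack_len);
    cudaMalloc((void **)&d_needle, (size_t)needle_len);
    cudaMalloc((void **)&d_hits, sizeof(unsigned int));
    cudaMemcpy(d_haystack, h_haystack, haystack_len, cudaMemcpyHostToDevice);
    cudaMemcpy(d_needle, h_needle, (size_t)needle_len, cudaMemcpyHostToDevice);
    cudaMemcpy(d_hits, &h_hits, sizeof(unsigned int), cudaMemcpyHostToDevice);

    size_t num_chunks = (haystack_len + CHUNK_SIZE - 1) / CHUNK_SIZE;
    int threads = 256;
    int blocks = (int)((num_chunks + threads - 1) / threads);
    search_chunks<<<blocks, threads>>>(d_haystack, haystack_len,
                                       d_needle, needle_len, d_hits);
    cudaMemcpy(&h_hits, d_hits, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    printf("hits: %u\n", h_hits);

    cudaFree(d_haystack); cudaFree(d_needle); cudaFree(d_hits);
    free(h_haystack);
    return 0;
}
```

Multiple needles could then be handled by looping over a needle table inside each thread, or by a real multi-pattern automaton such as PFAC.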

  • 2021-01-26 13:07

    Doing your task in CUDA will not help much over doing the same thing on the CPU.

    Assuming that your files are stored on a standard magnetic HDD, a typical single-threaded CPU program would spend:

    1. About 5ms to seek to the sector where the file is stored and bring it under the read head.
    2. About 10ms to load a 1MB file (assuming a 100MB/s read speed) into RAM.
    3. Less than 0.1ms to load 1MB of data from RAM into the CPU cache and process it with a linear search algorithm.

    That is 15.1ms for a single file. If you have 1000 files, it will take 15.1s to do the work.

    Now, if I give you a super-powerful GPU with infinite memory bandwidth, no latency, and infinite processor speed, you will be able to perform step (3) in no time. However, the HDD reads will still take exactly the same time: the GPU cannot parallelise the work of another, independent device. As a result, instead of spending 15.1s, you will now do it in 15.0s.

    The infinite GPU would give you at most about a 0.7% speedup. A real GPU would not come even close to that!
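Written out with the numbers above (1000 files, 5 ms seek, 10 ms read, 0.1 ms of in-memory search per 1 MB file), the bound looks like this:

\[
T_{\text{CPU}} = 1000 \times (5 + 10 + 0.1)\,\text{ms} = 15.1\,\text{s},
\qquad
T_{\text{ideal GPU}} = 1000 \times (5 + 10)\,\text{ms} = 15.0\,\text{s}
\]
\[
\text{speedup} \le \frac{T_{\text{CPU}}}{T_{\text{ideal GPU}}} = \frac{15.1}{15.0} \approx 1.007 \quad (\approx 0.7\%)
\]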


    In the more general case: if you are considering CUDA, ask yourself whether the actual computation is the bottleneck of the problem.

    • If yes - continue searching for possible solutions in the CUDA world.
    • If no - CUDA cannot help you.

    If you deal with thousands of tiny files and you need to read them often, consider techniques that can "attack" your bottleneck. Some of them may include:

    • RAM buffering (see the sketch below)
    • Putting your hard drives in a RAID configuration
    • Getting an SSD

    There may be more options; I am not an expert in that area.
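For the RAM-buffering item, a hedged sketch (the file names are placeholders, not from the question): read each file into pinned (page-locked) host memory once, so repeated searches never go back to the disk and later host-to-device copies can be done asynchronously.

```cuda
// Sketch of RAM buffering with pinned host memory (placeholder file names):
// the disk cost is paid once per file; later searches and cudaMemcpyAsync
// transfers to the GPU reuse the same buffers.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main()
{
    const char *paths[] = { "file0.txt", "file1.txt" };   // placeholders
    const int num_files = 2;
    char  *pinned[num_files];
    size_t sizes[num_files];

    for (int i = 0; i < num_files; ++i) {
        FILE *f = fopen(paths[i], "rb");
        if (!f) { perror(paths[i]); return 1; }
        fseek(f, 0, SEEK_END);
        sizes[i] = (size_t)ftell(f);
        fseek(f, 0, SEEK_SET);

        // Page-locked buffer: stays resident in RAM for the program's lifetime.
        cudaHostAlloc((void **)&pinned[i], sizes[i], cudaHostAllocDefault);
        if (fread(pinned[i], 1, sizes[i], f) != sizes[i]) {
            fprintf(stderr, "short read: %s\n", paths[i]);
            fclose(f);
            return 1;
        }
        fclose(f);
    }

    // ... copy buffers to the GPU (e.g. with cudaMemcpyAsync) and run the
    // search kernels here; the disk is no longer involved ...

    for (int i = 0; i < num_files; ++i)
        cudaFreeHost(pinned[i]);
    return 0;
}
```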
