Reading a file concurrently

轻奢々 2021-01-31 05:36

The reading part isn't concurrent but the processing is. I phrased the title this way because I'm most likely to search for this problem again using that phrase. :)

2 answers
  • 2021-01-31 05:46

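    You're almost there; you just need a little more work on goroutine synchronisation. Your problem is that you're trying to feed the parsers and collect the results in the same goroutine, and that can't work: while that goroutine is blocked sending a job, nothing is receiving results, so the parsers block as well.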

    I propose the following:

    1. Run the scanner in a separate goroutine and close the input channel once everything has been read.
    2. Run another goroutine that waits for the parsers to finish their job, then closes the output channel.
    3. Collect all the results in your main goroutine.

    The relevant changes could look like this:

    // Go over a file line by line and queue up a ton of work
    go func() {
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            jobs <- scanner.Text()
        }
        close(jobs)
    }()
    
    // Collect all the results...
    // First, make sure we close the result channel when everything was processed
    go func() {
        wg.Wait()
        close(results)
    }()
    
    // Now, add up the results from the results channel until closed
    counts := 0
    for v := range results {
        counts += v
    }
    

    Fully working example on the playground: http://play.golang.org/p/coja1_w-fY

    It's worth adding that you don't necessarily need the WaitGroup to achieve this; all you need to know is when to stop receiving results. For example, the scanner could advertise (on a channel) how many lines it read, and the collector could then read exactly that number of results. Note that the parsers would then have to send a result for every line, including zeros.

  • 2021-01-31 06:07

    Edit: The answer by @tomasz above is the correct one. Please disregard this answer.

    You need to do two things:

    1. Use buffered channels so that sending doesn't block.
    2. Close the results channel so that receiving doesn't block.

    The use of buffered channels is essential because unbuffered channels need a receive for each send, which is causing the deadlock you're hitting.

    If you fix that, you'll run into a deadlock when you try to receive the results, because results hasn't been closed.

    Here's the fixed playground: http://play.golang.org/p/DtS8Matgi5
