What is the Advantage of sync.WaitGroup over Channels?

旧巷少年郎 2021-01-30 10:33

I'm working on a concurrent Go library, and I stumbled upon two distinct patterns of synchronization between goroutines whose results are similar:
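Roughly, the two patterns being compared (a reconstruction using the words example the answers below refer to, not the original snippets):

    package main
    
    import (
        "fmt"
        "sync"
        "time"
    )
    
    // Reconstruction for illustration; not the original code from the question.
    
    // Pattern 1: a sync.WaitGroup
    func withWaitGroup(words []string) {
        var wg sync.WaitGroup
        for _, word := range words {
            wg.Add(1)
            go func(w string) {
                defer wg.Done() // signal completion
                time.Sleep(1 * time.Second)
                fmt.Println(w)
            }(word)
        }
        wg.Wait() // block until every goroutine has called Done
    }
    
    // Pattern 2: a done channel
    // (the answers note the original version received from done only once;
    // this sketch receives once per goroutine)
    func withDoneChannel(words []string) {
        done := make(chan struct{})
        for _, word := range words {
            go func(w string) {
                time.Sleep(1 * time.Second)
                fmt.Println(w)
                done <- struct{}{} // signal completion
            }(word)
        }
        for range words { // one receive per goroutine
            <-done
        }
    }
    
    func main() {
        words := []string{"foo", "bar", "baz"}
        withWaitGroup(words)
        withDoneChannel(words)
    }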

6 Answers
  • 2021-01-30 10:52

    A WaitGroup's main advantage is simplicity.
    Channels can be buffered or unbuffered, and they can carry a message or just a signal (no payload, e.g. chan struct{}), so they cover many different use cases, whereas a WaitGroup does exactly one thing:

    "A WaitGroup waits for a collection of goroutines to finish. The main goroutine calls Add to set the number of goroutines to wait for. Then each of the goroutines runs and calls Done when finished. At the same time, Wait can be used to block until all goroutines have finished."

    Let's do a benchmark:
    TL;DR:
    Using sync.WaitGroup in the same code (used the same way as the done channel) is slightly (about 9%) faster than a buffered done channel (for the benchmark below):
    695 ns/op vs 758 ns/op.
    For an unbuffered done channel, sync.WaitGroup is faster (2x or more), due to the synchronization cost of an unbuffered channel (for the benchmark below):
    722 ns/op vs 2343 ns/op.

    Benchmarks (using go version go1.14.7 linux/amd64):

    1. Using a buffered done channel:
    var done = make(chan struct{}, 1_000_000)
    

    With the iteration count pinned via -benchtime, so both benchmarks run exactly 1,000,000 iterations:

    go test -benchtime=1000000x -benchmem -bench .
    # BenchmarkEvenWaitgroup-8         1000000               695 ns/op               4 B/op          0 allocs/op
    # BenchmarkEvenChannel-8           1000000               758 ns/op              50 B/op          0 allocs/op
    

    2. Using an unbuffered done channel:
    var done = make(chan struct{})
    

    With this command:

    go test -benchtime=1000000x -benchmem -bench .
    # BenchmarkEvenWaitgroup-8         1000000               722 ns/op               4 B/op          0 allocs/op
    # BenchmarkEvenChannel-8           1000000              2343 ns/op             520 B/op          1 allocs/op
    

    Code:

    package main
    
    import (
        "sync"
    )
    
    func main() {
        evenWaitgroup(8)
    }
    
    func waitgroup(n int) {
        select {
        case ch <- n: // tx if channel is empty
        case i := <-ch: // rx if channel is not empty
            // fmt.Println(n, i)
            _ = i
        }
        wg.Done()
    }
    
    func evenWaitgroup(n int) {
        if n%2 == 1 { // must be even
            n++
        }
        for i := 0; i < n; i++ {
            wg.Add(1)
            go waitgroup(i)
        }
        wg.Wait()
    }
    
    func channel(n int) {
        select {
        case ch <- n: // tx if channel is empty
        case i := <-ch: // rx if channel is not empty
            // fmt.Println(n, i)
            _ = i
        }
        done <- struct{}{}
    }
    
    func evenChannel(n int) {
        if n%2 == 1 { // must be even
            n++
        }
        for i := 0; i < n; i++ {
            go channel(i)
        }
        for i := 0; i < n; i++ {
            <-done
        }
    }
    
    var wg sync.WaitGroup
    var ch = make(chan int)
    var done = make(chan struct{}, 1000000)
    
    // var done = make(chan struct{})
    

    Note: switch which line is commented out to toggle between the buffered and unbuffered done channel benchmarks:

    var done = make(chan struct{}, 1000000)
    // var done = make(chan struct{})
    

    main_test.go file:

    package main
    
    import (
        "testing"
    )
    
    func BenchmarkEvenWaitgroup(b *testing.B) {
        evenWaitgroup(b.N)
    }
    func BenchmarkEvenChannel(b *testing.B) {
        evenChannel(b.N)
    }
    
  • 2021-01-30 10:53

    If you are particular about using only channels, then it needs to be done differently (if we use your example as is, as @Not_a_Golfer points out, it'll produce incorrect results).

    One way is to make a channel of type int. In each worker goroutine, send a number on it when the job completes (this can also be a unique job id, which you can track in the receiver if you want).

    In the receiving main goroutine (which knows the exact number of jobs submitted), range over the channel, count down until all submitted jobs are done, and break out of the loop once they are. This is a good approach if you want to track the completion of each job (and maybe act on it if needed).

    Here's the code for your reference. Decrementing totalJobsLeft is safe because it is only ever done inside the range loop over the channel.

    //This is just an illustration of how to sync completion of multiple jobs using a channel
    //A better way many a times might be to use wait groups
    
    package main
    
    import (
        "fmt"
        "math/rand"
        "time"
    )
    
    func main() {
    
        comChannel := make(chan int)
        words := []string{"foo", "bar", "baz"}
    
        totalJobsLeft := len(words)
    
        //We know how many jobs are being sent
    
        for j, word := range words {
            jobId := j + 1
            go func(word string, jobId int) {
    
                fmt.Println("Job ID:", jobId, "Word:", word)
                //Do some work here, maybe call functions that you need
            //To emulate this, sleep for a random time of up to 5 seconds
                randInt := rand.Intn(5)
                //fmt.Println("Got random number", randInt)
                time.Sleep(time.Duration(randInt) * time.Second)
                comChannel <- jobId
            }(word, jobId)
        }
    
        for j := range comChannel {
            fmt.Println("Got job ID", j)
            totalJobsLeft--
            fmt.Println("Total jobs left", totalJobsLeft)
            if totalJobsLeft == 0 {
                break
            }
        }
        fmt.Println("Closing communication channel. All jobs completed!")
        close(comChannel)
    
    }
    
  • 2021-01-30 11:03

    I often use channels to collect error messages from goroutines that could produce an error. Here is a simple example:

    func couldGoWrong() (err error) {
        errorChannel := make(chan error, 3)
    
        // start a go routine
        go func() (err error) {
            defer func() { errorChannel <- err }()
    
            for c := 0; c < 10; c++ {
                _, err = fmt.Println(c)
                if err != nil {
                    return
                }
            }
    
            return
        }()
    
        // start another go routine
        go func() (err error) {
            defer func() { errorChannel <- err }()
    
            for c := 10; c < 100; c++ {
                _, err = fmt.Println(c)
                if err != nil {
                    return
                }
            }
    
            return
        }()
    
        // start yet another go routine
        go func() (err error) {
            defer func() { errorChannel <- err }()
    
            for c := 100; c < 1000; c++ {
                _, err = fmt.Println(c)
                if err != nil {
                    return
                }
            }
    
            return
        }()
    
        // synchronize go routines and collect errors here
        for c := 0; c < cap(errorChannel); c++ {
            err = <-errorChannel
            if err != nil {
                return
            }
        }
    
        return
    }
    
  • 2021-01-30 11:04

    I'd also suggest using a WaitGroup, but if you still want to do it with a channel, below is a simple use of one:

    package main
    
    import (
        "fmt"
        "time"
    )
    
    func main() {
        c := make(chan string)
        words := []string{"foo", "bar", "baz"}
    
        go printWords(words, c)
    
        for j := range c {
            fmt.Println(j)
        }
    }
    
    
    func printWords(words []string, c chan string) {
        defer close(c)
        for _, word := range words {
            time.Sleep(1 * time.Second)
            c <- word
        }   
    }
    
  • 2021-01-30 11:14

    Independently of the correctness of your second example (as explained in the comments, you aren't doing what you think, but it's easily fixable), I tend to think that the first example is easier to grasp.

    Now, I wouldn't even say that channels are more idiomatic. Channels being a signature feature of the Go language doesn't mean that it is idiomatic to use them whenever possible. What is idiomatic in Go is to use the simplest and easiest-to-understand solution: here, the WaitGroup conveys both the meaning (your main function is Waiting for workers to be done) and the mechanics (the workers notify when they are Done).

    Unless you're in a very specific case, I don't recommend using the channel solution here.

  • 2021-01-30 11:14

    It depends on the use case. If you are dispatching one-off jobs to be run in parallel without needing to know the results of each job, then you can use a WaitGroup. But if you need to collect the results from the goroutines then you should use a channel.

    Since a channel works both ways, I almost always use a channel.

    On another note, as pointed out in the comments, your channel example isn't implemented correctly. You would need a separate channel to indicate there are no more jobs to do. In your case, since you know the number of words in advance, you could just use one buffered channel and receive a fixed number of times, avoiding the need for a separate close/done signal.
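
    For illustration, here is a minimal sketch of that buffered-channel approach (the words slice and the result strings are made up for the example, not taken from the question):

    package main
    
    import "fmt"
    
    func main() {
        words := []string{"foo", "bar", "baz"}
    
        // Buffer sized to the number of jobs, so senders never block
        // and no separate done/close signal is needed.
        results := make(chan string, len(words))
    
        for _, word := range words {
            go func(w string) {
                results <- w + " processed" // send this job's result
            }(word)
        }
    
        // Receive exactly len(words) results, then we're done.
        for i := 0; i < len(words); i++ {
            fmt.Println(<-results)
        }
    }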
