Should we synchronize on writing strings? Since strings are immutable, we will never get inconsistent state between a write and a read from two different threads, right?
All functions defined on the string type, whether at the language syntax level or in the standard library, return a new string instance. No function mutates a string in place. Just follow this practice and you will be concurrency-safe.
Should we synchronize on writing strings? Since strings are immutable, we will never get inconsistent state between a write and a read from two different threads, right?
That's the question. The answer is: synchronize on writing strings. It's clear that string variables are mutable and string contents are immutable, as I already explained earlier. To reiterate:
The Go compiler will enforce the immutability of string contents. For example,
package main

func main() {
	var s string = "abc"
	s[0] = '0' // rejected at compile time: string contents cannot be modified
}
Output:
5:7: cannot assign to s[0]
The Go runtime Data Race Detector flags the inconsistent state of mutable string variables updated from different goroutines. For example, writing to a string variable,
package main

import "time"

var s string = "abc"

func main() {
	go func() {
		for {
			s = "abc"
		}
	}()
	go func() {
		for {
			s = "abc"
		}
	}()
	time.Sleep(1 * time.Second)
}
Output:
$ go run -race racer.go
==================
WARNING: DATA RACE
Write at 0x00000050d360 by goroutine 6:
main.main.func2()
/home/peter/src/racer.go:15 +0x3a
Previous write at 0x00000050d360 by goroutine 5:
main.main.func1()
/home/peter/src/racer.go:10 +0x3a
Goroutine 6 (running) created at:
main.main()
/home/peter/src/racer.go:13 +0x5a
Goroutine 5 (running) created at:
main.main()
/home/peter/src/racer.go:8 +0x42
==================
Found 1 data race(s)
exit status 66
$
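One way to act on this answer is to guard the string variable with a sync.Mutex. The following is a minimal sketch, not the original poster's code: the guardedString type and its Set and Get methods are hypothetical names introduced here for illustration, and the loops are bounded so the program terminates.

```go
package main

import (
	"fmt"
	"sync"
)

// guardedString is a hypothetical helper type (not from the original post):
// a string variable whose reads and writes are serialized by a mutex.
type guardedString struct {
	mu sync.Mutex
	s  string
}

// Set replaces the stored string value while holding the lock.
func (g *guardedString) Set(v string) {
	g.mu.Lock()
	g.s = v
	g.mu.Unlock()
}

// Get returns the current string value while holding the lock.
func (g *guardedString) Get() string {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.s
}

func main() {
	g := &guardedString{s: "abc"}

	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := 0; n < 1000; n++ {
				g.Set("abc") // same write as the racy program, now under the lock
			}
		}()
	}
	wg.Wait()

	fmt.Println(g.Get())
}
```

Because every access to the variable happens while the lock is held, running this with go run -race should not produce a data race report.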
Original Post:
The Go Programming Language Specification
String types
A string type represents the set of string values. A string value is a (possibly empty) sequence of bytes. Strings are immutable: once created, it is impossible to change the contents of a string.
A string variable is not immutable. It contains a string descriptor, a struct.
type stringStruct struct {
	str unsafe.Pointer // pointer to the string's bytes
	len int            // length of the string in bytes
}
For example,
package main

import "fmt"

func main() {
	s := "abc"
	fmt.Println(s)
	s = "xyz" // the variable now holds a different string value
	fmt.Println(s)
}
Output:
abc
xyz
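The descriptor is also why unsynchronized writes to a string variable matter. Assuming the standard gc toolchain, a string variable occupies two machine words (the pointer and the length), and the language makes no promise that another goroutine observes an assignment of both words as a single step. The sketch below only measures the size of a string variable; descriptorWords is a hypothetical helper name introduced for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// descriptorWords reports how many machine words a string variable occupies.
// It is a hypothetical helper used only for this illustration.
func descriptorWords() uintptr {
	var s string
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

func main() {
	fmt.Println("words per string variable:", descriptorWords())
}
```

With the gc toolchain this reports 2 words, matching the two fields of the descriptor shown above.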
The string contents are immutable. For example,
// error: cannot assign to s[0]
s[0] = '0'
You need synchronization for access to string variables.
string values are immutable, but variables are not. Variables are, as their name says, variable: their values can be changed.
You don't need synchronization for accessing a string value, because that can't change. If a string value is handed to you, that value (the content of the string) will always remain the same (usage of package unsafe does not count).
You need synchronization when you want to access a variable of string type from multiple goroutines concurrently, if at least one of the accesses is a write (a write that changes the value of the string variable). This is true for variables of any type in Go; the string type is not special in any way.
What does this mean in practice?
If you have a function that receives a string value "hello", you can be sure the string value will stay "hello" no matter what. Consequently, if you don't change the argument yourself (e.g. you don't assign a new value to it), it will always hold the string value "hello".
As a counter-example, if your function receives a slice value []byte{1, 2, 3}, you don't have the same guarantee, because slices are mutable. The caller also has the slice value (the slice header), else it couldn't pass it in the first place. If the caller modifies the elements of the slice concurrently, the slice that was handed to you will also see the changed data, since both share the same backing array. This assumes proper synchronization; without it this would be a data race (and hence undefined behavior).
See this example:
package main

import "fmt"

var sig = make(chan int)

func main() {
	s := []byte{1, 2, 3}
	go func() {
		<-sig
		s[0] = 100 // modifies the shared backing array
		sig <- 0
	}()
	sliceTest(s)
}

func sliceTest(s []byte) {
	fmt.Println("First s =", s)
	sig <- 0 // send signal to modify now
	<-sig    // wait for modification to complete
	fmt.Println("Second s =", s)
}
Output (try it on the Go Playground):
First s = [1 2 3]
Second s = [100 2 3]
Focus on sliceTest(): it receives a slice and prints it. Then it waits a little (it gives a "go" to a concurrent goroutine to modify the slice, and waits for this modification to complete) and prints it again. The slice has changed, yet sliceTest() itself did not modify it.
If sliceTest() received a string argument instead, this could not happen.
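To make the contrast concrete, here is a sketch of the same experiment with a string. The function name stringTest is hypothetical, and it returns the two values it observed (in addition to printing them) purely so the result is easy to check. The concurrent goroutine can only assign a new value to the caller's variable; it has no way to alter the string value already handed to stringTest.

```go
package main

import "fmt"

var sig = make(chan int)

func main() {
	s := "abc"
	go func() {
		<-sig
		s = "xyz" // reassigns the caller's variable; the bytes of "abc" are untouched
		sig <- 0
	}()
	stringTest(s)
	fmt.Println("Caller's s =", s)
}

// stringTest is a hypothetical counterpart of sliceTest that takes a string.
// It returns the two values it observed so the result can be checked.
func stringTest(s string) (first, second string) {
	first = s
	fmt.Println("First s =", s)
	sig <- 0 // send signal to modify now
	<-sig    // wait for modification to complete
	second = s
	fmt.Println("Second s =", s)
	return first, second
}
```

Both lines printed inside stringTest show abc, while the caller's variable holds xyz afterwards. The channel operations order the caller's reads and the goroutine's write, so the variable itself is accessed without a data race.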
See related / possible duplicate: Immutable string and pointer address