I recently asked this question and the answers increased my understanding, but they didn't solve the actual problem I had. So, I will try to ask a similar but different question as follows.
Suppose that I want to access a random rune element of a string. One way is:
func RuneElement(str string, idx int) rune {
	n := 0 // rune counter; the range index is a byte offset, not a rune index
	for _, c := range str {
		if n == idx {
			return c
		}
		n++
	}
	return 0 // out of range -> proper handling is needed
}
What if I want to call such a function many times? I guess what I am looking for is an operator or function like str[i] (which returns a byte) that returns the rune element at the i-th position. Why can this element be accessed using for ... range but not through a function like str.At(i), for example?
string values in Go store the UTF-8 encoded byte sequence of the text. This is a design decision that has been made and it won't change.
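To make the byte-oriented storage concrete, here is a small sketch. The helper name ByteAndRuneLen is made up for illustration; len and utf8.RuneCountInString are standard.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// ByteAndRuneLen (a hypothetical helper) reports the length of s
// in bytes and in runes.
func ByteAndRuneLen(s string) (bytes, runes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Go世界" // each of the two CJK characters takes three bytes in UTF-8
	b, r := ByteAndRuneLen(s)
	fmt.Println(b, r)        // 8 4
	fmt.Printf("%x\n", s[2]) // e4: the first byte of '世', not the rune itself
}
```

Indexing with s[i] always yields a single byte, which is why it cannot return a multi-byte rune.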
If you want to get the rune at an arbitrary index, you have to decode the bytes up to that position; there is no way around that (for ... range does this decoding too). There is no "shortcut". The chosen representation just doesn't provide this out of the box.
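As a rough illustration of the decoding involved, the following sketch walks the string with utf8.DecodeRuneInString, which is essentially the work for ... range does. The function name RuneAtDecode and its (rune, bool) return shape are assumptions made for this example, not something from the question.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// RuneAtDecode (hypothetical name) returns the idx-th rune of s, counting
// runes rather than bytes, and reports whether idx was in range.
// It has to decode every rune before idx, so it is O(n) in len(s).
func RuneAtDecode(s string, idx int) (rune, bool) {
	if idx < 0 {
		return 0, false
	}
	for len(s) > 0 {
		r, size := utf8.DecodeRuneInString(s)
		if idx == 0 {
			return r, true
		}
		idx--
		s = s[size:] // skip the bytes of the rune just decoded
	}
	return 0, false
}

func main() {
	r, ok := RuneAtDecode("Go世界", 3)
	fmt.Println(string(r), ok) // 界 true
}
```

The cost is the same as the range loop: each call starts decoding from the first byte.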
If you have to do this frequently, you should change your input and use a []rune instead of a string, as it's a slice and can be indexed efficiently. A string in Go is not a []rune. A string in Go is effectively a read-only []byte (UTF-8). Period.
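A minimal sketch of that approach: convert once, then index as often as needed. The bounds-checked helper IndexRunes is a hypothetical name introduced only for this example.

```go
package main

import "fmt"

// IndexRunes (hypothetical helper) returns rs[idx] and reports whether
// idx was in range. Indexing a slice is O(1).
func IndexRunes(rs []rune, idx int) (rune, bool) {
	if idx < 0 || idx >= len(rs) {
		return 0, false
	}
	return rs[idx], true
}

func main() {
	s := "Go世界"
	rs := []rune(s) // a single O(n) decoding pass; each rune occupies 4 bytes

	fmt.Println(len(s), len(rs)) // 8 4
	r, ok := IndexRunes(rs, 2)
	fmt.Println(string(r), ok)   // 世 true
	fmt.Println(string(rs[1:3])) // o世
}
```

The trade-off is memory: a []rune uses four bytes per character regardless of how compact the UTF-8 encoding was.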
If you can't change the input type, you may build an internal cache that maps each string to its []rune:
var cache = map[string][]rune{}

func RuneAt(s string, idx int) rune {
	rs, ok := cache[s]
	if !ok {
		rs = []rune(s) // convert once, then reuse the result
		cache[s] = rs
	}
	if idx < 0 || idx >= len(rs) {
		return 0 // out of range
	}
	return rs[idx]
}
Whether this is worth it depends on the use case: if RuneAt() is called with a small set of strings, this may improve performance a lot. If the passed strings are more or less unique, this will result in worse performance and a lot of memory usage. Also, this implementation is not safe for concurrent use.
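If concurrent use is required, one possible variant guards the map with a sync.RWMutex. This is only a sketch under assumptions: the RuneCache type, its constructor, and the (rune, bool) return shape are invented for illustration, and the cache still grows without bound.

```go
package main

import (
	"fmt"
	"sync"
)

// RuneCache is a hypothetical concurrency-safe variant of the cache above.
type RuneCache struct {
	mu sync.RWMutex
	m  map[string][]rune
}

// NewRuneCache returns an empty cache ready for use.
func NewRuneCache() *RuneCache {
	return &RuneCache{m: make(map[string][]rune)}
}

// RuneAt returns the idx-th rune of s and reports whether idx was in range.
func (c *RuneCache) RuneAt(s string, idx int) (rune, bool) {
	c.mu.RLock()
	rs, ok := c.m[s]
	c.mu.RUnlock()
	if !ok {
		// Two goroutines may both convert the same string here;
		// that is wasteful but harmless, since the results are equal.
		rs = []rune(s)
		c.mu.Lock()
		c.m[s] = rs
		c.mu.Unlock()
	}
	if idx < 0 || idx >= len(rs) {
		return 0, false
	}
	return rs[idx], true
}

func main() {
	c := NewRuneCache()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			r, _ := c.RuneAt("Go世界", i)
			fmt.Println(i, string(r)) // line order varies between runs
		}(i)
	}
	wg.Wait()
}
```

Read locks keep lookups of already cached strings cheap; the write lock is taken only on a miss.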
Source: https://stackoverflow.com/questions/44527223/access-random-rune-element-of-string-without-using-for-range