Why use arrays instead of slices?

温柔的废话 2020-12-08 20:47

I have been reading up on Go, and got stumped thinking about this fundamental question.

In Go, it is quite clear that slices are more flexible, and can generally be

4 Answers
  • 2020-12-08 21:31

    To supplement Stephen Weinberg's answer:

    So, what are some real examples of "planning the detailed layout of memory" or "help avoid allocation" that slices would be unsuited for?

    Here's an example for "planning the detailed layout of memory". There are many file formats, and a typical one looks like this: it starts with a "magic number", followed by an informational header whose structure is usually fixed. The header contains information about the content. For an image file, for example, it holds things like the image size (width, height), pixel format, compression used, header size, image data offset and the like (basically, it describes the rest of the file and how to interpret and process it).

    If you want to implement a file format in Go, an easy and convenient way is to create a struct containing the header fields of the format. To read a file of that format, you can use the binary.Read() function to read the whole header struct into a variable; similarly, to write one, you can use binary.Write() to write the complete header in one step into the file (or wherever you send the data).

    The header might contain tens or even a hundred fields, and you can still read or write it with just one function call.

    As you can see, for this to work in one step, the "memory layout" of the header struct (that is, the order and the fixed sizes of its fields) must match exactly the byte layout as it is saved (or should be saved) in the file.

    And where do arrays come into the picture?

    Many file formats are complex because they aim to be general and to allow a wide range of uses and functionality. Often you don't want to implement or handle everything the format supports, either because you don't care (you just want to extract some info) or because you don't have to (you have guarantees that the input only uses a subset or a fixed variant of the many cases the format fully supports).

    So what do you do if you have a header specification with many fields but you only need a few of them? You can define a struct that contains the fields you need, and between those fields you can place arrays whose sizes match the fields you don't care about or don't need. This ensures that you can still read the whole header with one function call; the arrays simply act as placeholders for the unused data in the file. If you won't use that data, you may also use the blank identifier as the field name in the header struct definition.

    Theoretical example

    For an easy example, let's implement a format where the magic is "TGI" (Theoretical Go Image) and the header contains the following fields: 2 reserved words (16 bits each), 1 dword image width, 1 dword image height, then 15 "don't care" dwords, and finally the image save time as an 8-byte value holding nanoseconds since January 1, 1970 UTC.

    This can be modeled with a struct like this (magic number excluded):

    type TGIHeader struct {
        _        uint16 // Reserved
        _        uint16 // Reserved
        Width    uint32
        Height   uint32
        _        [15]uint32 // 15 "don't care" dwords
        SaveTime int64
    }
    

    To read a TGI file and print useful info:

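    import (
        "bytes"
        "encoding/binary"
        "errors"
        "fmt"
        "io"
        "os"
        "time"
    )
    
    // ShowInfo opens the named file, verifies the "TGI" magic number and
    // prints the image size and save time found in the header.
    func ShowInfo(name string) error {
        f, err := os.Open(name)
        if err != nil {
            return err
        }
        defer f.Close()
    
        // io.ReadFull fails if fewer than 3 bytes are available,
        // whereas a bare f.Read may legally return a short read.
        magic := make([]byte, 3)
        if _, err = io.ReadFull(f, magic); err != nil {
            return err
        }
        if !bytes.Equal(magic, []byte("TGI")) {
            return errors.New("not a TGI file")
        }
    
        // Read the whole fixed-size header in one call.
        th := TGIHeader{}
        if err = binary.Read(f, binary.LittleEndian, &th); err != nil {
            return err
        }
    
        fmt.Printf("%s is a TGI file,\n\timage size: %dx%d\n\tsaved at: %v\n",
            name, th.Width, th.Height, time.Unix(0, th.SaveTime))
    
        return nil
    }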
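    To see that the padding array really occupies space in the encoded header, here is a minimal sketch that writes a TGIHeader into an in-memory buffer and reads it back. The helper name roundTrip and the sample values are made up for illustration; the struct is repeated so the program is self-contained.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// TGIHeader repeats the struct from the answer above.
type TGIHeader struct {
	_        uint16 // Reserved
	_        uint16 // Reserved
	Width    uint32
	Height   uint32
	_        [15]uint32 // 15 "don't care" dwords
	SaveTime int64
}

// roundTrip is a hypothetical helper: it encodes h into a buffer,
// notes how many bytes were written, and decodes them again.
func roundTrip(h TGIHeader) (TGIHeader, int, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, &h); err != nil {
		return TGIHeader{}, 0, err
	}
	n := buf.Len()

	var out TGIHeader
	if err := binary.Read(&buf, binary.LittleEndian, &out); err != nil {
		return TGIHeader{}, 0, err
	}
	return out, n, nil
}

func main() {
	in := TGIHeader{Width: 640, Height: 480, SaveTime: 1}
	out, n, err := roundTrip(in)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	// Blank fields are counted too: 2+2+4+4+15*4+8 = 80 bytes.
	fmt.Println("encoded bytes:", n, "binary.Size:", binary.Size(in))
	fmt.Printf("decoded: %dx%d\n", out.Width, out.Height)
}
```

    On writing, encoding/binary emits zero bytes for the blank fields; on reading, it skips over them, so the named fields still line up with the file.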
    
  • 2020-12-08 21:49

    As said by Akavall, arrays are hashable. That means they can be used as map keys (provided their element type is itself comparable).

    They are also passed by value. Each time you pass an array to a function or assign it to another variable, a complete copy of it is made.
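    A small sketch of the copy semantics, with made-up function names: a write through an array parameter is invisible to the caller, while the same write through a slice parameter is not.

```go
package main

import "fmt"

// setFirstArray receives a copy of the array; the caller's array is untouched.
func setFirstArray(a [3]int) {
	a[0] = 99
}

// setFirstSlice receives a copy of the slice header, which still points at
// the caller's backing array, so the write is visible to the caller.
func setFirstSlice(s []int) {
	s[0] = 99
}

func main() {
	arr := [3]int{1, 2, 3}
	sl := []int{1, 2, 3}

	setFirstArray(arr)
	setFirstSlice(sl)
	fmt.Println(arr, sl) // [1 2 3] [99 2 3]

	cp := arr // assignment copies all three elements
	cp[1] = 42
	fmt.Println(arr, cp) // [1 2 3] [1 42 3]
}
```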

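    As struct fields, they can be serialized by encoding/binary, which only handles fixed-size values; a struct containing a slice does not qualify.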

    They can also be used to control memory layout. Since an array is not a reference, placing one in a struct allocates that much memory as part of the struct itself, instead of storing the equivalent of a pointer there as a slice would.
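    One way to observe this is unsafe.Sizeof. The sketch below uses two hypothetical struct types; the slice figure assumes the standard gc toolchain, where a slice header is three machine words.

```go
package main

import (
	"fmt"
	"unsafe"
)

// withArray stores its 16 bytes inline, as part of the struct itself.
type withArray struct {
	Sum [16]byte
}

// withSlice stores only a slice header (pointer, length, capacity);
// the bytes themselves live elsewhere.
type withSlice struct {
	Sum []byte
}

// sizes returns the size of each struct and of one machine word.
func sizes() (array, slice, word uintptr) {
	return unsafe.Sizeof(withArray{}), unsafe.Sizeof(withSlice{}), unsafe.Sizeof(uintptr(0))
}

func main() {
	array, slice, word := sizes()
	fmt.Println("struct with [16]byte:", array) // 16
	// Three words: 24 on 64-bit platforms, 12 on 32-bit ones.
	fmt.Println("struct with []byte:", slice, "word size:", word)
}
```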

    Bottom line, don't use an array unless you know what you are doing.


    Hashable/serializable are all nice to have, but I'm just not sure if they are indeed that compelling to have

    What would you do if you wanted a map keyed by MD5 hashes? You can't use a byte slice as the key, so without arrays you would need something like this to get around the type system:

    // 16 bytes
    type hashableMd5 struct {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p byte}
    

    Then you would have to write a serialization function for it. Because arrays are hashable, you can simply use a [16]byte instead.
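    As a sketch of the [16]byte approach: md5.Sum from crypto/md5 already returns a [16]byte, so its result can be used directly as a map key. The index type and its methods are invented for this example.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// index maps an MD5 digest to a name. The type and its contents are
// made up for this example.
type index map[[md5.Size]byte]string

// add stores name under the MD5 digest of data.
func (ix index) add(name string, data []byte) {
	ix[md5.Sum(data)] = name // md5.Sum returns a [16]byte
}

// lookup reports which name, if any, was stored for identical data.
func (ix index) lookup(data []byte) (string, bool) {
	name, ok := ix[md5.Sum(data)]
	return name, ok
}

func main() {
	ix := index{}
	ix.add("greeting", []byte("hello"))

	name, ok := ix.lookup([]byte("hello"))
	fmt.Printf("%q %v\n", name, ok) // "greeting" true

	name, ok = ix.lookup([]byte("world"))
	fmt.Printf("%q %v\n", name, ok) // "" false
}
```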

    Sounds like getting closer to C's malloc, sizeof

    Nope, that has nothing to do with malloc or sizeof. Those are to allocate memory and get the size of a variable.

    However, CGo is another use case for this. The cgo command creates types that have the same memory layout as their corresponding C types. To do this, it sometimes needs to insert unnamed arrays for padding.

    If problems can be solved with ... nil/insignificant performance penalty using slices ...

    Arrays also avoid a level of indirection, which makes certain kinds of code faster. Of course, this is such a minor optimization that it is insignificant in nearly all cases.

  • 2020-12-08 21:49

    To expand on this:

    Arrays are useful when planning the detailed layout of memory and sometimes can help avoid allocation, but primarily they are a building block for slices.

    Arrays can be more efficient when considering the overhead of heap allocation. Think about the garbage collector, heap management and fragmentation, etc.

    For example, if you have a local array variable like var x [8]int that does not escape the function (nothing refers to it after the function returns), it will most probably be allocated on the stack. And stack allocation is much cheaper than heap allocation.
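    A rough way to check this is testing.AllocsPerRun, which reports the average number of heap allocations per call. The sketch below is deliberately contrived: the slice is made to escape through a package-level variable (sink, a name invented here), because a small non-escaping slice may also end up on the stack. The numbers printed depend on the compiler's escape analysis, so none are shown.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the slice reachable after sumSlice returns, which makes the
// compiler place its backing array on the heap.
var sink []int

// sumArray uses a local fixed-size array that never leaves the function.
func sumArray() int {
	var x [8]int
	for i := range x {
		x[i] = i
	}
	total := 0
	for _, v := range x {
		total += v
	}
	return total
}

// sumSlice does the same work with a slice that escapes through sink.
func sumSlice() int {
	x := make([]int, 8)
	for i := range x {
		x[i] = i
	}
	sink = x
	total := 0
	for _, v := range x {
		total += v
	}
	return total
}

func main() {
	fmt.Println("array allocs/run:", testing.AllocsPerRun(1000, func() { sumArray() }))
	fmt.Println("slice allocs/run:", testing.AllocsPerRun(1000, func() { sumSlice() }))
}
```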

    Also for nested structures like arrays of arrays or arrays inside structs, it is cheaper to allocate them in one blob instead of in several pieces.

    So, use arrays for relatively short sequences of fixed size, e.g. an IP address.

  • 2020-12-08 21:51

    One practical difference is that arrays are hashable, while slices are not.
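    A minimal sketch of what that allows, using a made-up point type: arrays support == and can serve as map keys, while the slice equivalents are rejected by the compiler.

```go
package main

import "fmt"

// point is a hypothetical 2-D grid coordinate stored as a fixed-size array.
type point [2]int

// countVisits tallies how often each coordinate occurs in path.
func countVisits(path []point) map[point]int {
	visits := make(map[point]int)
	for _, p := range path {
		visits[p]++ // arrays are valid map keys
	}
	return visits
}

func main() {
	a, b := point{1, 2}, point{1, 2}
	fmt.Println(a == b) // true: arrays compare element by element

	origin := point{0, 0}
	visits := countVisits([]point{origin, a, origin})
	fmt.Println(visits[origin], visits[a]) // 2 1

	// The slice equivalents do not compile:
	//   []int{1, 2} == []int{1, 2}  // slices can only be compared to nil
	//   map[[]int]int{}             // invalid map key type
}
```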
