Decode large stream JSON

野的像风 2020-12-16 11:21

I have a massive JSON array stored in a file ("file.json"). I need to iterate through the array and perform an operation on each element.

err = json.Unmarshal(         


        
3 Answers
  • 2020-12-16 11:41

    So, as commenters suggested, you could use the streaming API of "encoding/json" to read one string at a time:

    r := ... // get some io.Reader (e.g. open the big array file)
    d := json.NewDecoder(r)
    // read "["
    d.Token()
    // read strings one by one
    for d.More() {
        s, _ := d.Token()
        // do something with s which is the newly read string
        fmt.Printf("read %q\n", s)
    }
    // (optionally) read "]"
    d.Token()
    

    Note that, for simplicity, I've left out error handling. Real code needs to check the error returned by each d.Token() call.

  • 2020-12-16 11:49

    You can also check the jsparser library, which has been tested with large JSON files and allows stream-based parsing with a minimal memory footprint.

  • 2020-12-16 11:54

    There is an example of this sort of thing here: https://golang.org/pkg/encoding/json/#example_Decoder_Decode_stream.

    package main
    
    import (
        "encoding/json"
        "fmt"
        "log"
        "strings"
    )
    
    func main() {
        const jsonStream = `
                    [
                        {"Name": "Ed", "Text": "Knock knock."},
                        {"Name": "Sam", "Text": "Who's there?"},
                        {"Name": "Ed", "Text": "Go fmt."},
                        {"Name": "Sam", "Text": "Go fmt who?"},
                        {"Name": "Ed", "Text": "Go fmt yourself!"}
                    ]
                `
        type Message struct {
            Name, Text string
        }
        dec := json.NewDecoder(strings.NewReader(jsonStream))
    
        // read open bracket
        t, err := dec.Token()
        if err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%T: %v\n", t, t)
    
        // while the array contains values
        for dec.More() {
            var m Message
            // decode an array value (Message)
            err := dec.Decode(&m)
            if err != nil {
                log.Fatal(err)
            }
    
            fmt.Printf("%v: %v\n", m.Name, m.Text)
        }
    
        // read closing bracket
        t, err = dec.Token()
        if err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%T: %v\n", t, t)
    
    }
    