How do I read in a large flat file?


I have a flat file that has 339276 lines of text in it, for a size of 62.1 MB. I am attempting to read in all the lines, parse them based on some conditions I have and then insert

4 Answers
  • 2021-02-05 17:59
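    package main

    import (
        "bufio"
        "fmt"
        "log"
        "os"
    )

    func main() {
        fileName := "assets/file.txt"
        file, err := os.Open(fileName)
        if err != nil {
            log.Fatal(err)
        }
        defer file.Close()

        // Scanner splits on newlines by default and yields one line per Scan.
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            fmt.Println(scanner.Text())
        }
        // Scan stops silently on errors such as bufio.ErrTooLong
        // (a line longer than 64 KB by default), so check Err afterwards.
        if err := scanner.Err(); err != nil {
            log.Fatal(err)
        }
    }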
    
  • 2021-02-05 18:02

    Calling scanner.Scan() and scanner.Text() in a loop works perfectly for me on files much larger than this, so I suspect some of your lines exceed the buffer capacity. In that case:

    • Check your line endings.
    • Check which Go version you are using: path, err := r.ReadLine("\n") // 0x0A separator = newline? It looks like func (b *bufio.Reader) ReadLine() (line []byte, isPrefix bool, err error), which takes no separator argument, has the return value isPrefix specifically for your use case: http://golang.org/pkg/bufio/#Reader.ReadLine
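
    To illustrate the isPrefix point, here is a minimal sketch of joining the fragments that ReadLine returns when a line is longer than the reader's buffer. The names readLongLines and splitLongLines are made up for this example, and the tiny 16-byte buffer is only there to force the long-line case; none of this comes from the question.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLongLines (hypothetical helper) returns every line from r, whatever its
// length. ReadLine hands back at most one buffer's worth of data per call and
// sets isPrefix while the line continues, so fragments are joined until
// isPrefix is false.
func readLongLines(r io.Reader, bufSize int) ([]string, error) {
	br := bufio.NewReaderSize(r, bufSize)
	var lines []string
	var cur []byte
	for {
		frag, isPrefix, err := br.ReadLine()
		if err == io.EOF {
			// Input ended on a buffer-sized fragment with no newline.
			if len(cur) > 0 {
				lines = append(lines, string(cur))
			}
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
		// Copy the fragment: it is only valid until the next read.
		cur = append(cur, frag...)
		if !isPrefix {
			lines = append(lines, string(cur))
			cur = cur[:0]
		}
	}
}

// splitLongLines is a convenience wrapper so the sketch can run on a string
// instead of a file.
func splitLongLines(input string, bufSize int) ([]string, error) {
	return readLongLines(strings.NewReader(input), bufSize)
}

func main() {
	long := strings.Repeat("0123456789", 4) // 40 bytes, longer than the buffer
	lines, err := splitLongLines("short\n"+long+"\nlast", 16)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, l := range lines {
		fmt.Println(len(l), l)
	}
}
```

    With a real file you would pass the *os.File to readLongLines and keep the default buffer size. If you would rather stay with bufio.Scanner, calling scanner.Buffer with a larger maximum before the first Scan is the other way around the same limit.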
  • 2021-02-05 18:04

    It's not clear that it's necessary to read in all the lines before parsing them and inserting them into a database. Try to avoid that.

    You have a small file: "a flat file that has 339276 lines of text in it for a size of 62.1 MB." If you really do need every line at once, it will fit comfortably in memory. For example,

    package main

    import (
        "bytes"
        "fmt"
        "io"
        "os"
    )

    // readLines reads the whole file into memory and splits it into lines.
    // Each line keeps its trailing '\n'; only the last line may lack one.
    func readLines(filename string) ([]string, error) {
        var lines []string
        // os.ReadFile needs Go 1.16+; use ioutil.ReadFile on older versions.
        data, err := os.ReadFile(filename)
        if err != nil {
            return lines, err
        }
        buf := bytes.NewBuffer(data)
        for {
            line, err := buf.ReadString('\n')
            if len(line) > 0 {
                lines = append(lines, line)
            }
            if err == io.EOF {
                break
            }
            if err != nil {
                return lines, err
            }
        }
        return lines, nil
    }

    func main() {
        // a flat file that has 339276 lines of text in it for a size of 62.1 MB
        filename := "flat.file"
        lines, err := readLines(filename)
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Println(len(lines))
    }
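
    For the first point, avoiding the read-everything step, a rough sketch of handling each line as it is read might look like the following. The callback, the names processLines and countMatching, and the "keep" prefix are all assumptions made for illustration; the counting stands in for whatever parsing and database insert you actually do.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// processLines (hypothetical helper) calls handle once per line, so only one
// line is held in memory at a time.
func processLines(r io.Reader, handle func(line string) error) error {
	sc := bufio.NewScanner(r)
	// Allow lines up to 1 MiB; the default limit is bufio.MaxScanTokenSize (64 KiB).
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		if err := handle(sc.Text()); err != nil {
			return err
		}
	}
	return sc.Err()
}

// countMatching is a stand-in for "parse, then insert": it counts the lines
// that start with prefix instead of writing them to a database.
func countMatching(input, prefix string) (int, error) {
	n := 0
	err := processLines(strings.NewReader(input), func(line string) error {
		if strings.HasPrefix(line, prefix) {
			n++
		}
		return nil
	})
	return n, err
}

func main() {
	n, err := countMatching("keep 1\nskip 2\nkeep 3\n", "keep")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(n) // prints 2
}
```

    With a file, you would pass the result of os.Open to processLines and do the insert inside the callback; returning an error from the callback stops the scan early.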
    
  • 2021-02-05 18:07

    It seems to me this variant of readLines is shorter and faster than the one suggested by peterSO:

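    import (
        "os"
        "strings"
    )

    // readLines reads the whole file and returns its lines keyed by
    // zero-based line number, without the trailing "\n".
    func readLines(filename string) (map[int]string, error) {
        lines := make(map[int]string)

        // os.ReadFile needs Go 1.16+; use ioutil.ReadFile on older versions.
        data, err := os.ReadFile(filename)
        if err != nil {
            return nil, err
        }

        // If the file ends with "\n", Split yields a final empty string,
        // which is stored as one extra empty entry.
        for n, line := range strings.Split(string(data), "\n") {
            lines[n] = line
        }

        return lines, nil
    }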
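
    One detail to watch with the strings.Split approach: when the text ends with a newline, Split returns one extra empty element, so the line count comes out one higher than the number of lines in the file. A small sketch of trimming it, using a made-up helper name splitLines and a slice rather than a map:

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines (hypothetical helper) splits text on "\n" and drops the empty
// final element that strings.Split produces when text ends with a newline.
func splitLines(text string) []string {
	lines := strings.Split(text, "\n")
	if n := len(lines); n > 0 && lines[n-1] == "" {
		lines = lines[:n-1]
	}
	return lines
}

func main() {
	fmt.Println(len(strings.Split("a\nb\n", "\n"))) // 3: "a", "b", ""
	fmt.Println(len(splitLines("a\nb\n")))          // 2: "a", "b"
}
```

    Empty lines in the middle of the file are kept; only the artifact after the final newline is removed.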
    