How to read a large file - line by line?

一整个雨季 2020-11-21 11:44

I want to iterate over each line of an entire file. One way to do this is by reading the entire file, saving it to a list, then going over the line of interest. This method

11 answers
  •  别跟我提以往
    2020-11-21 12:11

    I would strongly recommend against the default file loading, as it is horrendously slow. Look into the numpy functions (e.g. numpy.loadtxt()) and the IOpro functions.

    http://docs.scipy.org/doc/numpy/user/basics.io.genfromtxt.html

    https://store.continuum.io/cshop/iopro/
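
As a rough illustration of the numpy loaders mentioned above, here is a minimal sketch. The sample data is made up and held in memory with io.StringIO so the snippet runs on its own; with a real file you would pass its path instead.

```python
import io
import numpy as np

# Hypothetical sample data standing in for a real file on disk.
sample = io.StringIO("1.0 2.0 3.0\n4.0 5.0 6.0\n7.0 8.0 9.0\n")

# loadtxt parses whitespace-separated numeric rows into a 2-D array.
data = np.loadtxt(sample)

# genfromtxt is similar but tolerates missing fields, which become nan.
sample_with_gap = io.StringIO("1.0,2.0,3.0\n4.0,,6.0\n")
data_with_gap = np.genfromtxt(sample_with_gap, delimiter=",")
```

Both calls return ordinary numpy arrays, so the result can go straight into matrix operations.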

    Then you can break your pairwise operation into chunks:

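    import math
    import numpy as np

    lines_total = 1000      # n: placeholder, set to the number of lines in your file
    lines_per_chunk = 100   # m: placeholder, set to whatever fits in memory
    # The shape must be passed as a tuple.
    similarity = np.zeros((lines_total, lines_total))
    n_chunks = math.ceil(lines_total / lines_per_chunk)
    for i in range(n_chunks):       # range replaces Python 2's xrange
        for j in range(n_chunks):
            # read_lines(a, b) is a function of your choice that returns
            # lines a..b-1; fast_operation is your vectorized pairwise step.
            chunk_i = read_lines(i * lines_per_chunk, (i + 1) * lines_per_chunk)
            chunk_j = read_lines(j * lines_per_chunk, (j + 1) * lines_per_chunk)
            similarity[i * lines_per_chunk:(i + 1) * lines_per_chunk,
                       j * lines_per_chunk:(j + 1) * lines_per_chunk] = fast_operation(chunk_i, chunk_j)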

    Loading data in chunks and then doing matrix operations on each chunk is almost always much faster than processing it element by element.
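
One possible way to fill in the "function of your choice" placeholders is sketched below. It assumes the file holds whitespace-separated numeric rows, and it uses a dot product as a stand-in for the pairwise operation. The names read_lines, dot_products and pairwise_in_chunks are hypothetical, chosen only for this example.

```python
import itertools
import numpy as np

def read_lines(path, start, stop):
    """Return rows start..stop-1 of a numeric text file as a 2-D array."""
    with open(path) as f:
        # islice skips to `start` without keeping the skipped lines in memory.
        rows = list(itertools.islice(f, start, stop))
    # ndmin=2 keeps the result 2-D even when the chunk has a single row.
    return np.loadtxt(rows, ndmin=2)

def dot_products(chunk_a, chunk_b):
    """Stand-in pairwise operation: dot product of every row pair."""
    return chunk_a @ chunk_b.T

def pairwise_in_chunks(path, n, m):
    """Fill an n-by-n result matrix, reading at most m rows per chunk."""
    result = np.zeros((n, n))
    for i in range(0, n, m):
        chunk_i = read_lines(path, i, i + m)
        for j in range(0, n, m):
            chunk_j = read_lines(path, j, j + m)
            # Slices past n are clipped, matching the shorter last chunk.
            result[i:i + m, j:j + m] = dot_products(chunk_i, chunk_j)
    return result
```

Note that this read_lines rescans the file from the top on every call, so it trades extra I/O for low memory use. If that becomes the bottleneck, recording byte offsets once and seeking to them would be a natural next step.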
