Python: suggestions to improve chunk-by-chunk code for reading several million points

一向 2021-01-06 18:17

I wrote some code to read *.las files in Python. *.las files are special ASCII files where each line holds the x, y, z values of a point.

My fun

1 answer
  • 2021-01-06 19:04

    Regarding (1):

    First, why are you using chunks? Just use the lasfile as an iterator (as shown in the tutorial), and process the points one at a time. The following should write all the points inside the polygon to the output file, using the pnpoly function in a generator expression instead of points_inside_poly.

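    from liblas import file as lasfile
    # Note: matplotlib.nxutils was removed in matplotlib 1.3
    from matplotlib.nxutils import pnpoly
    
    # inFile, outFile, verts (polygon vertices) and h (output header)
    # are assumed to be defined as in the question's code.
    with lasfile.File(inFile, None, 'r') as f:
        inside_points = (p for p in f if pnpoly(p.x, p.y, verts))
        with lasfile.File(outFile, mode='w', header=h) as file_out:
            for p in inside_points:
                file_out.write(p)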
    

    The five lines directly above should replace the whole big for-loop. Let's go over them one-by-one:

    • with lasfile.File(inFile...: Using this construction means that the file will be closed automatically when the with block finishes.
    • Now comes the good part: the generator expression that does all the work (the part between the parentheses). It iterates over the input file (for p in f). Every point that is inside the polygon (if pnpoly(p.x, p.y, verts)) is yielded by the generator.
    • We use another with block for the output file
    • and all the points (for p in inside_points, this is where the generator is used)
    • are written to the output file (file_out.write(p))

    Because this method never builds a list (the generator yields the points inside the polygon one at a time, as they are read), you don't waste memory on points that you don't need!
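
    To make the lazy behaviour concrete, here is a minimal sketch that runs without liblas or matplotlib. The names point_in_polygon (a plain ray-casting test standing in for pnpoly) and read_points (a stand-in for iterating over the LAS file) are hypothetical, and points are simple (x, y, z) tuples rather than liblas point objects.

```python
def point_in_polygon(x, y, verts):
    """Ray-casting test: True if (x, y) lies inside the polygon `verts`."""
    inside = False
    j = len(verts) - 1
    for i in range(len(verts)):
        xi, yi = verts[i]
        xj, yj = verts[j]
        # Does the edge (j -> i) straddle the horizontal line through y?
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / float(yj - yi)
            if x < x_cross:
                inside = not inside  # one more crossing to the right of x
        j = i
    return inside


def read_points(points, log):
    """Hypothetical stand-in for a LAS file iterator; logs every point read."""
    for p in points:
        log.append(p)
        yield p


verts = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]  # a square
points = [(1.0, 1.0, 10.0), (5.0, 1.0, 11.0), (2.0, 3.0, 12.0), (-1.0, 2.0, 13.0)]

read_log = []
inside_points = (p for p in read_points(points, read_log)
                 if point_in_polygon(p[0], p[1], verts))

reads_before = len(read_log)       # creating the generator reads nothing
first = next(inside_points)        # reads only until the first match
reads_after_first = len(read_log)
rest = list(inside_points)         # drains the remaining points
```

    Creating the generator expression reads nothing from the source; points are pulled one by one only when the consuming loop asks for them.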

    You should only use chunks if the method shown above doesn't work. When using chunks, you should handle the exception properly, e.g.:

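    from liblas import LASException
    
    chunkSize = 100000
    # f is the open input file; range works in both Python 2 and 3
    for i in range(0, len(f), chunkSize):
        try:
            chunk = f[i:i+chunkSize]
        except LASException:
            # The last chunk may be shorter than chunkSize: take what is left
            rem = len(f) - i
            chunk = f[i:i+rem]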
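
    An alternative sketch, assuming only that the file object supports len() and slicing as in the code above: clamp the end index of each slice up front, so the final, shorter chunk never asks for points past the end and no exception handling is needed. The helper name iter_chunks is hypothetical, and a plain list stands in for the file here.

```python
def iter_chunks(seq, chunk_size):
    """Yield consecutive slices of `seq`, each at most `chunk_size` long."""
    n = len(seq)
    for start in range(0, n, chunk_size):
        end = min(start + chunk_size, n)  # clamp so the last slice stays in range
        yield seq[start:end]


points = list(range(10))  # stand-in for the points in a file
chunks = list(iter_chunks(points, 4))
```

    With ten points and a chunk size of four, this yields two full chunks followed by one chunk holding the last two points.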
    

    Regarding (2): Sorry, but I fail to understand what you are trying to accomplish here. What do you mean by "video print"?

    Regarding (3): since you are not using the original chunk anymore, you can reuse the name. Remember that in Python a "variable" is just a name tag attached to an object.
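
    A small illustration of the name-tag idea, with made-up values standing in for a chunk of points: assigning to chunk again does not modify the old list, it only moves the name to a new object.

```python
chunk = [1, 5, 8, 12]                 # illustrative values, not real points
original = chunk                      # a second name tag on the same list
chunk = [v for v in chunk if v > 4]   # rebinds the name to a new, filtered list
same_object = chunk is original
```

    Here original still refers to the unfiltered list. Without that second name, nothing would refer to the old list after the rebinding and Python would be free to reclaim its memory.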

    Regarding (4): you aren't using the else, so leave it out completely.
