问题
I wrote a code to read *.las
file in Python. *las
file are special ascii file where each line is x,y,z
value of points.
My function read N
. number of points and check if they are inside a polygon with points_inside_poly
.
I have the following questions:
- When I arrive at the end of the file I get this message:
LASException: LASError in "LASReader_GetPointAt": point subscript out of range
because the number of points is under the chunk dimension. I cannot figure how to resolve this problem. a = [file_out.write(c[m]) for m in xrange(len(c))]
I usea =
in order to avoid video print. Is it correct?- In
c = [chunk[l] for l in index]
I create a new listc
because I am not sure that replacing a new chunk is the smart solution (ex:chunk = [chunk[l] for l in index]
). - In a statement
if else...else
I usepass
. Is this the right choice?
Really thank for help. It's important to improve listen suggestions from expertise!!!!
import shapefile
import numpy
import numpy as np
from numpy import nonzero
from liblas import file as lasfile
from shapely.geometry import Polygon
from matplotlib.nxutils import points_inside_poly
# open shapefile (polygon)
sf = shapefile.Reader(poly)
shapes = sf.shapes()
# extract vertices
verts = np.array(shapes[0].points,float)
# open las file
f = lasfile.File(inFile,None,'r') # open LAS
# read "header"
h = f.header
# create a file where store the points
file_out = lasfile.File(outFile,mode='w',header= h)
chunkSize = 100000
for i in xrange(0,len(f), chunkSize):
chunk = f[i:i+chunkSize]
x,y = [],[]
# extraxt x and y value for each points
for p in xrange(len(chunk)):
x.append(chunk[p].x)
y.append(chunk[p].y)
# zip all points
points = np.array(zip(x,y))
# create an index where are present the points inside the polygon
index = nonzero(points_inside_poly(points, verts))[0]
# if index is not empty do this otherwise "pass"
if len(index) != 0:
c = [chunk[l] for l in index] #Is It correct to create a new list or i can replace chunck?
# save points
a = [file_out.write(c[m]) for m in xrange(len(c))] #use a = in order to avoid video print. Is it correct?
else:
pass #Is It correct to use pass?
f.close()
file_out.close()
code proposed by @Roland Smith and changed by Gianni
f = lasfile.File(inFile,None,'r') # open LAS
h = f.header
# change the software id to libLAS
h.software_id = "Gianni"
file_out = lasfile.File(outFile,mode='w',header= h)
f.close()
sf = shapefile.Reader(poly) #open shpfile
shapes = sf.shapes()
for i in xrange(len(shapes)):
verts = np.array(shapes[i].points,float)
inside_points = [p for p in lasfile.File(inFile,None,'r') if pnpoly(p.x, p.y, verts)]
for p in inside_points:
file_out.write(p)
f.close()
file_out.close()
i used these solution: 1) reading f = lasfile.File(inFile,None,'r') and after the read head because i need in the *.las output file 2) close the file 3) i used inside_points = [p for p in lasfile.File(inFile,None,'r') if pnpoly(p.x, p.y, verts)] instead of
with lasfile.File(inFile, None, 'r') as f:
... inside_points = [p for p in f if pnpoly(p.x, p.y, verts)]
...
because i always get this error message
Traceback (most recent call last): File "", line 1, in AttributeError: _exit_
回答1:
Regarding (1):
First, why are you using chunks? Just use the lasfile as an iterator (as shown in the tutorial), and process the points one at a time. The following should get write all the points inside the polygon to the output file, by using the pnpoly function in a list comprehension instead of points_inside_poly
.
from liblas import file as lasfile
import numpy as np
from matplotlib.nxutils import pnpoly
with lasfile.File(inFile, None, 'r') as f:
inside_points = (p for p in f if pnpoly(p.x, p.y, verts))
with lasfile.File(outFile,mode='w',header= h) as file_out:
for p in inside_points:
file_out.write(p)
The five lines directly above should replace the whole big for
-loop. Let's go over them one-by-one:
with lasfile.File(inFile
...: Using this construction means that the file will be closed automatically when thewith
block finishes.- Now comes the good part, the generator expression that does all the work (the part between
()
). It iterates over the input file (for p in f
). Every point that is inside the polygon (if pnpoly(p.x, p.y, verts)
) is added to the generator. - We use another
with
block for the output file - and all the points (
for p in inside_points
, this is were the generator is used) - are written to the output file (
file_out.write(p)
)
Because this method only adds the points that are inside the polygon to the list, you don't waste memory on points that you don't need!
You should only use chunks if the method shown above doesn't work. When using chunks you should handle the exception properly. E.g:
from liblas import LASException
chunkSize = 100000
for i in xrange(0,len(f), chunkSize):
try:
chunk = f[i:i+chunkSize]
except LASException:
rem = len(f)-i
chunk = f[i:i+rem]
Regarding (2): Sorry, but I fail to understand what you are trying to accomplish here. What do you mean by "video print"?
Regarding (3): since you are not using the original chunk anymore, you can re-use the name. Realize that in python a "variable" is just a nametag.
Regarding (4): you aren't using the else
, so leave it out completely.
来源:https://stackoverflow.com/questions/12769353/python-suggestions-to-improve-a-chunk-by-chunk-code-to-read-several-millions-of