Speed up Python's struct.unpack

Question

I am trying to speed up my script. It reads a pcap file containing Velodyne HDL-32 lidar data and extracts the X, Y, Z, and intensity values. Profiling with python -m cProfile ./spTestPcapToLas.py shows that most of the time is spent in calls to my readDataPacket() function. In a small test (80 MB file), the unpacking portion takes around 56% of the execution time.

I call the readDataPacket function like this (chunk refers to the pcap file):

packets = []
for packet in chunk:
    memoryView = memoryview(packet.raw())
    udpDestinationPort = unpack('!H', memoryView[36:38].tobytes())[0]  # UDP ports are unsigned 16-bit

    if udpDestinationPort == 2368:
        packets += readDataPacket(memoryView)

The readDataPacket() function itself is defined like this:

def readDataPacket(memoryView):
    firingData = memoryView[42:]    
    firingDataStartingByte = 0    
    laserBlock = []

    for i in xrange(firingBlocks):
        rotational = unpack('<H', firingData[firingDataStartingByte+2:firingDataStartingByte+4])[0]        
        startingByte = firingDataStartingByte+4
        laser = []
        for j in xrange(lasers):   
            distanceInformation = unpack('<H', firingData[startingByte:(startingByte + 2)])[0] * 0.002
            intensity = unpack('<B', firingData[(startingByte + 2)])[0]   
            laser.append([distanceInformation, intensity])
            startingByte += 3
        firingDataStartingByte += 100
        laserBlock.append([rotational, laser])

    return laserBlock

Any ideas on how I can speed up the process? By the way, I am using numpy for the X, Y, Z, Intensity calculations.

Answer 1:

NumPy lets you do this very quickly. In this case I think the easiest way is to use the ndarray constructor directly, giving it the offset and strides of each field:

import numpy as np

def with_numpy(buffer):
    # Construct ndarray with: shape, dtype, buffer, offset, strides.
    rotational = np.ndarray((firingBlocks,), '<H', buffer, 42+2, (100,))
    distance = np.ndarray((firingBlocks,lasers), '<H', buffer, 42+4, (100,3))
    intensity = np.ndarray((firingBlocks,lasers), '<B', buffer, 42+6, (100,3))
    return rotational, distance*0.002, intensity

This returns separate arrays instead of a nested list, which should be much easier to process further. As input it takes a buffer object (in Python 2) or anything else that exposes the buffer interface; exactly which objects qualify depends on your Python version (in Python 3, for example, bytes and memoryview objects both work). This method is very fast:

import numpy as np

firingBlocks = 10**4
lasers = 32
packet_raw = np.random.bytes(42 + firingBlocks*100)

%timeit readDataPacket(memoryview(packet_raw))
# 1 loop, best of 3: 807 ms per loop
%timeit with_numpy(packet_raw)
# 100 loops, best of 3: 10.8 ms per loop
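The same packet layout can also be described with a structured dtype and read with np.frombuffer. This is only a sketch of that alternative: the field names, parse_blocks, and make_block are hypothetical, the layout (42 header bytes, 100-byte firing blocks, 3 bytes per laser) is taken from the question's code, and the flag value in the synthetic packet is a placeholder.

```python
import struct
import numpy as np

HEADER_BYTES = 42   # bytes skipped before the firing data, as in the question
LASERS = 32

# One 100-byte firing block: 2-byte flag, 2-byte rotation, 32 x (distance, intensity).
laser_dtype = np.dtype([('distance', '<u2'), ('intensity', 'u1')])
block_dtype = np.dtype([('flag', '<u2'),
                        ('rotational', '<u2'),
                        ('lasers', laser_dtype, (LASERS,))])
assert block_dtype.itemsize == 100   # structured dtypes are packed by default

def parse_blocks(raw, firing_blocks):
    """Return rotational, scaled distance, and intensity arrays for one packet."""
    blocks = np.frombuffer(raw, dtype=block_dtype,
                           count=firing_blocks, offset=HEADER_BYTES)
    lasers = blocks['lasers']
    return blocks['rotational'], lasers['distance'] * 0.002, lasers['intensity']

def make_block(rotational, base):
    """Build one synthetic firing block (hypothetical test helper)."""
    body = b''.join(struct.pack('<HB', base + k, k) for k in range(LASERS))
    return struct.pack('<HH', 0, rotational) + body   # flag 0 is a placeholder

packet = b'\x00' * HEADER_BYTES + make_block(100, 1000) + make_block(200, 2000)
rotational, distance, intensity = parse_blocks(packet, 2)
```

Here rotational has shape (2,), while distance and intensity have shape (2, 32), matching the arrays returned by with_numpy.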

Answer 2:

Compile a Struct ahead of time to avoid the wrapper overhead of the module-level functions. Do it outside the loops, so the construction cost is not paid repeatedly.

import struct

# Compile each format once and keep the bound unpack methods.
unpack_ushort = struct.Struct('<H').unpack
unpack_ushort_byte = struct.Struct('<HB').unpack

The Struct methods themselves are implemented in C in CPython, and the module-level functions end up delegating to the same code after looking up (or compiling) the format string. Building the Struct once and storing its bound methods therefore saves a non-trivial amount of work, particularly when unpacking a small number of values.
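One way to check how much this saves on a given machine is a small timeit comparison. The sketch below is illustrative only: the function names and the sample values are made up, and the timings will vary, so no expected numbers are shown.

```python
import struct
import timeit

data = struct.pack('<HB', 1234, 56)               # 3 bytes of sample input
unpack_ushort_byte = struct.Struct('<HB').unpack  # compiled once, bound method kept

def module_level():
    # Goes through the module-level function on every call.
    return struct.unpack('<HB', data)

def precompiled():
    # Calls the stored bound method directly.
    return unpack_ushort_byte(data)

# Both routes must produce the same tuple.
assert module_level() == precompiled() == (1234, 56)

if __name__ == '__main__':
    for fn in (module_level, precompiled):
        seconds = timeit.timeit(fn, number=200000)
        print('%-14s %.3f s for 200000 calls' % (fn.__name__, seconds))
```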

You can also save some work by unpacking multiple values together, rather than one at a time:

distanceInformation, intensity = unpack_ushort_byte(firingData[startingByte:startingByte + 3])
distanceInformation *= 0.002
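Putting both suggestions together, the question's function might look roughly like the sketch below. It is written for Python 3 (range rather than xrange). It uses Struct.unpack_from, which reads at a given offset without creating a slice; that is an extra step beyond what this answer proposes. The firing_blocks default of 12 is an assumption, since the question does not show the value it uses.

```python
import struct

LASERS = 32

# Compiled once at import time; unpack_from(buffer, offset) avoids slicing.
unpack_rotational = struct.Struct('<H').unpack_from
unpack_laser = struct.Struct('<HB').unpack_from

def read_data_packet(raw, firing_blocks=12, header_bytes=42):
    """Return [[rotational, [[distance, intensity], ...]], ...] for one packet."""
    laser_block = []
    for block in range(firing_blocks):
        block_start = header_bytes + 100 * block
        rotational = unpack_rotational(raw, block_start + 2)[0]
        offset = block_start + 4          # first laser record in this block
        laser = []
        for _ in range(LASERS):
            distance, intensity = unpack_laser(raw, offset)
            laser.append([distance * 0.002, intensity])
            offset += 3                   # 2-byte distance + 1-byte intensity
        laser_block.append([rotational, laser])
    return laser_block
```

The return value keeps the nested-list shape of the original readDataPacket(), so the calling loop would not need to change.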

As Dan notes in another answer, you could improve this further with iter_unpack (available in Python 3.4 and later), which would reduce both the amount of bytecode executed and the number of small slice operations.

Answer 3:

You can unpack the raw distanceInformation and intensity values together in one call, especially since you're just putting them into a list together: unpack() returns all the values it unpacks as one tuple. In your case, you then need to multiply distanceInformation by 0.002, but you might save time by leaving this until later, because you can use iter_unpack() (Python 3.4 and later) to parse the whole run of raw pairs in one call. That function returns an iterator, which can be sliced with itertools.islice() and then turned into a list. Something like this:

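# iter_unpack() needs a buffer whose length is a multiple of the 3-byte struct size,
# so pass only the 96 bytes of laser data in this 100-byte firing block.
laser_iter = struct.iter_unpack('<HB', firingData[firingDataStartingByte + 4:firingDataStartingByte + 100])
laser = [[d * 0.002, i] for d, i in itertools.islice(laser_iter, lasers)]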

Unfortunately this is a little harder to read, so you might want to spread it out over more lines of code with more descriptive variable names, or add a comment for when you later forget why you wrote it this way.
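A more spread-out version of the same idea might look like the sketch below (Python 3.4 or later). The names read_lasers and laser_struct are hypothetical. Because iter_unpack() raises struct.error unless the buffer length is a multiple of the struct size, the slice here covers exactly the 32 three-byte laser records of one firing block.

```python
import struct

LASERS = 32
laser_struct = struct.Struct('<HB')   # 2-byte distance + 1-byte intensity

def read_lasers(firing_data, block_start):
    """Return [[distance, intensity], ...] for the firing block at block_start."""
    first = block_start + 4                      # skip the flag and rotation fields
    last = first + laser_struct.size * LASERS    # end of the laser records
    raw_pairs = laser_struct.iter_unpack(firing_data[first:last])
    # Scale the raw distance; intensity is passed through unchanged.
    return [[distance * 0.002, intensity] for distance, intensity in raw_pairs]
```

firing_data can be a bytes object or a memoryview, so the memoryView[42:] slice from the question can be passed in directly.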

Source: https://stackoverflow.com/questions/36797088/speed-up-pythons-struct-unpack

Tags

python

performance

numpy

unpack

lidar