So I'm working on a project where a buddy and I scanned a room using the Kinect v2 and made a 3D model out of it. The goal is to make it possible to add 3D models of different kinds of furniture in real time. To that end I'm trying out different plane-fitting algorithms in order to find which one would work the fastest. Does anybody have any suggestions? So far I've only researched the usage of the basic RANSAC algorithm included in PCL.
Two common approaches for plane fitting are RANSAC and the Hough transform; published performance comparisons of the two are worth a look.
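For concreteness, the basic RANSAC plane fit can be sketched in a few dozen lines of plain C++. This is an illustrative sketch, not PCL's implementation; the `Vec3` and `Plane` types are ad hoc:

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

struct Vec3 { double x, y, z; };
struct Plane { Vec3 n; double d; };  // plane: n·p + d = 0, n unit length

static Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}
static Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x-b.x, a.y-b.y, a.z-b.z}; }
static double norm(const Vec3& v) { return std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z); }

// Basic RANSAC: repeatedly fit a plane through 3 random points and keep
// the plane with the most inliers (points within 'tol' of the plane).
// Production code would use <random> instead of std::rand.
Plane ransacPlane(const std::vector<Vec3>& pts, int iterations, double tol) {
    Plane best{{0, 0, 1}, 0};
    std::size_t bestInliers = 0;
    for (int it = 0; it < iterations; ++it) {
        const Vec3& a = pts[std::rand() % pts.size()];
        const Vec3& b = pts[std::rand() % pts.size()];
        const Vec3& c = pts[std::rand() % pts.size()];
        Vec3 n = cross(sub(b, a), sub(c, a));
        double len = norm(n);
        if (len < 1e-12) continue;            // degenerate (collinear) sample
        n = {n.x / len, n.y / len, n.z / len};
        double d = -(n.x*a.x + n.y*a.y + n.z*a.z);
        std::size_t inliers = 0;
        for (const Vec3& p : pts)
            if (std::fabs(n.x*p.x + n.y*p.y + n.z*p.z + d) < tol) ++inliers;
        if (inliers > bestInliers) { bestInliers = inliers; best = {n, d}; }
    }
    return best;
}
```

PCL's sample-consensus module implements this same idea with smarter sampling and model refinement, so treat this only as a mental model of what it is doing.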
As with many problems in computational geometry and image processing, rather than consider what is "fastest," consider what is optimal for your application in terms of performance, development effort, and cost. Searching for the fastest possible algorithm could lead you down a path of horrifying cost and complexity, whereas you may be able to implement a data processing chain of relatively simple algorithms that run just fast enough to present a smooth and pleasant experience to the user.
Long story short, I recommend starting with a Hough plane fit. Hough transform algorithms are relatively easy to write (once you grasp the basics), and tweaking parameters is intuitive.
https://en.wikipedia.org/wiki/Hough_transform
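As a starting point, a brute-force Hough plane detector can be written directly from the definition: discretize the normal direction into angles (theta, phi), let each point vote for the distance rho at which a plane with that normal would pass through it, and take the accumulator cell with the most votes. The sketch below (ad hoc types and parameters, plain C++) shows the mechanics and is not optimized:

```cpp
#include <cmath>
#include <map>
#include <tuple>
#include <vector>

struct P3 { double x, y, z; };

// Plane parameterization: n·p = rho, with unit normal
// n = (cos(phi)·sin(theta), sin(phi)·sin(theta), cos(theta)).
// Returns (theta, phi, rho) of the accumulator cell with the most votes.
std::tuple<double, double, double>
houghPlane(const std::vector<P3>& pts, int angleSteps, double rhoStep) {
    const double PI = std::acos(-1.0);
    std::map<std::tuple<int, int, int>, int> acc;   // (i, j, k) -> votes
    for (const P3& p : pts) {
        for (int i = 0; i < angleSteps; ++i) {
            double theta = PI * i / angleSteps;
            for (int j = 0; j < angleSteps; ++j) {
                double phi = 2.0 * PI * j / angleSteps;
                double rho = std::cos(phi) * std::sin(theta) * p.x
                           + std::sin(phi) * std::sin(theta) * p.y
                           + std::cos(theta) * p.z;
                ++acc[std::make_tuple(i, j, (int)std::lround(rho / rhoStep))];
            }
        }
    }
    std::tuple<int, int, int> bestKey;
    int bestVotes = -1;
    for (const auto& cell : acc)
        if (cell.second > bestVotes) { bestVotes = cell.second; bestKey = cell.first; }
    return {PI * std::get<0>(bestKey) / angleSteps,
            2.0 * PI * std::get<1>(bestKey) / angleSteps,
            std::get<2>(bestKey) * rhoStep};
}
```

Tweaking `angleSteps` and `rhoStep` is exactly the intuitive parameter-tuning mentioned above: coarser bins are faster and more tolerant of noise, finer bins give more precise fits.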
One of the reasons to write your own algorithm is that you'll be in a better position to understand what changes need to be made once (not if) you discover the point cloud data is noisier and less well behaved than one would like.
Achieving good speed is going to rely on a number of factors, including these:
- Point cloud pre-processing. Look for ways to break up the point cloud into chunks that can be processed more quickly.
- Parameterization. Once data is preprocessed, you can define narrower search bounds for your plane fit algorithm. For example, only try plane fits within a few degrees of vertical. You'll also need to choose parameters to find a balance between speed and quality of fit.
- Quality of the 3D data. That's a big topic by itself, and the sooner you can take a close look at the problems in the data the better.
- What "real time" means. Even for 3D graphics applications that involve user interaction, achieving real time based strictly on specs (updates at N frames/second) may be less important than presenting a smooth and simple interface.
- Multithreading and parallelism.
- 3D display. Another big topic.
Pre-processing. You don't need to fit planes of arbitrary size to arbitrary point clouds: instead, you need to fit walls and maybe floors and ceilings. For Hough algorithms this means you can limit the range over which parameters are tested, and hence speed up processing.
Rather than try to find all the plane fits for the complete, original point cloud, find ways to break up the point cloud into clusters of subclouds to which plane fit tests can be run more efficiently.
PCL can calculate surface normals for you. If you can identify clusters of surface normals pointed in roughly the same direction, and then try plane fits for individual clusters, you should be able to speed things up considerably.
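One simple way to form such clusters is to quantize each unit normal onto a coarse spherical grid and group the points whose normals land in the same cell. This is an illustrative sketch (the `Normal` type and the binning scheme are ad hoc, and it assumes normals are already computed, e.g. by PCL):

```cpp
#include <algorithm>
#include <cmath>
#include <map>
#include <utility>
#include <vector>

struct Normal { double x, y, z; };  // unit normal per point

// Quantize a unit normal into one of bins×bins spherical cells.
std::pair<int, int> directionBin(const Normal& n, int bins) {
    const double PI = std::acos(-1.0);
    double theta = std::acos(std::max(-1.0, std::min(1.0, n.z)));  // [0, pi]
    double phi = std::atan2(n.y, n.x) + PI;                        // [0, 2*pi]
    int i = std::min(bins - 1, (int)(theta / PI * bins));
    int j = std::min(bins - 1, (int)(phi / (2.0 * PI) * bins));
    return {i, j};
}

// Cluster point indices by normal direction; each cluster can then be
// handed to a plane fit with tightly bounded normal directions.
std::map<std::pair<int, int>, std::vector<std::size_t>>
clusterByNormal(const std::vector<Normal>& normals, int bins) {
    std::map<std::pair<int, int>, std::vector<std::size_t>> clusters;
    for (std::size_t k = 0; k < normals.size(); ++k)
        clusters[directionBin(normals[k], bins)].push_back(k);
    return clusters;
}
```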
Also, for your first pass you'll probably want to downsample your data and try fits on relatively fewer points. This is analogous to creating an "image pyramid" for 2D processing.
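A voxel-grid filter is the usual way to downsample (PCL ships one); a minimal version just averages the points that fall into each cubic cell. Sketch with ad hoc types:

```cpp
#include <cmath>
#include <map>
#include <tuple>
#include <vector>

struct Pt { double x, y, z; };

// Voxel-grid downsampling: keep one representative point (the centroid)
// per cubic cell of side 'voxel'.
std::vector<Pt> voxelDownsample(const std::vector<Pt>& pts, double voxel) {
    struct Acc { double x = 0, y = 0, z = 0; int n = 0; };
    std::map<std::tuple<int, int, int>, Acc> cells;
    for (const Pt& p : pts) {
        auto key = std::make_tuple((int)std::floor(p.x / voxel),
                                   (int)std::floor(p.y / voxel),
                                   (int)std::floor(p.z / voxel));
        Acc& a = cells[key];
        a.x += p.x; a.y += p.y; a.z += p.z; ++a.n;
    }
    std::vector<Pt> out;
    for (const auto& kv : cells) {
        const Acc& a = kv.second;
        out.push_back({a.x / a.n, a.y / a.n, a.z / a.n});
    }
    return out;
}
```

Running coarse fits on the downsampled cloud and then refining against the full cloud is the 3D analogue of the image-pyramid approach.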
Octrees are nice, simple means of dividing up space for queries, collision tests, and so on. An octree divides a space into eight nodes or "octants." This can be imagined as cutting a cube into eight smaller cubes. Then each octant is further divided into eight octants, and so on. If an octant (a node) doesn't contain points, you don't need to divide it further.
https://en.wikipedia.org/wiki/Octree
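A bare-bones octree that splits a cell once it holds too many points might look like the following (illustrative only; real implementations such as PCL's octree module add neighbor queries, serialization, and so on):

```cpp
#include <memory>
#include <vector>

struct Point { double x, y, z; };

struct Octree {
    double cx, cy, cz, half;            // cell center and half-width
    std::vector<Point> points;          // points stored at leaves
    std::unique_ptr<Octree> child[8];   // all null for a leaf

    Octree(double cx_, double cy_, double cz_, double half_)
        : cx(cx_), cy(cy_), cz(cz_), half(half_) {}

    bool isLeaf() const { return !child[0]; }

    void insert(const Point& p, std::size_t maxPoints, int maxDepth) {
        if (isLeaf()) {
            points.push_back(p);
            if (points.size() <= maxPoints || maxDepth == 0) return;
            for (int i = 0; i < 8; ++i) {   // split: create the 8 octants
                double h = half / 2;
                child[i] = std::make_unique<Octree>(
                    cx + ((i & 1) ? h : -h),
                    cy + ((i & 2) ? h : -h),
                    cz + ((i & 4) ? h : -h), h);
            }
            std::vector<Point> tmp;
            tmp.swap(points);               // redistribute into children
            for (const Point& q : tmp) insert(q, maxPoints, maxDepth);
            return;
        }
        int i = (p.x >= cx ? 1 : 0) | (p.y >= cy ? 2 : 0) | (p.z >= cz ? 4 : 0);
        child[i]->insert(p, maxPoints, maxDepth - 1);
    }

    std::size_t count() const {             // total points in the subtree
        if (isLeaf()) return points.size();
        std::size_t n = 0;
        for (int i = 0; i < 8; ++i) n += child[i]->count();
        return n;
    }
};
```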
Parameterization. The descriptions above should make it clear that if you can preprocess the data by simplifying and/or breaking up the original raw point cloud, then you'll be able to test much more narrowly defined searches that will run more quickly.
For that matter, you probably don't need high accuracy in plane fits. You can generate reasonably good fits, then tweak those fits to generate ceilings, walls, and floors at right angles to each other.
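For example, once you have an approximate wall or floor normal, you can snap it to the nearest coordinate axis (assuming the cloud is roughly axis-aligned; the type here is ad hoc):

```cpp
#include <cmath>

struct N3 { double x, y, z; };

// Snap an approximate unit normal to the nearest coordinate axis, so
// walls, floor, and ceiling come out exactly perpendicular to each other.
N3 snapToAxis(const N3& n) {
    double ax = std::fabs(n.x), ay = std::fabs(n.y), az = std::fabs(n.z);
    if (ax >= ay && ax >= az) return {n.x > 0 ? 1.0 : -1.0, 0.0, 0.0};
    if (ay >= ax && ay >= az) return {0.0, n.y > 0 ? 1.0 : -1.0, 0.0};
    return {0.0, 0.0, n.z > 0 ? 1.0 : -1.0};
}
```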
3D data quality. The Kinect v2 is a time-of-flight device with some inherent measurement accuracy problems. For example, if you take a single image of a flat wall and then check the depth values, you'll notice some non-planar goofiness in the corners of the image. If you take a look at the range (or standard deviation) of depth values at each (x,y) pixel over multiple images, then you'll also notice differences in noisiness between center pixels and edge pixels.
Once you perform a plane fit, generate a measure of the fit quality. This requires going back through the data to calculate point-to-plane distances for the points used for calculation. (To speed this up, only use every Nth point or randomly sample the points.) As you tinker with parameters you'll see the effects in terms of speed and fit quality.
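A minimal fit-quality measure along those lines is the root-mean-square point-to-plane distance over every Nth point (ad hoc types; the plane is stored as a unit normal `n` and offset `d` with n·p + d = 0):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct V3 { double x, y, z; };

// RMS point-to-plane distance, sampling every 'stride'-th point for speed.
double rmsPlaneError(const std::vector<V3>& pts,
                     const V3& n, double d, std::size_t stride) {
    double sum = 0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < pts.size(); i += stride) {
        double dist = n.x*pts[i].x + n.y*pts[i].y + n.z*pts[i].z + d;
        sum += dist * dist;   // signed distance; squared for RMS
        ++count;
    }
    return count ? std::sqrt(sum / count) : 0.0;
}
```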
Real time vs. Perceptibly smooth. If you only need the user to move furniture around in real time, it should be okay to spend longer generating the initial plane fits.
Multithreading/Parallelism. To handle data input, plane fitting, and the user interface, you'll almost certainly have to think hard about multithreading. To test algorithms you can do all the work on the UI thread just to get started, but this is limiting.
An application like this begs for CUDA or OpenCL. For the 3D display you'll be using the graphics card anyway. Although you don't need to jump into GPU programming right away, it's helpful to keep in mind how algorithms could be parallelized.
3D display. Were you planning to use Direct3D or OpenGL for 3D display and interaction? Implementing software to allow the user to "add 3d models of different kinds of furniture in real time" suggests you'll have to rely on the graphics card.
If you can display the point cloud in a 3D view, maybe you don't even need plane fits. You might even get away with collision detection: if the 3D model of a chair bumps into a cluster of points (i.e. a wall), then you might just detect that collision rather than try to fit planes to define bounds. Octants and other space-dividing techniques will help speed up collision tests.
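The cheapest version of that collision test compares axis-aligned bounding boxes: one around the furniture model, one around each point cluster. A sketch with ad hoc types:

```cpp
#include <algorithm>
#include <vector>

struct Pnt { double x, y, z; };
struct AABB { double minx, miny, minz, maxx, maxy, maxz; };

// Axis-aligned bounding box of a (non-empty) point cluster.
AABB boundsOf(const std::vector<Pnt>& pts) {
    AABB b{pts[0].x, pts[0].y, pts[0].z, pts[0].x, pts[0].y, pts[0].z};
    for (const Pnt& p : pts) {
        b.minx = std::min(b.minx, p.x); b.maxx = std::max(b.maxx, p.x);
        b.miny = std::min(b.miny, p.y); b.maxy = std::max(b.maxy, p.y);
        b.minz = std::min(b.minz, p.z); b.maxz = std::max(b.maxz, p.z);
    }
    return b;
}

// Two boxes collide iff their intervals overlap on every axis.
bool intersects(const AABB& a, const AABB& b) {
    return a.minx <= b.maxx && b.minx <= a.maxx &&
           a.miny <= b.maxy && b.miny <= a.maxy &&
           a.minz <= b.maxz && b.minz <= a.maxz;
}
```

A space-dividing structure such as an octree then lets you skip clusters whose cells are nowhere near the furniture model.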
The company Matterport (http://matterport.com/) has developed something very much like what you're describing. If nothing else you can give their software a try and consider what might be improved/adapted for your needs.
I appreciate Rethunk's detailed answer and will suggest a variant, the Local Hough Transform. But first, let me point out that there are a number of Stack Overflow / Stack Exchange posts on plane detection or detection of intersecting planes. Some of those are:
- Fit a plane to a 3D point cloud in C++
- difference between plane segmentation and plane fitting
- Plane fitting in a 3d point cloud
- Fit a plane to 3D point cloud using RANSAC
- Fast plane fitting to many points
- https://math.stackexchange.com/questions/1657030/fit-plane-to-3d-data-using-least-squares
- Best fit plane for 3D data
The method I'll suggest is explained in detail in a publication at 3DV 2015:
Local Hough Transform for 3D Primitive Detection, Bertram Drost, Slobodan Ilic, IEEE 3D Vision 2015
The idea is based on selecting pairs of oriented points (a point plus its normal). The orientations of the two points are compared to decide whether they could jointly lie on a plane. The contributions of all such point pairs are accumulated in a local voting space in which the plane is parameterized relative to a reference point; since an oriented point fully determines a plane, this local voting space is 0-dimensional. The technique is extensible to other primitives.
RANSAC is generally inferior to the Hough transform and yet the proposed method can be seen as a hybrid between a global voting scheme and RANSAC. While RANSAC selects multiple random points, enough to fit the target primitive, the proposed method selects only a single point, the reference point.
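The core pair test can be sketched as follows: an oriented point (position plus unit normal) determines a plane, and a second oriented point is compatible with that plane only if it lies on the plane and its normal is (anti)parallel. This is an illustrative reduction of the paper's voting step, not the authors' code:

```cpp
#include <algorithm>
#include <cmath>

struct OP { double px, py, pz, nx, ny, nz; };  // point + unit normal

// True if oriented point b is compatible with the plane determined by a:
// b's position within distTol of the plane, normals within angTol radians
// (sign-insensitive, via the absolute dot product).
bool onSamePlane(const OP& a, const OP& b, double distTol, double angTol) {
    double d = a.nx * (b.px - a.px)     // signed distance of b's position
             + a.ny * (b.py - a.py)     // from the plane through a
             + a.nz * (b.pz - a.pz);
    double dot = std::fabs(a.nx * b.nx + a.ny * b.ny + a.nz * b.nz);
    return std::fabs(d) < distTol && std::acos(std::min(1.0, dot)) < angTol;
}
```

In the voting scheme, pairs that pass a test like this contribute a vote for the reference point's plane; planes supported by many pairs survive.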
I also have another stackexchange post explaining how one could potentially develop a similar method for orthogonal planes.
Source: https://stackoverflow.com/questions/35726134/3d-plane-fitting-algorithms