Question
I have a JPG, BMP, or SVG image (see example below) and I need an algorithm to extract the vertex (X, Y) coordinates and the edges (i.e., a list that indicates which vertices are connected). The edges can be given as a boolean true/false for each vertex pair or simply as a list of the vertex pairs that are connected. Any ideas welcome.
For example, I would like a function (or series of functions) that takes the image as input and outputs two lists:
Vertices:
Vertex 1: X = 1, Y = 2
Vertex 2: X = 3, Y = 5
Vertex 3: X = 3, Y = 7
...
Edges:
Edge 1: (Vertex 1, Vertex 3)
Edge 2: (Vertex 1, Vertex 4)
Edge 3: (Vertex 4, Vertex 10)
...
The vertex coordinates can be in any coordinate system (e.g., pixels or SVG coordinates) or in some alternate user-defined coordinate system.
For example, I extracted the following coordinates (pixels) from the example image (left) and plotted them in Matlab (right).
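The two output lists described above could be represented as follows. This is only a sketch: the names Vertex, Edge and to_adjacency are made up for illustration, the rectangle is hypothetical sample data, and vertex indices are 0-based here rather than the 1-based numbering used in the listing.

```python
from typing import List, Tuple

Vertex = Tuple[float, float]  # (x, y) in whatever coordinate system is used
Edge = Tuple[int, int]        # pair of 0-based indices into the vertex list


def to_adjacency(n_vertices: int, edges: List[Edge]) -> List[List[bool]]:
    """Convert an edge list into a symmetric boolean true/false matrix."""
    adj = [[False] * n_vertices for _ in range(n_vertices)]
    for a, b in edges:
        adj[a][b] = True
        adj[b][a] = True
    return adj


# Hypothetical sample data: the four corners of a rectangle.
vertices: List[Vertex] = [(10, 10), (290, 10), (290, 190), (10, 190)]
edges: List[Edge] = [(0, 1), (1, 2), (2, 3), (3, 0)]
adjacency = to_adjacency(len(vertices), edges)
```

Either form carries the same information; the pair list is more compact for sparse diagrams, while the matrix makes "is vertex i connected to vertex j" a direct lookup.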
So, for example, I can tell that the corner vertices are roughly: (10, 10), (290, 10), (290, 190), and (10, 190).
But I want an algorithm to automatically detect those coordinates and to also tell me that there is an edge between the top left vertex (10, 190) and the top right vertex (290, 190), etc. I also need to identify each of the vertices and edges for the internal blocks, etc.
I also need it to work for more complicated diagrams. For example, I am able to extract the necessary pixels and produce the following Matlab plot:
As before, it is quite clear where the vertices "should be". However, due to the line thickness, there are many clusters of pixels that first need to be "smoothed out". I'm unsure how to go about doing this and how to automate the process of identifying vertices/edges.
Note 1: The method I'm using to get the pixel coordinates is basically:
- Convert to Black/White
- Scan each pixel to see if colour <= threshold, save (X,Y) if it's "black"
- Plot in Matlab
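The scanning step in Note 1 can be sketched like this. It assumes the image has already been loaded as a 2D list of grey values (0 = black, 255 = white); the function name dark_pixels and the default threshold of 128 are illustrative choices, not something taken from the question.

```python
from typing import List, Tuple


def dark_pixels(gray: List[List[int]], threshold: int = 128) -> List[Tuple[int, int]]:
    """Return (x, y) for every pixel whose grey value is <= threshold."""
    points = []
    for y, row in enumerate(gray):          # y is the row index (grows downward)
        for x, value in enumerate(row):     # x is the column index
            if value <= threshold:
                points.append((x, y))
    return points


# Tiny hypothetical 4x3 image with three dark pixels.
image = [
    [255, 255, 255, 255],
    [255,   0,   0, 255],
    [255, 255,   0, 255],
]
print(dark_pixels(image))
```

Note that image rows usually count downward from the top, so a plot with a conventional Y axis will appear vertically flipped unless y is inverted.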
A rough algorithm I have in mind is:
- Apply "smoothing" to get a single line instead of pixel clusters
- "Loop" through pixels in different directions; when a significant slope change occurs, identify it as a "vertex"
- After all vertices are identified, evaluate the line between each pair of vertices, if that line is mostly black, identify it as an edge
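The last step of that rough algorithm (testing whether the line between two candidate vertices is mostly black) might look like the sketch below. It assumes a boolean grid indexed as black[y][x] and integer pixel coordinates; the names black_fraction and is_edge and the 0.9 cut-off are assumptions made for illustration.

```python
from typing import List, Tuple


def black_fraction(black: List[List[bool]], p: Tuple[int, int], q: Tuple[int, int]) -> float:
    """Fraction of evenly spaced sample points on segment p-q that hit black pixels."""
    (x0, y0), (x1, y1) = p, q
    # One sample per pixel along the longer axis (at least one step).
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)
    hits = 0
    for i in range(steps + 1):
        t = i / steps
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        if black[y][x]:
            hits += 1
    return hits / (steps + 1)


def is_edge(black: List[List[bool]], p: Tuple[int, int], q: Tuple[int, int],
            min_fraction: float = 0.9) -> bool:
    """Treat p-q as an edge when most of the sampled points are black."""
    return black_fraction(black, p, q) >= min_fraction


# Hypothetical 5x5 grid with one horizontal black line on row 2.
grid = [[y == 2 for x in range(5)] for y in range(5)]
print(is_edge(grid, (0, 2), (4, 2)), is_edge(grid, (0, 0), (4, 4)))
```

One known weakness of this test: two vertices that lie on the same straight line with a third vertex between them will also pass, so collinear pairs would need extra filtering.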
There are many issues with the above algorithm, so I was hoping others might have some better ideas or similar C# code, etc.
I would like the process to be as automated as possible.
Note 2: I can also convert the image to SVG format (already implemented). It is my understanding that the SVG format may lend itself very well to my application because it can more easily automate the process; however, I find the SVG structure quite confusing.
I have read through some literature online about SVG formats and I understand how it works, but I was wondering if there was some sort of already existing library or something that would allow me to very easily identify the vertices of the "path" in the SVG file, etc.
For example, one of the "paths" that I get from one SVG file is of the form:
<path d="M70 1810 c0 -91 3 -110 15 -110 12 0 15 17 15 95 l0 95 1405
0 1405 0 0 -410 0 -411 -87 3 -88 3 -1 35 c0 19 -1 124 -2 233 l-2 197
-70 0 -70 0 0 -320 0 -320 153 0 c83 0 162 3 175 6 l22 6 0 504 0 504
-1435 0 -1435 0 0 -110z m2647 -490 c1 -113 2 -217 2 -232 l1 -27 88 -3
87 -3 0 -70 0 -70 145 0 -145 0 -3 295 c-1 162 0 301 3 308 3 9 21 12
57 10 l53 -3 2 -205z"/>
I know this follows a Cubic Bezier Spline, but I was wondering if any already existing algorithms are out there to process the "path" code and extract the relevant coordinates, etc.
Thanks for your help!!
Answer 1:
SVG path parsing is not that hard (unless you have a complex SVG, which does not seem to be the case here).
1. Find the path.
A path starts with the <path tag and usually ends with />, so find the start and end of the path and then work only with the string between them.
2. Find the d=" attribute.
That is the path data string (this way you skip formatting etc.). Its end is marked by the closing ", so again work only with the string inside.
3. Process the path string.
3.1 Read a single character (skipping spaces). Depending on that character, read the right count of numbers and add the entity to your vector representation. For example:
- M means absolute move, so x,y follows: cursor = (x,y);
- m means relative move, so x,y follows: cursor += (x,y);
- L means absolute line, so x,y follows: add_line(cursor,(x,y)); cursor = (x,y);
- l means relative line, so x,y follows: add_line(cursor,cursor+(x,y)); cursor += (x,y);
- C means absolute cubic BEZIER, so x1,y1,x2,y2,x3,y3 follows: add_cubic_BEZIER(cursor,(x1,y1),(x2,y2),(x3,y3)); cursor = (x3,y3);
- etc. The commands m,M,l,L,h,H,v,V,c,C,s,S,q,Q,t,T differ only in the number of points and the type of curve/line.
- z just means that you add a line from the cursor at the end back to the start point.
3.2 If the next token starts with a number, handle it as a repetition of the last command and read its numbers as in 3.1 (per the SVG specification, extra coordinate pairs after M or m are treated as L or l).
3.3 Otherwise go back to 3.1 and read the next command character.
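The parsing loop described above can be sketched as follows. This is a simplified illustration, not a full SVG path parser: it handles only M, L, H, V, C and Z (both cases), raises ValueError for anything else, and reduces each cubic BEZIER to a straight edge between its end points, which is an assumption that only suits diagrams made of (nearly) straight lines. The names parse_path, vid, TOKEN and ARG_COUNT are made up for this example.

```python
import re
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# A token is a command letter or a number (sign, decimals, optional exponent).
TOKEN = re.compile(r"[A-Za-z]|[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")
# How many numbers each supported command consumes per repetition.
ARG_COUNT = {"M": 2, "L": 2, "H": 1, "V": 1, "C": 6, "Z": 0}


def parse_path(d: str) -> Tuple[List[Point], List[Tuple[int, int]]]:
    """Return (vertices, edges); edges hold indices into the vertex list."""
    tokens = TOKEN.findall(d)
    vertices: List[Point] = []
    edges: List[Tuple[int, int]] = []
    index: Dict[Point, int] = {}  # point -> vertex index, merges repeated points

    def vid(p: Point) -> int:
        if p not in index:
            index[p] = len(vertices)
            vertices.append(p)
        return index[p]

    cursor = start = (0.0, 0.0)
    cmd = ""
    i = 0
    while i < len(tokens):
        if tokens[i].isalpha():
            cmd = tokens[i]          # new command character
            i += 1
        elif cmd in ("", "Z", "z"):
            raise ValueError("number without a preceding command")
        # Otherwise the token is a number: repeat the last command.
        if cmd.upper() not in ARG_COUNT:
            raise ValueError(f"unsupported command {cmd!r}")
        n = ARG_COUNT[cmd.upper()]
        if i + n > len(tokens):
            raise ValueError("path data ends in the middle of a command")
        args = [float(t) for t in tokens[i:i + n]]
        i += n
        # Lower-case commands are relative to the cursor.
        ox, oy = cursor if cmd.islower() else (0.0, 0.0)
        kind = cmd.upper()
        if kind == "Z":
            target = start
        elif kind == "H":
            target = (ox + args[0], cursor[1])
        elif kind == "V":
            target = (cursor[0], oy + args[0])
        else:
            # M, L: the point itself; C: only the end point is kept.
            target = (ox + args[-2], oy + args[-1])
        if kind == "M":
            start = target
            cmd = "l" if cmd.islower() else "L"  # implicit lineto afterwards
        elif target != cursor:
            edges.append((vid(cursor), vid(target)))
        cursor = target
    return vertices, edges


vertices, edges = parse_path("M10 10 L290 10 l0 180 H10 z")
print(vertices)
print(edges)
```

For the sample rectangle this yields the four corner vertices and four edges that close the outline. Merging points by exact equality works for integer-valued exports; coordinates that differ by rounding noise would need a tolerance instead.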
That is all. So all you need is simple string parsing capable of reading numbers in mantissa/exponent form like -125.547e-99 and skipping spaces/tabs. You do not need to decode the whole SVG, just the paths.
As you can have many paths per SVG, after parsing the first <path find the next one and parse it, and so on until none is left. Sometimes the <path contains a transform matrix, or its owner tag (usually <g) does, so several transformations may be stacked, but I think your export is simple and has none of that.
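Collecting every path in a file could be sketched as below. Note that this swaps in Python's standard xml.etree.ElementTree parser instead of the manual string search described above, and it deliberately ignores any transform attributes. The function name path_data and the sample SVG text are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List


def path_data(svg_text: str) -> List[str]:
    """Return the d attribute of every <path> element, in document order."""
    root = ET.fromstring(svg_text)
    result = []
    for element in root.iter():
        # Namespaced tags look like '{http://www.w3.org/2000/svg}path'.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "path" and "d" in element.attrib:
            result.append(element.attrib["d"])
    return result


# Hypothetical two-path document with an owner <g> that carries a transform.
svg = """<svg xmlns="http://www.w3.org/2000/svg">
  <g transform="translate(0,200) scale(0.1,-0.1)">
    <path d="M70 1810 l0 95"/>
    <path d="M10 10 L20 20"/>
  </g>
</svg>"""
print(path_data(svg))
```

Each returned string can then be fed to whatever path parser you use. If the owner <g> does carry a transform, as in the sample, the resulting coordinates are still in the untransformed path space.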
Source: https://stackoverflow.com/questions/33110913/get-vertices-edges-from-bmp-or-svg-c