Are there any good (or at least interesting but flawed) analogs to regular expressions in two dimensions?
In one dimension I can write something like /aaac?(bc)*b?
You're essentially talking about a spatial query language. There are plenty out there if you look up spatial query, geographic query and graphic query languages. The spatial part generally comes down to points, lines and objects in a region that have other given attributes. Regions can be specified as polygons, distance from a point (e.g. circles), distance from a linear feature such as a road, all points on one side of a linear feature, etc. You can then get into more complex queries like the set of all nearest neighbours, shortest path, travelling salesman, and tessellations like Delaunay TINs and Voronoi diagrams.
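To give a feel for the simplest kind of query mentioned above (a region defined by distance from a point, and a nearest-neighbour lookup), here is a minimal sketch in plain Python. The point names, coordinates and the helper names `within` and `nearest` are made up for illustration; a real spatial database would use an index rather than a linear scan.

```python
from math import hypot

# Hypothetical named points; coordinates are illustrative only.
POINTS = {"p1": (0.0, 0.0), "p2": (3.0, 4.0), "p3": (6.0, 8.0)}

def within(points, center, radius):
    """Names of all points inside the circle (center, radius), border included."""
    cx, cy = center
    return sorted(
        name for name, (x, y) in points.items()
        if hypot(x - cx, y - cy) <= radius
    )

def nearest(points, target):
    """Name of the point closest to target (brute-force scan)."""
    tx, ty = target
    return min(points, key=lambda n: hypot(points[n][0] - tx, points[n][1] - ty))

print(within(POINTS, (0.0, 0.0), 5.0))
print(nearest(POINTS, (5.0, 7.0)))
```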
Nice problem.
First, ask yourself whether you can constrain the pattern to a "+" pattern, or whether you would also need to test/match rectangles. For instance, a rectangle of [bc] with a border of a would match the center rectangle below, but the same region would also match a "+" shape of [c([bc]*a})v([bc]*a)>([bc]*a)<([bc]*a)] (in your syntax):
xxxaaaaaxxx
yzyabcba12d
defabcbass3
oreabcba3s3
s33aaaaas33
k388x.egiee
If you can constrain it to a "+" then your task is much easier. As ShuggyCoUk said, parsing a RE is usually equivalent to a DFSM, but that is for a single, serial input, which simplifies things greatly.
In your "RE+" engine, you'll have to debug not only the engine, but also each place that the expressions are entered. I'd like the compiler to catch any errors it could. Given that, you could also use something that is explicitly four REs, like:
REPlus engine = new REPlus('b')
    .North("a{3}").East("a{3}")
    .South("a{3}").West("a{3}");
However, depending on your implementation this may be cumbersome.
And with regard to the traversal engine: would the North/West patterns match right-to-left or left-to-right? It could matter if the patterns match differently with or without greedy sub-matches.
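For what it's worth, here is a rough Python sketch of the "four REs around a center" idea, run against the grid above. The function names `arm` and `match_plus` are invented for this example and are not the REPlus API. It assumes every arm is read outward from the center cell (so West is read right-to-left and North bottom-to-top), which is only one possible answer to the direction question, and it lets each arm pattern match a prefix of its arm.

```python
import re

GRID = [
    "xxxaaaaaxxx",
    "yzyabcba12d",
    "defabcbass3",
    "oreabcba3s3",
    "s33aaaaas33",
    "k388x.egiee",
]

def arm(grid, row, col, d_row, d_col):
    """Characters met walking outward from (row, col), center excluded."""
    chars = []
    r, c = row + d_row, col + d_col
    while 0 <= r < len(grid) and 0 <= c < len(grid[r]):
        chars.append(grid[r][c])
        r, c = r + d_row, c + d_col
    return "".join(chars)

def match_plus(grid, center, north, east, south, west):
    """Cells whose character matches `center` and whose four arms each
    match their pattern, anchored at the cell next to the center."""
    center_re = re.compile(center)
    arms = [
        (re.compile(north), -1, 0),
        (re.compile(east), 0, 1),
        (re.compile(south), 1, 0),
        (re.compile(west), 0, -1),
    ]
    hits = []
    for r, line in enumerate(grid):
        for c, ch in enumerate(line):
            if not center_re.fullmatch(ch):
                continue
            if all(p.match(arm(grid, r, c, dr, dc)) for p, dr, dc in arms):
                hits.append((r, c))
    return hits

# Every [bc] cell whose arms run through [bc]* and then hit an 'a'.
print(match_plus(GRID, "[bc]", "[bc]*a", "[bc]*a", "[bc]*a", "[bc]*a"))
```

On this grid the call reports the nine cells of the inner [bc] block, which illustrates the point made earlier: a "+" test alone cannot tell a true rectangle from a cross-shaped region.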
Incidentally, I think the '^' in your example is supposed to be one character to the left, outside the parenthesis.
Regular expressions are designed to model patterns in one dimension. As I understand it, you want to match patterns in a two dimensional array of characters.
Can you use regular expressions? That depends on whether the property that you are searching for is decomposable into components which can be matched in one dimension or not. If so, you can run your regular expressions on columns and rows, and look for intersections in the solution sets from those. Depending on the particular problem you have, this may either solve the problem, or it may find areas in your 2d array which are likely to be solutions.
Whether your problem is decomposable or not, I think writing some custom code will be inevitable. But at least it sounds like an interesting problem, so the work should be pleasant.
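The row-and-column decomposition described above could look roughly like this in Python. It is only a sketch under some assumptions: the grid is rectangular, the helper names `matched_cells` and `candidates` are invented here, and the sample grid and patterns are borrowed from another answer in this thread. Cells covered by both a row match and a column match are candidates, not guaranteed solutions.

```python
import re

GRID = [
    "xxxaaaaaxxx",
    "yzyabcba12d",
    "defabcbass3",
    "oreabcba3s3",
    "s33aaaaas33",
    "k388x.egiee",
]

def matched_cells(lines, pattern, transposed=False):
    """Set of (row, col) cells covered by any non-overlapping match of
    `pattern` in `lines`. With transposed=True, `lines` are columns."""
    cells = set()
    for i, line in enumerate(lines):
        for m in re.finditer(pattern, line):
            for j in range(m.start(), m.end()):
                cells.add((j, i) if transposed else (i, j))
    return cells

def candidates(grid, row_pattern, col_pattern):
    """Cells that lie inside both a row match and a column match."""
    columns = ["".join(col) for col in zip(*grid)]  # assumes equal-length rows
    in_rows = matched_cells(grid, row_pattern)
    in_cols = matched_cells(columns, col_pattern, transposed=True)
    return in_rows & in_cols

# Runs of [bc] enclosed by 'a' both horizontally and vertically.
print(sorted(candidates(GRID, r"a[bc]+a", r"a[bc]+a")))
```

Whether the intersection is the final answer or just a shortlist to verify with custom code depends on the property being searched for.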
I'm not a regex expert, but I found the problem interesting, so I looked around and found this interesting blog entry. The syntax used there for defining the 2D regex looks especially appealing. The paper linked there might tell you more than I can.
Update from comment: Here is the link to the primary author's page where you can download the linked paper "Two-dimensional languages": http://www.mat.uniroma2.it/~giammarr/Research/pubbl.html