Note: I am placing this question in both the MATLAB and Python tags as I am the most proficient in these languages. However, I welcome solutions in any language.
While I can not suggest a better approach to group any given list of centroid points than the one you already tried, I hope the following idea might help you out:
Since you are very specific about the content of your image (containing a field of squares) I was wondering if you in fact need to group the centroid points from the data given in your problem setup
, or if you can use the data described in Background to the problem
as well. Since you already determined the corners of each detected square as well as their position in that given square it seems to me like it would be very accurate to determine a neighbour of a given square by comparing the corner-coordinates.
So for finding any candidate for a right neighbour of any square, i would suggest you compare the upper right and lower right corner of that square with the upper left and lower left corner of any other square (or any square within a certain distance). Allowing for only small vertical differences and slightly greater horizontal differences, you can "match" two squares, if both their corresponding corner-points are close enough together.
By using an upper limit to the allowed vertical/horizontal difference between corners, you might even be able to just assign the best matching square within these boundaries as neighbour
A problem might be that you don't detect all the squares, so there is a rather large space between square 30 and 32. Since you said you need 'at least' 3 squares per row, it might be viable for you to simply ignore square 32 in that horizontal line. If that is not an option for you, you could try to match as many squares as possible and afterwards assign the "missing" squares to a point in your grid by using the previously calculated data:
In the example about square 32 you would've detected that it has upper and lower neighbours 27 and 37. Also you should've been able to determine that square 27 lies within row 1 and 37 lies within row 3, so you can assign square 32 to the "best matching" row in between, which is obviously 2 in this case.
This general approach is basically the approach you have tried already, but should hopefully be a lot more accurate since you are now comparing orientation and distance of two lines instead of simply comparing the location of two points in a grid.
Also as a sidenode on your previous attempts - can you use the black cornerlines to correct the initial rotation of your image a bit? This might make further distortion algorithms (like the ones that you discussed with knedlsepp in the comments) a lot more accurate. (EDIT: I did read the comments of Parag just now - comparing the points by the angle of the lines is of course basically the same as rotating the Image beforehand)
I'm using a cropped version of the posted image as the input. Here I'm only addressing the case where the orientation of the grid can be thought of as near horizontal/vertical. This may not fully address your scope, but I think it may give you some pointers.
Binarize the image so that the distorted squares are filled. Here I use a simple Otsu thresholding. Then take the distance transform of this binary image.
In the distance transformed image we see the gaps between the squares as peaks.
To get horizontally oriented lines, take the local maxima of each of the columns of the distance image and then find connected components.
To get vertically oriented lines, take the local maxima of each of the rows of the distance image and then find connected components.
Images below show the horizontal and vertical lines thus found with corner points as circles.
For reasonably long connected components, you can fit a curve (a line or a polynomial) and then classify the corner points, say based on the distance to the curve, on which side of the curve the point is, etc.
I did this in Matlab. I didn't try the curve fitting and classification parts.
clear all;
close all;
im = imread('0RqUd-1.jpg');
gr = rgb2gray(im);
% any preprocessing to get a binary image that fills the distorted squares
bw = ~im2bw(gr, graythresh(gr));
di = bwdist(bw); % distance transform
di2 = imdilate(di, ones(3)); % propagate max
corners = corner(gr); % simple corners
% find regional max for each column of dist image
regmxh = zeros(size(di2));
for c = 1:size(di2, 2)
regmxh(:, c) = imregionalmax(di2(:, c));
end
% label connected components
ccomph = bwlabel(regmxh, 8);
% find regional max for each row of dist image
regmxv = zeros(size(di2));
for r = 1:size(di2, 1)
regmxv(r, :) = imregionalmax(di2(r, :));
end
% label connected components
ccompv = bwlabel(regmxv, 8);
figure, imshow(gr, [])
hold on
plot(corners(:, 1), corners(:, 2), 'ro')
figure, imshow(di, [])
figure, imshow(label2rgb(ccomph), [])
hold on
plot(corners(:, 1), corners(:, 2), 'ro')
figure, imshow(label2rgb(ccompv), [])
hold on
plot(corners(:, 1), corners(:, 2), 'ro')
To get these lines for arbitrary oriented grids, you can think of the distance image as a graph and find optimal paths. See here for a nice graph based approach.
Note 1: It has a number of settings -> which for other images may need to altered to get the result you want see % Settings - play around with these values
Note 2: It doesn't find all of the lines you want -> but its a starting point....
To call this function, invoke this in the command prompt:
>> [h, v] = testLines;
We get:
>> celldisp(h)
h{1} =
1 2 4 6 9 10
h{2} =
3 5 7 8 11 14 15 17
h{3} =
1 2 4 6 9 10
h{4} =
3 5 7 8 11 14 15 17
h{5} =
1 2 4 6 9 10
h{6} =
3 5 7 8 11 14 15 17
h{7} =
3 5 7 8 11 14 15 17
h{8} =
1 2 4 6 9 10
h{9} =
1 2 4 6 9 10
h{10} =
12 13 16 18 20 22 25 27
h{11} =
13 16 18 20 22 25 27
h{12} =
3 5 7 8 11 14 15 17
h{13} =
3 5 7 8 11 14 15
h{14} =
12 13 16 18 20 22 25 27
h{15} =
3 5 7 8 11 14 15 17
h{16} =
12 13 16 18 20 22 25 27
h{17} =
19 21 24 28 30
h{18} =
21 24 28 30
h{19} =
12 13 16 18 20 22 25 27
h{20} =
19 21 24 28 30
h{21} =
12 13 16 18 20 22 24 25
h{22} =
12 13 16 18 20 22 24 25 27
h{23} =
23 26 29 31 33 34 35
h{24} =
23 26 29 31 33 34 35 37
h{25} =
23 26 29 31 33 34 35 36 37
h{26} =
33 34 35 37 36
h{27} =
31 33 34 35 37
>> celldisp(v)
v{1} =
33 28 18 8 1
v{2} =
34 30 20 11 2
v{3} =
26 19 12 3
v{4} =
35 22 14 4
v{5} =
29 21 13 5
v{6} =
25 15 6
v{7} =
31 24 16 7
v{8} =
37 32 27 17 9
A figure is also generated that draws the lines through each proper set of points:
function [horiz_list, vert_list] = testLines
global counter;
global colours;
close all;
data = [ 475. , 605.75, 1.;
571. , 586.5 , 2.;
233. , 558.5 , 3.;
669.5 , 562.75, 4.;
291.25, 546.25, 5.;
759. , 536.25, 6.;
362.5 , 531.5 , 7.;
448. , 513.5 , 8.;
834.5 , 510. , 9.;
897.25, 486. , 10.;
545.5 , 491.25, 11.;
214.5 , 481.25, 12.;
271.25, 463. , 13.;
646.5 , 466.75, 14.;
739. , 442.75, 15.;
340.5 , 441.5 , 16.;
817.75, 421.5 , 17.;
423.75, 417.75, 18.;
202.5 , 406. , 19.;
519.25, 392.25, 20.;
257.5 , 382. , 21.;
619.25, 368.5 , 22.;
148. , 359.75, 23.;
324.5 , 356. , 24.;
713. , 347.75, 25.;
195. , 335. , 26.;
793.5 , 332.5 , 27.;
403.75, 328. , 28.;
249.25, 308. , 29.;
495.5 , 300.75, 30.;
314. , 279. , 31.;
764.25, 249.5 , 32.;
389.5 , 249.5 , 33.;
475. , 221.5 , 34.;
565.75, 199. , 35.;
802.75, 173.75, 36.;
733. , 176.25, 37.];
figure; hold on;
axis ij;
% Change due to Benoit_11
scatter(data(:,1), data(:,2),40, 'r.'); text(data(:,1)+10, data(:,2)+10, num2str(data(:,3)));
text(data(:,1)+10, data(:,2)+10, num2str(data(:,3)));
% Process your data as above then run the function below(note it has sub functions)
counter = 0;
colours = 'bgrcmy';
[horiz_list, vert_list] = findClosestPoints ( data(:,1), data(:,2) );
function [horiz_list, vert_list] = findClosestPoints ( x, y )
% calc length of points
nX = length(x);
% set up place holder flags
modelledH = false(nX,1);
modelledV = false(nX,1);
horiz_list = {};
vert_list = {};
% loop for all points
for p=1:nX
% have we already modelled a horizontal line through these?
% second last param - true - horizontal, false - vertical
if modelledH(p)==false
[modelledH, index] = ModelPoints ( p, x, y, modelledH, true, true );
horiz_list = [horiz_list index];
else
[~, index] = ModelPoints ( p, x, y, modelledH, true, false );
horiz_list = [horiz_list index];
end
% make a temp copy of the x and y and remove any of the points modelled
% from the horizontal -> this is to avoid them being found in the
% second call.
tempX = x;
tempY = y;
tempX(index) = NaN;
tempY(index) = NaN;
tempX(p) = x(p);
tempY(p) = y(p);
% Have we found a vertial line?
if modelledV(p)==false
[modelledV, index] = ModelPoints ( p, tempX, tempY, modelledV, false, true );
vert_list = [vert_list index];
end
end
end
function [modelled, index] = ModelPoints ( p, x, y, modelled, method, fullRun )
% p - row in your original data matrix
% x - data(:,1)
% y - data(:,2)
% modelled - array of flags to whether rows have been modelled
% method - horizontal or vertical (used to calc graadients)
% fullRun - full calc or just to get indexes
% this could be made better by storing the indexes of each horizontal in the method above
% Settings - play around with these values
gradDelta = 0.2; % find points where gradient is less than this value
gradLimit = 0.45; % if mean gradient of line is above this ignore
numberOfPointsToCheck = 7; % number of points to check when look along the line
% to find other points (this reduces chance of it
% finding other points far away
% I optimised this for your example to be 7
% Try varying it and you will see how it effect the result.
% Find the index of points which are inline.
[index, grad] = CalcIndex ( x, y, p, gradDelta, method );
% check gradient of line
if abs(mean(grad))>gradLimit
index = [];
return
end
% add point of interest to index
index = [p index];
% loop through all points found above to find any other points which are in
% line with these points (this allows for slight curvature
combineIndex = [];
for ii=2:length(index)
% Find inedex of the points found above (find points on curve)
[index2] = CalcIndex ( x, y, index(ii), gradDelta, method, numberOfPointsToCheck, grad(ii-1) );
% Check that the point on this line are on the original (i.e. inline -> not at large angle
if any(ismember(index,index2))
% store points found
combineIndex = unique([index2 combineIndex]);
end
end
% copy to index
index = combineIndex;
if fullRun
% do some plotting
% TODO: here you would need to calculate your arrays to output.
xx = x(index);
[sX,sOrder] = sort(xx);
% Check its found at least 3 points
if length ( index(sOrder) ) > 2
% flag the modelled on the points found
modelled(index(sOrder)) = true;
% plot the data
plot ( x(index(sOrder)), y(index(sOrder)), colours(mod(counter,numel(colours)) + 1));
counter = counter + 1;
end
index = index(sOrder);
end
end
function [index, gradCheck] = CalcIndex ( x, y, p, gradLimit, method, nPoints2Consider, refGrad )
% x - data(:,1)
% y - data(:,2)
% p - point of interest
% method (x/y) or (y\x)
% nPoints2Consider - only look at N points (options)
% refgrad - rather than looking for gradient of closest point -> use this
% - reference gradient to find similar points (finds points on curve)
nX = length(x);
% calculate gradient
for g=1:nX
if method
grad(g) = (x(g)-x(p))\(y(g)-y(p));
else
grad(g) = (y(g)-y(p))\(x(g)-x(p));
end
end
% find distance to all other points
delta = sqrt ( (x-x(p)).^2 + (y-y(p)).^2 );
% set its self = NaN
delta(delta==min(delta)) = NaN;
% find the closest points
[m,order] = sort(delta);
if nargin == 7
% for finding along curve
% set any far away points to be NaN
grad(order(nPoints2Consider+1:end)) = NaN;
% find the closest points to the reference gradient within the allowable limit
index = find(abs(grad-refGrad)<gradLimit==1);
% store output
gradCheck = grad(index);
else
% find the points which are closes to the gradient of the closest point
index = find(abs(grad-grad(order(1)))<gradLimit==1);
% store gradients to output
gradCheck = grad(index);
end
end
end