可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I am perplexed by the API to scipy.ndimage.interpolation.affine_transform
. And judging by this issue I'm not the only one. I'm actually wanting to do more interesting things with affine_transform
than just rotating an image, but a rotation would do for starters. (And yes I'm well aware of scipy.ndimage.interpolation.rotate
, but figuring out how to drive affine_transform
is what interests me here).
When I want to do this sort of thing in systems like OpenGL, I'm think in terms of computing the transform which applies a 2x2 rotation matrix R
about a centre c
, and therefore thinking of points p
being transformed (p-c)R+c
= pR+c-cR
, which gives a c-cR
term to be used as the translation component of a transform. However, according to the issue above, scipy's affine_transform
does "offset first" so we actually need to compute an offset s
such that (p-c)R+c=(p+s)R
which with a bit of rearrangement gives s=(c-cR)R'
where R'
is the inverse of R
.
If I plug this into an ipython notebook (pylab mode; code below maybe needs some additional imports):
img=scipy.misc.lena() #imshow(img,cmap=cm.gray);show() centre=0.5*array(img.shape) a=15.0*pi/180.0 rot=array([[cos(a),sin(a)],[-sin(a),cos(a)]]) offset=(centre-centre.dot(rot)).dot(linalg.inv(rot)) rotimg=scipy.ndimage.interpolation.affine_transform( img,rot,order=2,offset=offset,cval=0.0,output=float32 ) imshow(rotimg,cmap=cm.gray);show()
I get
which unfortunately isn't rotated about the centre.
So what's the trick I'm missing here?
回答1:
Just doing some quick & dirty testing I noticed that taking the negative value of your offset seems to rotate about the centre.
回答2:
Once treddy's answer got me a working baseline, I managed to get a better working model of affine_transform
. It's not actually as odd as the issue linked in the original question hints.
Basically, each point (coordinate) p
in the output image is transformed to pT+s
where T
and s
are the matrix and offset passed to the function. So if we want point c_out
in the output to be mapped to and sampled from c_in
from the input image, with rotation R
and (possibly anisotropic) scaling S
we need pT+s = (p-c_out)RS+c_in
which can be rearranged to yield s = (c_int-c_out)T
(with T=RS
).
For some reason I then need to pass transform.T
to affine_transform
but I'm not going to worry about that too much; probably something to do with row-coordinates with transforms on the right (assumed above) vs column-coordinates with transforms on the left.
So here's a simple test rotating a centred image:
src=scipy.misc.lena() c_in=0.5*array(src.shape) c_out=array((256.0,256.0)) for i in xrange(0,7): a=i*15.0*pi/180.0 transform=array([[cos(a),-sin(a)],[sin(a),cos(a)]]) offset=c_in-c_out.dot(transform) dst=scipy.ndimage.interpolation.affine_transform( src,transform.T,order=2,offset=offset,output_shape=(512,512),cval=0.0,output=float32 ) subplot(1,7,i+1);axis('off');imshow(dst,cmap=cm.gray) show()
Here's it modified for different image sizes
src=scipy.misc.lena()[::2,::2] c_in=0.5*array(src.shape) c_out=array((256.0,256.0)) for i in xrange(0,7): a=i*15.0*pi/180.0 transform=array([[cos(a),-sin(a)],[sin(a),cos(a)]]) offset=c_in-c_out.dot(transform) dst=scipy.ndimage.interpolation.affine_transform( src,transform.T,order=2,offset=offset,output_shape=(512,512),cval=0.0,output=float32 ) subplot(1,7,i+1);axis('off');imshow(dst,cmap=cm.gray) show()
And here's a version with anisotropic scaling to compensate for the anisotropic resolution of the source image.
src=scipy.misc.lena()[::2,::4] c_in=0.5*array(src.shape) c_out=array((256.0,256.0)) for i in xrange(0,7): a=i*15.0*pi/180.0 transform=array([[cos(a),-sin(a)],[sin(a),cos(a)]]).dot(diag(([0.5,0.25]))) offset=c_in-c_out.dot(transform) dst=scipy.ndimage.interpolation.affine_transform( src,transform.T,order=2,offset=offset,output_shape=(512,512),cval=0.0,output=float32 ) subplot(1,7,i+1);axis('off');imshow(dst,cmap=cm.gray) show()
回答3:
Based on the insight from @timday that matrix
and offset
are defined in the output coordinate system, I would offer the following reading of the issue, which fits with standard notations in linear algebra and allows to understand the scaling of images as well. I use here T.inv=T^-1
as pseudo-python notation to mean the inverse of a matrix and *
to mean the dot product.
For each point o
in the output image, affine_transform
finds the corresponding point i
in the input image as i=T.inv*o+s
, where matrix=T.inv
is the inverse of the 2x2 transformation matrix that one would use to define the forward affine transformation and offset=s
is the translation defined in the output coordinates. For a pure rotation T=R=[[cos,-sin],[sin,cos]]
, and in this special case matrix=T.inv=T.T
, which is the reason why @timday had to apply the transposition still (alternatively one could just use the negative angle).
The value for the offset s
is found exactly the way described by @timday: if c_in
is supposed to be positioned, after the affine transformation, at c_out
(e.g. the input centre should be placed at the output centre) then c_in=T.inv*c_out+s
or s=c_in-T.inv*c_out
(note the conventional mathematical order of the matrix product used here, matrix*vector, which is why @timday, who used the revers order, didn't need a transposition at this point in his code).
If one wants a scaling S
first and then a rotation R
it holds that T=R*S
and therefore T.inv=S.inv*R.inv
(note the reversed order). For example, if one wants to make the image double as wide in the columns direction ('x'), then S=diag((1, 2))
, hence S.inv=diag((1, 0.5))
.
src = scipy.misc.lena() c_in = 0.5 * array(src.shape) dest_shape = (512, 1028) c_out = 0.5 * array(dest_shape) for i in xrange(0, 7): a = i * 15.0 * pi / 180.0 rot = array([[cos(a), -sin(a)], [sin(a), cos(a)]]) invRot = rot.T invScale = diag((1.0, 0.5)) invTransform = dot(invScale, invRot) offset = c_in - dot(invTransform, c_out) dest = scipy.ndimage.interpolation.affine_transform( src, invTransform, order=2, offset=offset, output_shape=dest_shape, cval=0.0, output=float32 ) subplot(1, 7, i + 1);axis('off');imshow(dest, cmap=cm.gray) show()
If the image is to be first rotated, then stretched, the order of the dot product needs to be reversed:
invTransform = dot(invRot, invScale)