I have 2 19x19 square matrices (a & b) and I am trying to use the slash (mrdivide) operator to perform a division such that
c = a / b
In MATLAB, for two matrices of compatible dimensions, mrdivide computes a / b, which is equivalent to a * b^(-1), where b^(-1) is the inverse of b. As such, what you could do is invert the matrix b first, then pre-multiply the result by a.
One method is to use cv::invert on the matrix b and then pre-multiply the result by a. This could be done with the following function definition (borrowing from the code in your post above):
cv::Mat mrdivide(const cv::Mat& A, const cv::Mat& B)
{
    // A / B == A * inv(B)
    cv::Mat bInvert;
    cv::invert(B, bInvert);
    return A * bInvert;
}
Another way is to use the inv() method that's built into the cv::Mat interface and just multiply the matrices directly:
cv::Mat mrdivide(const cv::Mat& A, const cv::Mat& B)
{
    return A * B.inv();
}
I'm not sure which one is faster, so you may have to run some tests, but either method should work. However, to provide a bit of insight on possible timings, there are three ways to invert a matrix in OpenCV: you select one by overriding the third parameter of cv::invert, or by specifying the method in cv::Mat::inv().
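For instance, here's a quick sketch (the matrix values are placeholders; the DECOMP_* constants are the ones declared in the OpenCV core headers):

cv::Mat B = ..., bInv;
cv::invert(B, bInv, cv::DECOMP_LU);       // LU decomposition, the default
cv::invert(B, bInv, cv::DECOMP_CHOLESKY); // for symmetric positive-definite B only
cv::invert(B, bInv, cv::DECOMP_SVD);      // slowest, but handles singular/ill-conditioned B
// or equivalently, through the Mat interface:
bInv = B.inv(cv::DECOMP_SVD);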
This StackOverflow post goes through the timings of inverting a matrix for a relatively large matrix size using the three methods: Fastest method in inverse of matrix
You should NOT be using inv to solve Ax=b or xA=b equations. While the two methods are mathematically equivalent (x = solve(A,b) and x = inv(A)*b), it's a completely different thing when working with floating-point numbers!
http://www.johndcook.com/blog/2010/01/19/dont-invert-that-matrix/
As a general rule, never multiply by a matrix inverse. Instead, use the forward/backward-slash operators (or the equivalent "solve" methods) for a one-off system, or explicitly perform matrix factorization (think LU, QR, Cholesky, etc.) when you want to reuse the same A with multiple b's.
Let me give a concrete example to illustrate the problem with inverting. I'll be using MATLAB along with mexopencv, a library that allows us to call OpenCV directly from MATLAB.
(This example is borrowed from this excellent FEX submission by Tim Davis, the same guy behind SuiteSparse. I'm showing the case of left-division Ax=b, but the same applies for right-division xA=b.)
Let's first build some matrices for the Ax=b system:
% Ax = b
N = 16; % square matrix dimensions
x0 = ones(N,1); % true solution
A = gallery('frank',N); % matrix with ill-conditioned eigenvalues
b = A*x0; % Ax=b system
Here's what the 16x16 matrix A and the 16x1 vector b look like (note that the true solution x0 is just a vector of ones):
A = b =
16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 136
15 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 135
0 14 14 13 12 11 10 9 8 7 6 5 4 3 2 1 119
0 0 13 13 12 11 10 9 8 7 6 5 4 3 2 1 104
0 0 0 12 12 11 10 9 8 7 6 5 4 3 2 1 90
0 0 0 0 11 11 10 9 8 7 6 5 4 3 2 1 77
0 0 0 0 0 10 10 9 8 7 6 5 4 3 2 1 65
0 0 0 0 0 0 9 9 8 7 6 5 4 3 2 1 54
0 0 0 0 0 0 0 8 8 7 6 5 4 3 2 1 44
0 0 0 0 0 0 0 0 7 7 6 5 4 3 2 1 35
0 0 0 0 0 0 0 0 0 6 6 5 4 3 2 1 27
0 0 0 0 0 0 0 0 0 0 5 5 4 3 2 1 20
0 0 0 0 0 0 0 0 0 0 0 4 4 3 2 1 14
0 0 0 0 0 0 0 0 0 0 0 0 3 3 2 1 9
0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 1 5
0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 2
Now let's compare cv::invert against cv::solve by finding the solution and computing the residual error using the NORM function (or cv::norm if you want):
% inverting (OpenCV)
x1 = cv.invert(A)*b;
r1 = norm(A*x1-b)
% inverting (MATLAB)
x2 = inv(A)*b;
r2 = norm(A*x2-b)
% solve using matrix factorization (OpenCV)
x3 = cv.solve(A,b);
r3 = norm(A*x3-b)
% solve using matrix factorization (MATLAB)
x4 = A\b;
r4 = norm(A*x4-b)
Below are the solutions found (I subtract 1 so you can see how far off they are from the true solution x0):
>> format short g
>> [x1 x2 x3 x4] - 1
ans =
9.0258e-06 3.1086e-15 -1.1102e-16 2.2204e-16
-0.0011101 -1.0181e-13 -2.2204e-15 -2.3315e-15
-0.0016212 -2.5123e-12 3.3751e-14 3.3307e-14
0.0037279 4.1745e-11 -4.3476e-13 -4.3487e-13
-0.0022119 4.6216e-10 5.2165e-12 5.216e-12
-0.0010476 1.3224e-09 -5.7384e-11 -5.7384e-11
0.0035461 2.2614e-08 5.7384e-10 5.7384e-10
-0.0040074 -4.1533e-07 -5.1646e-09 -5.1645e-09
0.0036477 -4.772e-06 4.1316e-08 4.1316e-08
-0.0033358 4.7499e-06 -2.8922e-07 -2.8921e-07
0.0059112 -0.00010352 1.7353e-06 1.7353e-06
-0.0043586 0.00044539 -8.6765e-06 -8.6764e-06
0.0069238 -0.0024718 3.4706e-05 3.4706e-05
-0.0019642 -0.0079952 -0.00010412 -0.00010412
0.0039284 0.01599 0.00020824 0.00020823
-0.0039284 -0.01599 -0.00020824 -0.00020823
And most importantly, here are the errors of each method:
r1 =
0.1064
r2 =
0.060614
r3 =
1.4321e-14
r4 =
1.7764e-15
The last two are orders of magnitude more accurate; it's not even close! And this was just with a system of 16 variables. Inverting becomes even less numerically reliable when the matrices are large and sparse...
Now to answer your question, you had the right idea of using cv::solve, but you just got the order of operands wrong in the case of right-division.
In MATLAB, the operators / and \ (or mrdivide and mldivide) are related to each other by the equation B/A = (A'\B')' (this is a simple result of transpose properties).

So with the OpenCV functions, you would write (note the order of A and b):
% Ax = b
x = cv.solve(A, b); % A\b or mldivide(A,b)
% xA = b
x = cv.solve(A', b')'; % b/A or mrdivide(b,A)
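For the C++ side of things, a solve-based mrdivide could be sketched like this (my own sketch, not an OpenCV API; it assumes A is square and non-singular, so the default cv::DECOMP_LU method applies):

cv::Mat mrdivide(const cv::Mat& B, const cv::Mat& A)
{
    // x*A = B  <=>  A'*x' = B', so solve the transposed system and transpose back
    cv::Mat x;
    cv::solve(A.t(), B.t(), x, cv::DECOMP_LU);
    return x.t();
}

Unlike the invert-based versions at the top of this post, this one never forms the inverse explicitly.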
The API exposed by OpenCV is a bit awkward here, so we had to do all these transposes. In fact, if you refer to the equivalent LAPACK routines (think DGESV or DGESVX), they actually allow you to specify whether the matrix is transposed (TRANS='T') or not (TRANS='N'); at that level, transposition is really just a different memory layout, C or Fortran ordering. MATLAB, for instance, provides the linsolve function, which allows you to specify this sort of thing in its options...
(BTW, when coding in C++ with OpenCV, I prefer to use the function form of operations like cv::transpose as opposed to the matrix-expression variants like Mat::t; the former can operate in-place, while the latter creates unnecessary temporary copies.)
Now if you're looking for a high-performance linear-algebra implementation in C++, consider using Eigen (it even integrates nicely with OpenCV). Plus, it's a pure template-based library, so there's no linking or binaries to worry about; just include the header files.
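As a quick sketch of the factor-once, solve-many pattern mentioned earlier, here's what that could look like in Eigen (the matrices are placeholders, and this assumes A is square and invertible; Eigen offers other decompositions as well):

#include <Eigen/Dense>

Eigen::MatrixXd A = ..., b1 = ..., b2 = ...;
Eigen::PartialPivLU<Eigen::MatrixXd> lu(A); // factor A once (LU with partial pivoting)
Eigen::MatrixXd x1 = lu.solve(b1);          // reuse the factorization
Eigen::MatrixXd x2 = lu.solve(b2);          // for multiple right-hand sides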
@Goz: "look up Return Value Optimisation. The 'unnecessary temporary copies' don't exist"
I'm aware of RVO and move semantics, but it's not of much importance here; the cv::Mat class is copy-friendly anyway, kind of like a reference-counted smart pointer, meaning it only does a shallow copy with data sharing when passed by value. The only parts created for the new copy are the fields of the Mat header, which are insignificant in size (storing things like the number of dimensions/channels, step sizes, and data type). I was talking about an explicit deep copy, not the one you're thinking of when returning from function calls...
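Here's a minimal sketch of that shallow-copy behavior (the values are arbitrary):

cv::Mat A = cv::Mat::zeros(3, 3, CV_64F);
cv::Mat B = A;              // shallow copy: only the header is duplicated
B.at<double>(0,0) = 42.0;   // A sees this change too, since the data is shared
cv::Mat C = A.clone();      // clone() is the explicit deep copy, with its own buffer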
Thanks to your comment, I got motivated to actually dig through the OpenCV sources, which are not the easiest thing to read... The code has little to no comments and can be hard to follow at times. The complexity is understandable, seeing that OpenCV really cares about performance, and it's actually impressive that many functions are implemented in various ways: regular CPU implementations, loop-unrolled versions, SIMD-vectorized versions (SSE, AVX, NEON, etc.), parallel and threaded versions using various backends, optimized implementations from Intel IPP, GPU-accelerated versions with OpenCL or CUDA, mobile-accelerated versions for Tegra, OpenVX, etc.
Let's take the following case and trace our steps:
Mat A = ..., b = ..., x;
cv::solve(A.t(), b, x);
where the function is defined like:
bool cv::solve(InputArray _src, InputArray _src2arg, OutputArray _dst, int method)
{
    Mat src = _src.getMat(), _src2 = _src2arg.getMat();
    _dst.create( src.cols, _src2.cols, src.type() );
    Mat dst = _dst.getMat();
    ...
}
Now we have to figure out the steps in between. The first thing we have is the t member method:
MatExpr Mat::t() const
{
    MatExpr e;
    MatOp_T::makeExpr(e, *this);
    return e;
}
This returns a MatExpr, a class that allows lazy evaluation of matrix expressions. In other words, it will not perform the transpose right away; instead, it stores a reference to the original matrix and the operation to eventually perform on it (transpose), holding off on evaluating until it is absolutely necessary (for example, when assigned or cast to a cv::Mat).
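A tiny sketch of that lazy behavior:

cv::Mat A = cv::Mat::eye(3, 3, CV_64F);
cv::MatExpr e = A.t(); // nothing is computed yet; e just records "transpose A"
cv::Mat B = e;         // the conversion to Mat is what finally triggers cv::transpose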
Next, let's see the definitions of the relevant parts. Note that in the actual code these things are split across many files; I've only pieced together the interesting parts here for easier reading, but it's far from the complete thing:
class MatExpr
{
public:
    MatExpr()
        : op(0), flags(0), a(Mat()), b(Mat()), c(Mat()), alpha(0), beta(0), s()
    {}

    explicit MatExpr(const Mat& m)
        : op(&g_MatOp_Identity), flags(0), a(m), b(Mat()), c(Mat()),
          alpha(1), beta(0), s(Scalar())
    {}

    MatExpr(const MatOp* _op, int _flags, const Mat& _a = Mat(),
            const Mat& _b = Mat(), const Mat& _c = Mat(),
            double _alpha = 1, double _beta = 1, const Scalar& _s = Scalar())
        : op(_op), flags(_flags), a(_a), b(_b), c(_c), alpha(_alpha), beta(_beta), s(_s)
    {}

    MatExpr t() const
    {
        MatExpr e;
        op->transpose(*this, e);
        return e;
    }

    MatExpr inv(int method) const
    {
        MatExpr e;
        op->invert(*this, method, e);
        return e;
    }

    operator Mat() const
    {
        Mat m;
        op->assign(*this, m);
        return m;
    }

public:
    const MatOp* op;
    int flags;
    Mat a, b, c;
    double alpha, beta;
    Scalar s;
};
Mat& Mat::operator = (const MatExpr& e)
{
    e.op->assign(e, *this);
    return *this;
}

MatExpr operator * (const MatExpr& e1, const MatExpr& e2)
{
    MatExpr en;
    e1.op->matmul(e1, e2, en);
    return en;
}
So far this is straightforward. The class is supposed to store the input matrix in a (again, cv::Mat instances share data, so no copying), along with the operation to perform in op, and a few other things not important for us.
Here's the matrix-operation class MatOp, and some of its subclasses (I'm only showing the transpose and inverse operations, but there are more):
class MatOp
{
public:
    MatOp();
    virtual ~MatOp();

    virtual void assign(const MatExpr& expr, Mat& m, int type=-1) const = 0;

    virtual void transpose(const MatExpr& expr, MatExpr& res) const
    {
        Mat m;
        expr.op->assign(expr, m);
        MatOp_T::makeExpr(res, m, 1);
    }

    virtual void invert(const MatExpr& expr, int method, MatExpr& res) const
    {
        Mat m;
        expr.op->assign(expr, m);
        MatOp_Invert::makeExpr(res, method, m);
    }
};
class MatOp_T : public MatOp
{
public:
    MatOp_T() {}
    virtual ~MatOp_T() {}

    void assign(const MatExpr& e, Mat& m, int _type=-1) const
    {
        Mat temp, &dst = _type == -1 || _type == e.a.type() ? m : temp;
        cv::transpose(e.a, dst);
        if( dst.data != m.data || e.alpha != 1 ) dst.convertTo(m, _type, e.alpha);
    }

    void transpose(const MatExpr& e, MatExpr& res) const
    {
        if( e.alpha == 1 )
            MatOp_Identity::makeExpr(res, e.a);
        else
            MatOp_AddEx::makeExpr(res, e.a, Mat(), e.alpha, 0);
    }

    static void makeExpr(MatExpr& res, const Mat& a, double alpha=1)
    {
        res = MatExpr(&g_MatOp_T, 0, a, Mat(), Mat(), alpha, 0);
    }
};
class MatOp_Invert : public MatOp
{
public:
    MatOp_Invert() {}
    virtual ~MatOp_Invert() {}

    void assign(const MatExpr& e, Mat& m, int _type=-1) const
    {
        Mat temp, &dst = _type == -1 || _type == e.a.type() ? m : temp;
        cv::invert(e.a, dst, e.flags);
        if( dst.data != m.data ) dst.convertTo(m, _type);
    }

    void matmul(const MatExpr& e1, const MatExpr& e2, MatExpr& res) const
    {
        if( isInv(e1) && isIdentity(e2) )
            MatOp_Solve::makeExpr(res, e1.flags, e1.a, e2.a);
        else if( this == e2.op )
            MatOp::matmul(e1, e2, res);
        else
            e2.op->matmul(e1, e2, res);
    }

    static void makeExpr(MatExpr& res, int method, const Mat& m)
    {
        res = MatExpr(&g_MatOp_Invert, method, m, Mat(), Mat(), 1, 0);
    }
};
static MatOp_Identity g_MatOp_Identity;
static MatOp_T g_MatOp_T;
static MatOp_Invert g_MatOp_Invert;
OpenCV heavily uses operator overloading, so all kinds of operations like A+B, A-B, A*B, ... actually map to the corresponding matrix-expression operations.
The final part of the puzzle is the proxy class InputArray. It basically stores a void* pointer along with info about the thing passed (what kind it is: Mat, MatExpr, Matx, vector<T>, UMat, etc.), so that it knows how to cast the pointer back when requested with something like InputArray::getMat:
typedef const _InputArray& InputArray;

class _InputArray
{
public:
    _InputArray(const MatExpr& expr)
    { init(FIXED_TYPE + FIXED_SIZE + EXPR + ACCESS_READ, &expr); }

    void init(int _flags, const void* _obj)
    { flags = _flags; obj = (void*)_obj; }

    Mat getMat_(int i) const
    {
        int k = kind();
        int accessFlags = flags & ACCESS_MASK;
        ...
        if( k == EXPR ) {
            CV_Assert( i < 0 );
            return (Mat)*((const MatExpr*)obj);
        }
        ...
        return Mat();
    }

protected:
    int flags;
    void* obj;
    Size sz;
};
So now we see how Mat::t creates and returns a MatExpr instance, which is then received by cv::solve as an InputArray. When cv::solve calls InputArray::getMat to retrieve the matrix, it effectively converts the stored MatExpr to a Mat by calling the cast operator:
MatExpr::operator Mat() const
{
    Mat m;
    op->assign(*this, m);
    return m;
}
So it declares a new matrix m and calls MatOp_T::assign with the new matrix as the destination. In turn, this forces evaluation by finally calling cv::transpose, which computes the transposed result into this new matrix.
So we end up having two copies, the original A and the transposed A.t() returned.
Now with all that said, compare it against:
Mat A = ..., b = ..., x;
cv::transpose(A, A);
cv::solve(A, b, x);
In this case, A is transposed in-place, and with fewer levels of abstraction.
Now, the reason I showed all of that is not to argue about this one extra copy; after all, it's not that big of a deal :) The really neat thing I found out is that the following two expressions do not do the same thing and give different results (and I'm not talking about whether the inverse is computed in-place or not):
Mat A = ..., b = ..., x;
cv::invert(A, A);
x = A*b;

Mat A = ..., b = ..., x;
x = A.inv()*b;
It turns out that the second one is in fact smart enough to call cv::solve(A,b)! Go back to MatOp_Invert::matmul, which is called when a lazy invert is later chained with another lazy matrix multiplication:
void MatOp_Invert::matmul(const MatExpr& e1, const MatExpr& e2, MatExpr& res) const
{
    if( isInv(e1) && isIdentity(e2) )
        MatOp_Solve::makeExpr(res, e1.flags, e1.a, e2.a);
    ...
}
It checks whether the first operand in the expression A.inv()*b is an invert operation and the second operand is an identity operation (i.e. a plain matrix, not the result of another complicated expression). In that case it changes the stored operation to a lazy solve operation, MatOp_Solve (which is similarly a wrapper for the cv::solve function). IMO that's pretty smart! Even though you wrote A.inv()*b, it won't actually compute the inverse; instead, it understands that it is better to rewrite the expression and solve the system using matrix factorization.
Unfortunately for you, this will only benefit expressions of the form A.inv()*b, not the other way around, b*A.inv() (that one will end up actually computing the inverse, which is not what we want). So in your case of solving xA=b, you should stick with explicitly calling cv::solve...
Of course, this is only applicable when coding in C++ (thanks to the magic of operator overloading and lazy expressions). If you're using OpenCV from another language through some wrapper (like Python, Java, or MATLAB), you're probably not getting any of that, and should be explicit in using cv::solve like I did in the previous MATLAB code, for both cases Ax=b and xA=b.
Hope this helps, and sorry for the long post ;)