I'm trying to find the fastest way of standardizing a matrix in MATLAB (zero mean, unit variance columns). It all comes down to which is the quickest way of applying the same operation to all rows of a matrix. Every post I've read comes to the same conclusion: use bsxfun instead of repmat. This article, written by MathWorks, is an example: http://blogs.mathworks.com/loren/2008/08/04/comparing-repmat-and-bsxfun-performance/
However, when trying this on my own computer repmat is always quicker. Here are my results using the same code as in the article:
m = 1e5;
n = 100;
A = rand(m,n);
frepmat = @() A - repmat(mean(A),size(A,1),1);   % subtract column means via repmat
timeit(frepmat)
fbsxfun = @() bsxfun(@minus,A,mean(A));          % subtract column means via bsxfun
timeit(fbsxfun)
Results:
ans =
0.0349
ans =
0.0391
In fact, I can never get bsxfun to perform better than repmat in this situation no matter how small or large the input matrix is.
Can someone explain this?
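For context, the full standardization I'm ultimately after would look roughly like this with either approach (the std scaling step is just a sketch of the end goal, not part of the timed code above):

mu    = mean(A);                 % column means
sigma = std(A);                  % column standard deviations
% zero mean, unit variance columns via repmat
Z1 = (A - repmat(mu,size(A,1),1)) ./ repmat(sigma,size(A,1),1);
% the same via bsxfun
Z2 = bsxfun(@rdivide, bsxfun(@minus, A, mu), sigma);

(If you have the Statistics Toolbox, zscore(A) does the same thing, but here I'm only interested in the repmat/bsxfun comparison.)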
Most of the advice you're reading, including the blog post from Loren, likely refers to older versions of MATLAB, in which bsxfun was quite a bit faster than repmat. In R2013b (see the "Performance" section of the release notes), repmat was reimplemented to give large performance improvements when applied to numeric, char and logical arguments. In recent versions it can be about the same speed as bsxfun.
For what it's worth, on my machine with R2014a I get
m = 1e5;
n = 100;
A = rand(m,n);
frepmat = @() A - repmat(mean(A),size(A,1),1);
timeit(frepmat)
fbsxfun = @() bsxfun(@minus,A,mean(A));
timeit(fbsxfun)
ans =
0.03756
ans =
0.034831
so it looks like bsxfun is still a tiny bit faster, but not by much - and on your machine it seems the reverse is the case. Of course, these results are likely to vary again if you vary the size of A or the operation you're applying.
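(As an aside beyond the releases benchmarked here: from R2016b onward MATLAB applies implicit expansion to arithmetic operators, so this particular operation needs neither function:

B = A - mean(A);   % implicit expansion, R2016b and later

which makes the repmat-vs-bsxfun question moot for simple cases like this one.)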
There may still be other reasons to prefer one solution over the other, such as elegance (I prefer bsxfun, if possible).
Edit: commenters have asked for a specific reason to prefer bsxfun, suggesting that it might use less memory than repmat by avoiding a temporary copy that repmat makes.
I don't think this is actually the case. For example, open Task Manager (or the equivalent on Linux/Mac), watch the memory levels, and type:
>> m = 1e5; n = 8e3; A = rand(m,n);
>> B = A - repmat(mean(A),size(A,1),1);
>> clear B
>> C = bsxfun(@minus,A,mean(A));
>> clear C
(Adjust m and n until the jumps are visible in the graph, but not so big that you run out of memory.)
I see exactly the same behaviour from both repmat and bsxfun, which is that memory rises smoothly to the new level (basically double the size of A) with no temporary additional peak.
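If you'd rather not eyeball Task Manager, on Windows you can also snapshot MATLAB's memory use programmatically with the memory function; a rough sketch (calling memory with an output argument is Windows-only, and the numbers will vary by system):

s0 = memory;                                   % baseline usage
B = A - repmat(mean(A),size(A,1),1);
s1 = memory;
fprintf('repmat: +%.0f MB\n',(s1.MemUsedMATLAB - s0.MemUsedMATLAB)/2^20);
clear B
s0 = memory;
C = bsxfun(@minus,A,mean(A));
s1 = memory;
fprintf('bsxfun: +%.0f MB\n',(s1.MemUsedMATLAB - s0.MemUsedMATLAB)/2^20);
clear C

Note that this only shows the before/after levels, not any transient peak inside a single call, so watching the graph is still the way to spot a temporary spike.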
This is also the case even if the operation is done in-place. Again, watch the memory and type:
>> m = 1e5; n = 8e3; A = rand(m,n);
>> A = A - repmat(mean(A),size(A,1),1);
>> clear all
>> m = 1e5; n = 8e3; A = rand(m,n);
>> A = bsxfun(@minus,A,mean(A));
Again, I see exactly the same behaviour from both repmat and bsxfun: memory rises to a peak (basically double the size of A), and then falls back to the previous level.
So I'm afraid I can't see much technical difference in terms of either speed or memory between repmat and bsxfun. My preference for bsxfun is really just a personal preference, as it feels a bit more elegant.
Source: https://stackoverflow.com/questions/28722723/matlab-bsxfun-no-longer-faster-than-repmat