I was doing some principal component analysis on my MacBook running Microsoft R 3.3.0 when I got some strange results. After double-checking with a colleague, I realised that the output of the svd function was different from what I get with vanilla R.
To reproduce the result, please load the file (~78 MB) here.
With Microsoft R 3.3.0 (x86_64-apple-darwin14.5.0) I get:
> sv <- svd(Cx)
> print(sv$d[1:10])
[1] 122.73664 104.45759 90.52001 87.21890 81.28256 74.33418 73.29427 66.26472 63.51379
[10] 55.20763
With vanilla R (both R 3.3 and R 3.3.1, on two different Linux machines) I get instead:
> sv <- svd(Cx)
> print(sv$d[1:10])
[1] 122.73664 34.67177 18.50610 14.04483 8.35690 6.80784 6.14566
[8] 3.91788 3.76016 2.66381
This does not happen with all data: if I create a random matrix and apply svd to it, both installations give the same results. So it looks like some sort of numerical instability, doesn't it?
UPDATE: I've computed the SVD of the same matrix (Cx) on the same machine (MacBook) with the same version of R, this time using the svd package, and I finally get the "right" numbers. So the problem seems to be due to the svd implementation used by Microsoft R Open.
UPDATE: The same behaviour also occurs on MRO 3.3.1.
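For anyone who wants to test their own installation without the extra package, here is a minimal sketch of a cross-check in base R. It is not the method used above (that was the svd package); instead it checks the result of svd against properties any correct SVD must satisfy: the factors must reconstruct the matrix, the squared singular values must sum to the squared Frobenius norm, and the singular values must match the square roots of the eigenvalues of t(A) %*% A. The helper name check_svd and the tolerance are my own assumptions, and the random matrix only stands in for Cx.

```r
# Hypothetical helper (not from any package): checks svd(A) against
# properties that every correct SVD has to satisfy.
check_svd <- function(A, tol = 1e-8) {
  s <- svd(A)

  # 1. Reconstruction: U %*% diag(d) %*% t(V) should give back A.
  #    s$d * t(s$v) scales row i of t(V) by d[i], i.e. diag(d) %*% t(V).
  recon <- s$u %*% (s$d * t(s$v))
  rel_resid <- norm(A - recon, "F") / norm(A, "F")

  # 2. Sum of squared singular values equals the squared Frobenius norm.
  frob_gap <- abs(sum(s$d^2) - sum(A^2)) / sum(A^2)

  # 3. Independent estimate from the eigenvalues of t(A) %*% A.
  ev <- eigen(crossprod(A), symmetric = TRUE, only.values = TRUE)$values
  d_eig <- sqrt(pmax(ev, 0))

  k <- min(10, length(s$d))
  list(rel_resid = rel_resid,
       frob_gap  = frob_gap,
       d_svd     = s$d[1:k],
       d_eig     = d_eig[1:k],
       ok        = rel_resid < tol && frob_gap < tol)
}

# Stand-in data; replace A with Cx to test the real matrix.
set.seed(42)
A <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)
res <- check_svd(A)
print(res$ok)
```

If ok comes back FALSE, or d_svd and d_eig disagree in the leading values, the svd implementation itself is suspect rather than the data.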
This is a typical example of an ill-conditioned matrix. Some of its singular values are close to zero, which makes the decomposition numerically sensitive to differences between SVD implementations; that is probably what you are seeing.
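A rough way to check for ill-conditioning is to look at the ratio between the largest and the smallest singular value. The sketch below builds a hypothetical test matrix with prescribed singular values (the matrix, its size and the range of values are assumptions for illustration, not the data from the question) and computes that ratio in base R.

```r
# Build an n x n matrix with known singular values spanning
# 14 orders of magnitude (illustrative values only).
set.seed(1)
n <- 50
U <- qr.Q(qr(matrix(rnorm(n * n), n, n)))  # random orthogonal matrix
V <- qr.Q(qr(matrix(rnorm(n * n), n, n)))  # random orthogonal matrix
d_true <- 10^seq(2, -12, length.out = n)
A <- U %*% diag(d_true) %*% t(V)

# Singular values only; nu = 0 and nv = 0 skip the singular vectors.
d <- svd(A, nu = 0, nv = 0)$d

# 2-norm condition number: largest over smallest singular value.
cond <- d[1] / d[n]

# Relative error of the five largest computed singular values.
rel_err <- abs(d[1:5] - d_true[1:5]) / d_true[1:5]

print(cond)
print(rel_err)
```

A very large ratio means the small singular values are only known to a few digits. The large ones are normally still computed accurately, so it is worth comparing which part of the spectrum differs between implementations.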
This appears to be a bug, as confirmed on the microsoft-r-open GitHub. They say the behaviour is under investigation and that it is related to the Accelerate library on macOS.