Strange behaviour when computing svd on a covariance matrix: different results between Microsoft R and vanilla R

巧了我就是萌 submitted on 2020-01-06 22:51:30

Question


I was doing some principal component analysis on my MacBook running Microsoft R 3.3.0 when I got some strange results. Double-checking with a colleague, I realised that the output of the svd function differed from what I get with vanilla R.

Here is a reproducible example: please load the file (~78 MB) here

With Microsoft R 3.3.0 (x86_64-apple-darwin14.5.0) I get:

> sv <- svd(Cx)
> print(sv$d[1:10])

 [1] 122.73664 104.45759  90.52001  87.21890  81.28256  74.33418  73.29427  66.26472  63.51379
[10]  55.20763

With vanilla R (both R 3.3 and R 3.3.1, on two different Linux machines) I get instead:

> sv <- svd(Cx)
> print(sv$d[1:10])

 [1] 122.73664  34.67177  18.50610  14.04483   8.35690   6.80784   6.14566
 [8]   3.91788   3.76016   2.66381

This does not happen with all data: if I create a random matrix and apply svd to it, I get the same results on both. So it looks like some sort of numerical instability, doesn't it?

UPDATE: I've computed the SVD of the same matrix (Cx) on the same machine (MacBook) with the same version of R, this time using the svd package, and I finally get the "right" numbers. So the problem seems to be due to the svd implementation used by Microsoft R Open.

UPDATE: The behaviour also occurs on MRO 3.3.1.


Answer 1:


This is a typical example of an ill-conditioned matrix. Some of its singular values are close to zero, which makes the SVD numerically sensitive to differences between implementations; that is probably what you are seeing.
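
One way to see how ill-conditioned a matrix is is to compare its largest and smallest singular values. The sketch below is only an illustration: the matrix C is a made-up stand-in for Cx (which is not reproduced here), built from two nearly collinear columns so that one singular value is close to zero.

```r
# Hypothetical stand-in for Cx: a covariance matrix with nearly collinear columns.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
X  <- cbind(x1, x1 + 1e-6 * rnorm(n), rnorm(n))  # column 2 is almost a copy of column 1
C  <- cov(X)

d    <- svd(C)$d               # singular values, largest first
cond <- d[1] / d[length(d)]    # 2-norm condition number: largest / smallest

print(d)
print(cond)

# Base R's kappa() with exact = TRUE derives the condition number from the SVD as well.
print(kappa(C, exact = TRUE))
```

A very large ratio means the small singular values carry few reliable digits, so two correct implementations may disagree on them. It does not by itself explain a disagreement in the large singular values.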




Answer 2:


This appears to be a bug, as confirmed on the microsoft-r-open GitHub. They say the behaviour is under investigation and is related to the Accelerate library on macOS.
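
When the linear algebra backend itself is suspect, a few self-consistency checks on the svd output can help decide which result to trust. The following is a minimal sketch, assuming the input is a symmetric positive semi-definite matrix such as a covariance matrix; check_svd is a hypothetical helper written for this illustration, not a function from any package.

```r
# Hypothetical helper: relative errors of three consistency checks on an SVD
# of a symmetric positive semi-definite matrix C.
check_svd <- function(C, sv = svd(C)) {
  scale <- max(abs(C))

  # 1. U diag(d) V^T should reproduce C.
  #    sv$d * t(sv$v) scales row i of t(V) by d[i], i.e. diag(d) %*% t(V).
  recon     <- sv$u %*% (sv$d * t(sv$v))
  recon_err <- max(abs(recon - C)) / scale

  # 2. For a symmetric PSD matrix the singular values sum to the trace.
  tr        <- sum(diag(C))
  trace_err <- abs(sum(sv$d) - tr) / tr

  # 3. For a symmetric PSD matrix the singular values equal the eigenvalues.
  ev      <- eigen(C, symmetric = TRUE, only.values = TRUE)$values
  eig_err <- max(abs(sv$d - ev)) / scale

  c(recon = recon_err, trace = trace_err, eigen = eig_err)
}

# Usage on a small random covariance matrix (made-up data, not the Cx from the question).
set.seed(42)
X    <- matrix(rnorm(500 * 20), nrow = 500)
C    <- cov(X)
errs <- check_svd(C)
print(errs)
```

All three values should be near machine precision for a correct decomposition. Note that eigen() may rely on the same underlying library as svd(), so the reconstruction and trace checks are the more independent ones.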



Source: https://stackoverflow.com/questions/40052770/strange-behaviour-when-computing-svd-on-a-covariance-matrix-different-results-b
