Question
I have a large dataset of around 200000 data points, where each data point contains 132 features, so my dataset is 200000 x 132.
I have done all the computations using the Armadillo framework. However, when I tried to do a PCA analysis I received a memory error, and I don't know whether it is caused by my RAM (8 GB) or by a limitation of the framework itself.
I receive the following error: requested size is too large.
Can you recommend another framework for PCA computation which doesn't have size/memory limitations? Or, if you have previously used Armadillo for PCA computation and encountered this issue, can you tell me how you managed to solve it?
Answer 1:
You probably need to enable the use of 64-bit integers within Armadillo; they are used for storing the total number of elements, among other things.
Specifically, edit the file include/armadillo_bits/config.hpp and uncomment the line containing // #define ARMA_64BIT_WORD. In version 3.4 this should be near line 59.
Alternatively, you can define ARMA_64BIT_WORD before including the Armadillo header in your program, e.g.:
#define ARMA_64BIT_WORD
#include <armadillo>
#include <iostream>
...
Note that your C++ compiler must be able to handle 64-bit integers; most compilers these days do.
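
To put the pieces together, here is a minimal sketch of a complete program that defines ARMA_64BIT_WORD and runs PCA on a matrix of the size described in the question; the random data is only a stand-in for the real dataset, and the princomp() overload shown assumes a reasonably recent Armadillo version.

#define ARMA_64BIT_WORD   // must appear before the Armadillo header
#include <armadillo>
#include <iostream>

int main()
{
    // 200000 observations x 132 features; random values stand in for real data
    arma::mat X = arma::randu<arma::mat>(200000, 132);

    arma::mat coeff;    // principal component coefficients (loadings)
    arma::mat score;    // data projected onto the principal components
    arma::vec latent;   // eigenvalues of the covariance matrix of X

    // PCA via Armadillo's built-in princomp()
    arma::princomp(coeff, score, latent, X);

    std::cout << "first 5 eigenvalues:\n" << latent.head(5) << std::endl;
    return 0;
}

On most systems this would be compiled with something like g++ pca.cpp -O2 -o pca -larmadillo, assuming Armadillo and its LAPACK/BLAS dependencies are installed.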
Source: https://stackoverflow.com/questions/13480410/c-framework-for-computing-pca-other-than-armadillo