Question
So I am trying to train a hidden Markov model on a very large feature array: 700 x (400 x 4122), where each 400x4122 mini-array is a sequence of observed samples across 400 time stamps with 4122 features. There are 700 such sequences in total, which amounts to ~45GB of memory when concatenated. My question is: how do you work with an array of this size?
In the hmmlearn Python package, one typically works with multiple sequences as follows:
x1 -> a 400x4122 sequence
x2 -> another 400x4122 sequence
...
xn -> the 700th 400x4122 sequence
X = np.concatenate([x1, x2, ..., xn])
lengths = [len(x1), len(x2), ..., len(xn)]
model = GaussianHMM(n_components=6, ...).fit(X, lengths=lengths)
In other words, one needs to concatenate the entire array of sequences and feed it into the training function. However, I was wondering if there is a way to feed one 400x4122 sequence at a time, as the entire concatenated array is far too large to work with.
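For reference, the concatenation pattern described above can be sketched with toy-sized arrays (shapes shrunk from 400x4122 purely for illustration; the actual fit call is shown commented out since it requires hmmlearn to be installed):

```python
import numpy as np

# Toy stand-ins for the real sequences: 4 sequences of shape 5x3
# instead of 700 sequences of shape 400x4122.
rng = np.random.default_rng(0)
sequences = [rng.standard_normal((5, 3)) for _ in range(4)]

# hmmlearn expects one 2-D array of stacked samples plus a list of
# per-sequence lengths so it knows where each sequence begins and ends.
X = np.concatenate(sequences)          # shape (20, 3)
lengths = [len(s) for s in sequences]  # [5, 5, 5, 5]

print(X.shape, lengths)

# Fitting would then look like:
# from hmmlearn.hmm import GaussianHMM
# model = GaussianHMM(n_components=6).fit(X, lengths=lengths)
```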
Thanks in advance.
Source: https://stackoverflow.com/questions/40294642/python-passing-multiple-large-sequences-through-hmmlearn