pytables

Pip does not acknowledge Cython

Submitted by 十年热恋 on 2019-12-21 04:38:13
Question: I just installed pip and Python via Homebrew on a fresh Mac OS installation. First of all, pip is not installing dependencies at all, which forces me to re-run 'pip install tables' three times: each run reports a missing dependency, I install it, then rerun. Is this expected behavior? Second, pip does not recognize the Cython installation it performed itself moments ago: $ pip show cython --- Name: Cython Version: 0.21 Location: /usr/local/lib/python2.7/site

Release hdf5 disk memory after table or node removal with pytables or pandas

Submitted by 非 Y 不嫁゛ on 2019-12-19 08:09:09
Question: I'm using HDFStore with pandas / pytables. After removing a table or object, the hdf5 file size remains unaffected. This space seems to be reused when additional objects are added to the store, but it can be an issue if a large amount of space is wasted. I have not found any command in the pandas or pytables APIs that recovers hdf5 memory. Do you know of any mechanism to improve data management in hdf5 files? Answer 1: See here; you need to ptrepack it, which rewrites the file. ptrepack -

What is the advantage of PyTables? [closed]

Submitted by 风格不统一 on 2019-12-19 04:25:10
Question: As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance. Closed 6 years ago. I have recently started learning about PyTables and found it very interesting. My question is: What are the basic advantages of

How should I use the h5py library for storing time series data?

Submitted by 假装没事ソ on 2019-12-17 17:08:12
Question: I have some time series data that I previously stored as hdf5 files using pytables. I recently tried storing the same with the h5py lib. However, since all elements of a numpy array have to be of the same dtype, I have to convert the date (which is usually the index) to 'float64' before storing it with h5py. When I use pytables, the index and its dtype are preserved, which makes it possible for me to query the time series without pulling it all into memory. I guess with
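Since h5py has no native datetime dtype, one common workaround is to store the index as int64 nanoseconds since the epoch and record the encoding in an attribute, then rebuild the `DatetimeIndex` on read. A minimal sketch, assuming h5py and pandas are available (the file and dataset names are illustrative):

```python
import numpy as np
import h5py
import pandas as pd

# A small example series with a DatetimeIndex.
idx = pd.date_range("2019-01-01", periods=5, freq="D")
values = np.arange(5.0)

with h5py.File("ts.h5", "w") as f:
    # Store the index as int64 nanoseconds since the epoch; an attribute
    # records how to decode it later.
    dset = f.create_dataset("index", data=idx.asi8)
    dset.attrs["encoding"] = "datetime64[ns]"
    f.create_dataset("values", data=values)

with h5py.File("ts.h5", "r") as f:
    # pd.DatetimeIndex interprets raw int64 values as nanoseconds.
    restored = pd.DatetimeIndex(f["index"][:])
```

This preserves full nanosecond precision, unlike a lossy round-trip through float64.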

How to get faster code than numpy.dot for matrix multiplication?

Submitted by 回眸只為那壹抹淺笑 on 2019-12-17 10:37:45
Question: Here Matrix multiplication using hdf5 I use hdf5 (pytables) for big matrix multiplication, but I was surprised that using hdf5 it works even faster than using plain numpy.dot with the matrices stored in RAM. What is the reason for this behavior? And maybe there is some faster function for matrix multiplication in Python, because I still use numpy.dot for small block matrix multiplication. Here is some code: Assume the matrices can fit in RAM: test on a matrix of 10*1000 x 1000. Using default numpy (I think
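The speedup usually comes from the blocking itself: multiplying one row block at a time keeps the working set in CPU cache, which can beat a single monolithic `numpy.dot` call on large operands. A minimal sketch of block-wise multiplication (block size and shapes are illustrative, not from the question):

```python
import numpy as np

def blocked_dot(a, b, block=256):
    """Compute a @ b one row block at a time, as one would when the
    row blocks are read from HDF5 chunks instead of living in RAM."""
    out = np.empty((a.shape[0], b.shape[1]), dtype=np.result_type(a, b))
    for i in range(0, a.shape[0], block):
        out[i:i + block] = np.dot(a[i:i + block], b)
    return out

a = np.random.rand(1000, 500)
b = np.random.rand(500, 300)
result = blocked_dot(a, b)
```

The blocked result is numerically identical (up to floating-point tolerance) to a plain `np.dot(a, b)`.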

Python: how to store a numpy multidimensional array in PyTables?

Submitted by 折月煮酒 on 2019-12-17 08:53:15
Question: How can I put a numpy multidimensional array in an HDF5 file using PyTables? From what I can tell, I can't put an array field in a pytables table. I also need to store some info about this array and be able to do mathematical computations on it. Any suggestions? Answer 1: There may be a simpler way, but this is how you'd go about doing it, as far as I know: import numpy as np import tables # Generate some data x = np.random.random((100,100,100)) # Store "x" in a chunked array... f = tables.open_file

Pandas HDF5 Select with Where on non natural-named columns

Submitted by 佐手、 on 2019-12-14 03:52:58
Question: In my continuing spree of exotic pandas/HDF5 issues, I encountered the following: I have a series of non-naturally named columns (nb: for a good reason, with negative numbers being "system" ids etc.), which normally doesn't cause an issue: fact_hdf.select('store_0_0', columns=['o', 'a-6', 'm-13']) However, my select statement does fall over once a where clause is added: >>> fact_hdf.select('store_0_0', columns=['o', 'a-6', 'm-13'], where=[('a-6', '=', [0, 25, 28])]) blablabla File "/srv/www/li/venv/local/lib
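One workaround, since `where=` expressions require column names that are valid Python identifiers: read the needed columns without a `where` clause and filter in pandas afterwards. A sketch with hypothetical data mirroring the question's layout (the store key and values are illustrative):

```python
import pandas as pd

# Columns whose names ('a-6', 'm-13') are not valid Python identifiers.
df = pd.DataFrame({"o": [1, 2, 3], "a-6": [0, 25, 99], "m-13": [7, 8, 9]})
df.to_hdf("store.h5", key="store_0_0", format="table", mode="w")

# Column selection works, but a where= clause on 'a-6' does not, so
# filter in pandas after reading only the columns that are needed.
sub = pd.read_hdf("store.h5", "store_0_0", columns=["o", "a-6"])
sub = sub[sub["a-6"].isin([0, 25, 28])]
```

This pulls the selected columns into memory, so it trades the on-disk filtering of `where=` for compatibility with arbitrary column names.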

Store pandas DataFrame in PyTables table without storing index

Submitted by 妖精的绣舞 on 2019-12-13 18:41:40
Question: In many DataFrame.to_foo functions I can specify that I don't want to write the index: >>> help(df.to_csv) Write DataFrame to a comma-separated values (csv) file Parameters ---------- ... index : boolean, default True Write row names (index) ... Does similar functionality exist for DataFrame.to_hdf? I would like to not store the index in the PyTables table. Answer 1: You could call out to h5py and interact with HDF5 directly. data = df.values with h5py.File('data.h5','w') as f: f.create_dataset(
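The answer's truncated h5py approach, fleshed out: write only `df.values`, so the index is simply never serialized. A sketch assuming h5py and pandas are available (the dataset name and the column-name attribute are illustrative choices, not part of the question):

```python
import h5py
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 3), columns=["a", "b", "c"])

with h5py.File("data.h5", "w") as f:
    # Only the values are written; the index never reaches the file.
    dset = f.create_dataset("data", data=df.values)
    # Column names can be kept as a dataset attribute if needed later.
    dset.attrs["columns"] = np.array(df.columns, dtype="S")

with h5py.File("data.h5", "r") as f:
    values = f["data"][:]
```

Note this only works when all columns share one dtype, since `df.values` produces a single homogeneous array.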

How can the shape of a pytables table column be defined by a variable?

Submitted by  ̄綄美尐妖づ on 2019-12-13 08:46:37
Question: I'm trying to create an IsDescription subclass so that I can define the structure of a table I'm trying to create. One of the attributes of the subclass needs a shape of a certain length that is unknown until runtime (it depends on a file being parsed) but is fixed once known. Sample code: import tables class MyClass(tables.IsDescription): def __init__(self, param): var1 = tables.Float64Col(shape=(param)) MyClass1 = MyClass(12) Which returns: TypeError: object.__new__() takes no
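`IsDescription` subclasses are not meant to be instantiated, which is why the `__init__` approach fails. One way around it: PyTables also accepts a plain dict as the table description, and a dict can be built at runtime with whatever shape the parsed file dictates. A sketch, assuming PyTables is installed (names are taken from the question's sample code):

```python
import numpy as np
import tables

def make_description(param):
    # The column shape is fixed per table, but the description itself
    # can be assembled at runtime as a dict instead of a class.
    return {"var1": tables.Float64Col(shape=(param,))}

with tables.open_file("desc.h5", "w") as f:
    table = f.create_table(f.root, "t", make_description(12))
    row = table.row
    row["var1"] = np.arange(12.0)
    row.append()
    table.flush()
    stored = table[0]["var1"]
```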

more efficient solution for QTableWidget write

Submitted by 守給你的承諾、 on 2019-12-13 04:32:43
Question: I am reading a PyTable with 1,320,000 rows x 16 cols. The idea is to read the table and write its content into a QTableWidget. The way I am doing it makes the GUI freeze. I would like a clue about how to do it in an efficient way. Here is my code: # The PyTable is already opened and the reference to the desired table acquired self.ui.tableWidget.setRowCount(tab.nrows) self.ui.tableWidget.setColumnCount(len(tab.colnames)) self.ui.tableWidget.setHorizontalHeaderLabels(tab.colnames) res = []
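Populating 1.3M QTableWidget cells eagerly is the bottleneck; the usual fix is a lazy `QAbstractTableModel` whose `data()` method fetches rows on demand, combined with reading the PyTable in blocks rather than row by row. The block-reading half can be sketched without Qt, using a numpy structured array as a stand-in for the PyTables table (with a real table the slice would be `table.read(start, stop)`):

```python
import numpy as np

# Stand-in for the PyTables table: a structured array with named columns.
tab = np.zeros(10_000, dtype=[("a", "f8"), ("b", "i4")])

def iter_chunks(table, chunk=2_000):
    """Yield row blocks instead of single rows; this avoids millions of
    Python-level iterations and lets the GUI process events between
    blocks (e.g. via QApplication.processEvents or a worker thread)."""
    for start in range(0, len(table), chunk):
        yield table[start:start + chunk]

n_chunks = sum(1 for _ in iter_chunks(tab))
```

With a lazy model attached to a `QTableView`, only the visible rows are ever materialized, so the 1.3M-row table never has to be copied into widget items at all.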