argsort on a PyTables array

Submitted by 二次信任 on 2019-12-12 06:12:14

Question


I have a problem with NumPy's argsort. It creates an in-memory int64 array as long as the input array. Since I'm working with very large arrays, this exhausts the available memory.

I tested NumPy's argsort with a small PyTables carray and it gives the correct output. Now, what I want is for the sorting algorithm to work with a PyTables array directly. Is there a way to do this with standard NumPy calls or a simple hack into the NumPy internals?

I'm also open to non-NumPy alternatives - I just want to get the job done!
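The memory cost described in the question can be made concrete with a small sketch. The array size and the `values` and `order` names are illustrative assumptions; the point is only that `np.argsort` returns a full-length index array of NumPy's `intp` type (int64 on typical 64-bit builds), whatever the dtype of the input.

```python
import numpy as np

# Illustrative size only; the arrays in the question are far larger.
n = 100_000
values = np.random.randint(0, 99999, size=n).astype(np.int32)

# argsort allocates one index per input element, held entirely in memory.
order = np.argsort(values)

print(order.dtype, order.nbytes, "bytes for", n, "elements")
```

For a 64-bit index type this is 8 bytes per element, so the index array alone can exceed the size of a narrower input array.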


Answer 1:


Since you are working with PyTables, I suggest you use the Table class, which has sorting built in.

import numpy as np
import tables

# Describe the table: a single 64-bit integer column
class Table_Description(tables.IsDescription):
    column_name = tables.Int64Col()

# Create the HDF5 file and the table
f = tables.open_file('test.h5', mode="w")
a = f.create_table("/", "my_table", description=Table_Description)

# Fill the table with 10000 random integers
rows = np.zeros(10000, dtype=[('column_name', np.int64)])
rows['column_name'] = np.random.randint(0, 99999, size=10000)
a.append(rows)

# Create a full index (built on disk if you pass the tmp_dir parameter)
a.cols.column_name.create_index(9, kind='full', tmp_dir="/tmp/")

# Write changes to disk
a.flush()

# Read the indices that will sort the table
ind = f.root.my_table.cols.column_name.index
ind.read_indices()
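Calling read_indices() with no arguments still returns every index at once, which brings back the memory problem for very large tables. A minimal sketch of reading the sorting indices in slices instead, assuming the same table layout as above. The helper names build_indexed_table and iter_sort_indices, the chunk size, and the temporary file location are hypothetical choices for illustration. It relies on read_indices accepting start and stop arguments.

```python
import os
import tempfile

import numpy as np
import tables


class Table_Description(tables.IsDescription):
    column_name = tables.Int64Col()


def build_indexed_table(path, values):
    """Hypothetical helper: store `values` in my_table and build a full index."""
    with tables.open_file(path, mode="w") as f:
        table = f.create_table("/", "my_table", description=Table_Description)
        rows = np.zeros(len(values), dtype=[("column_name", np.int64)])
        rows["column_name"] = values
        table.append(rows)
        # optlevel=9 with kind='full' requests a completely sorted index
        table.cols.column_name.create_index(optlevel=9, kind="full")
        table.flush()


def iter_sort_indices(path, chunk_rows=1000):
    """Hypothetical helper: yield the sorting indices one slice at a time."""
    with tables.open_file(path, mode="r") as f:
        table = f.root.my_table
        index = table.cols.column_name.index
        nrows = int(table.nrows)
        for start in range(0, nrows, chunk_rows):
            stop = min(start + chunk_rows, nrows)
            yield index.read_indices(start, stop)


# Small demonstration; a real workload would process each chunk separately
# rather than concatenating them back into one in-memory array.
values = np.random.randint(0, 99999, size=10000)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.h5")
    build_indexed_table(path, values)
    order = np.concatenate(list(iter_sort_indices(path))).astype(np.int64)

sorted_values = values[order]
```

The order of equal values may differ from what np.argsort returns, so the result is best checked by confirming that values[order] is non-decreasing rather than by comparing index arrays directly.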


Source: https://stackoverflow.com/questions/32312446/argsort-on-a-pytables-array
