How to de-reference a list of external links using pytables?

六眼飞鱼酱① submitted on 2020-01-06 05:45:04

Question


I have created external links leading from one HDF5 file to another using PyTables. My question is: how do I de-reference them in a loop?

For example:

Let's assume file_name = "collection.h5" is the file where the external links are stored.

I created the external links under the root node, and when I traverse the nodes under the root, I get the following output:

/link1 (ExternalLink) -> /files/data1.h5:/weights/Image
/link2 (ExternalLink) -> /files/data2.h5:/weights/Image

and so on.

I know that a link can be de-referenced using natural naming, like this:

from tables import open_file

f = open_file('collection.h5', mode='r')
plink1 = f.root.link1()
plink2 = f.root.link2()

But I want to do this in a for-loop. Any help with this?


Answer 1:


This is a more complete (more robust, but more complicated) answer that handles the general case where an ExternalLink can sit at any group level. It is similar to the simpler iter_nodes() answer on this page, but it uses walk_nodes() because the example collection file has 3 groups at the root level, and it adds a test for ExternalLink objects (see isinstance()). It also shows how to use the _v_children attribute to get a dictionary of nodes. (I couldn't get list_nodes() to work with an ExternalLink.)

import tables as tb
import glob

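# Build collection.h5: one group per prefix, one external link per matching file
h5f = tb.open_file('collection.h5', mode='w')
link_cnt = 0
pre_list = ['SO_53', 'SO_54', 'SO_55']
for h5f_pre in pre_list:
    h5f_pre_grp = h5f.create_group('/', h5f_pre)
    for h5name in glob.glob('./' + h5f_pre + '*.h5'):
        link_cnt += 1
        # the target has the form 'filename:/node_path'; ':/' points at the linked file's root group
        h5f.create_external_link(h5f_pre_grp, 'link_%02d' % link_cnt, h5name + ':/')
h5f.close()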

h5f = tb.open_file('collection.h5', mode='r')
# walk_nodes() is recursive, so links inside the groups are visited too
for link_node in h5f.walk_nodes('/'):
    if isinstance(link_node, tb.link.ExternalLink):
        print('\nFor Node %s:' % (link_node._v_pathname))
        print("``%s`` is an external link to: ``%s``" % (link_node, link_node.target))
        plink = link_node(mode='r')  # dereference: returns the target node (here, the root group of the linked file)
        linked_nodes = plink._v_children  # dictionary of the nodes under that group
        print(linked_nodes)

h5f.close()
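
The link_node.target string printed by the loop has the form filename:/node_path, as in the listing from the question. If you only need to know where each link points, without opening the linked files, a small helper can split that string. This is a minimal sketch; split_target is a hypothetical helper name, not part of PyTables, and it assumes the node path itself does not contain ':/'.

```python
def split_target(target):
    """Split an external-link target 'filename:/node_path' into its two parts.

    Hypothetical helper (not a PyTables function). The last ':/' is used as
    the separator so that a file name such as 'C:/data/x.h5' stays intact.
    """
    filename, sep, node_path = target.rpartition(":/")
    if not sep or not filename:
        raise ValueError("not an external-link target: %r" % (target,))
    # Restore the leading slash that the separator consumed.
    return filename, "/" + node_path


if __name__ == "__main__":
    for target in ("/files/data1.h5:/weights/Image", "./SO_53.h5:/"):
        print(split_target(target))
```

Inside the loop this could be called as split_target(link_node.target) to collect the file names before deciding which links to open.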



Answer 2:


You can use iter_nodes() or walk_nodes(); walk_nodes() is recursive, iter_nodes() is not. An example of iter_nodes() is explained in my answer to this SO topic: cannot-retrieve-datasets-in-pytables-using-natural-naming. I discovered that you can't use get_node() to reference an ExternalLink; you need to reference it differently.

Here's a simple example that creates collection.h5 from a list of HDF5 files in my local folder, then uses iter_nodes() in a for loop. Note that this is a very basic example: it does not check the node's object type (Group, Leaf, or ExternalLink). It assumes each node at the root level is an ExternalLink and de-references it by calling the node, which opens the linked file and returns the target node. There are additional PyTables methods and attributes to check for these situations. See the more detailed walk_nodes() answer on this page for a more robust (and more complicated) method.

import tables as tb
import glob

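# Build collection.h5 with one external link per matching file at the root level
h5f = tb.open_file('collection.h5', mode='w')
link_cnt = 0
for h5name in glob.glob('./SO*.h5'):
    link_cnt += 1
    h5f.create_external_link('/', 'link' + str(link_cnt), h5name + ':/')
h5f.close()

h5f = tb.open_file('collection.h5', mode='r')
# iter_nodes() is not recursive: it only visits the nodes directly under '/'
for link_node in h5f.iter_nodes('/'):
    print("``%s`` is an external link to: ``%s``" % (link_node, link_node.target))
    plink = link_node(mode='r')  # dereference: returns the target node (here, the root group of the linked file)

h5f.close()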


Source: https://stackoverflow.com/questions/55391339/how-to-de-reference-a-list-of-external-links-using-pytables
