NumPy equivalent of merge

后端未结

关注

 1  2013

I\'m transitioning some stuff from R to Python and am curious about merging efficiently. I\'ve found some stuff on concatenate in NumPy (using NumPy for operati

相关标签:

1条回答

情书的邮戳

2020-12-21 14:40

Here's one NumPy based solution using masking -

def numpy_merge_bycol0(d1, d2):
    # Mask of matches in d1 against d2
    d1mask = np.isin(d1[:,0], d2[:,0])

    # Mask of matches in d2 against d1
    d2mask = np.isin(d2[:,0], d1[:,0])

    # Mask respective arrays and concatenate for final o/p
    return np.c_[d1[d1mask], d2[d2mask,1:]]

Sample run -

In [43]: d1
Out[43]: 
array([['1a2', '0'],
       ['2dd', '0'],
       ['z83', '1'],
       ['fz3', '0']], dtype='|S3')

In [44]: d2
Out[44]: 
array([['1a2', '33.3', '22.2'],
       ['43m', '66.6', '66.6'],
       ['z83', '12.2', '22.1']], dtype='|S4')

In [45]: numpy_merge_bycol0(d1, d2)
Out[45]: 
array([['1a2', '0', '33.3', '22.2'],
       ['z83', '1', '12.2', '22.1']], dtype='|S4')

We could also use broadcasting to get the indices and then integer-indexing in place of masking, like so -

idx = np.argwhere(d1[:,0,None] == d2[:,0])
out = np.c_[d1[idx[:,0]], d2[idx[:,0,1:]

0 讨论(0)