Find and replace multiple values in Python

前端未结


I want to find and replace multiple values in a 1D array / list with new ones.

For example, for a list

a=[2, 3, 2, 5, 4, 4, 1, 2]

I woul


6 Answers

刺人心

2021-02-15 04:38
The numpy_indexed package (disclaimer: I am its author) provides an elegant and efficient vectorized solution to this type of problem:
```
import numpy_indexed as npi
remapped_a = npi.remap(a, val_old, val_new)
```
The method is based on searchsorted, like swenzel's answer, and should perform similarly well, but it is more general. For instance, the items of the array do not need to be ints; they can be of any type, even nd-subarrays.

If all values in 'a' are expected to be present in 'val_old', you can set the optional 'missing' kwarg to 'raise' (default is 'ignore'). Performance will be slightly better, and you will get a KeyError if that assumption is not satisfied.
闹比i

2021-02-15 04:43
To replace values in a list using two other lists as key:value pairs, there are several approaches. All of them use list comprehensions.

Using list.index():
```
a=[2, 3, 2, 5, 4, 4, 1, 2]
val_old=[1, 2, 3, 4, 5] 
val_new=[2, 3, 4, 5, 1]
a_new=[val_new[val_old.index(x)] for x in a]
```
Using your special case:
```
a=[2, 3, 2, 5, 4, 4, 1, 2]
a_new=[x % 5 + 1 for x in a]
```
死守一世寂寞

2021-02-15 04:47
Try this for your expected output; it works even if some elements of a are not in val_old.
```
>>> [val_new[val_old.index(i)] if i in val_old else i for i in a]
[3, 4, 3, 1, 5, 5, 2, 3]
```

攒了一身酷

2021-02-15 04:50

Assuming that your val_old array is sorted (which is the case here; if it later isn't, remember to sort val_new along with it!), you can use numpy.searchsorted and then index val_new with the results.
This does not work if a number has no mapping; in that case you will have to provide one-to-one mappings for such numbers.

```
In [1]: import numpy as np

In [2]: a = np.array([2, 3, 2, 5, 4, 4, 1, 2])

In [3]: old_val = np.array([1, 2, 3, 4, 5])

In [4]: new_val = np.array([2, 3, 4, 5, 1])

In [5]: a_new = np.array([3, 4, 3, 1, 5, 5, 2, 3])

In [6]: i = np.searchsorted(old_val,a)

In [7]: a_replaced = new_val[i]

In [8]: all(a_replaced == a_new)
Out[8]: True
```
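
Both caveats above (a sorted key array, and every value having a mapping) can be lifted with a little extra bookkeeping. The following is a minimal sketch and not part of the original answer: the function name `remap_searchsorted` is hypothetical, and it assumes the key array is non-empty. It sorts the keys with `np.argsort`, carries the replacement values along, and leaves values without a mapping unchanged.

```python
import numpy as np

def remap_searchsorted(a, old_val, new_val):
    """Replace each old_val[k] found in a with new_val[k]; keep other values."""
    a = np.asarray(a)
    old_val = np.asarray(old_val)
    new_val = np.asarray(new_val)

    # Sort the keys and reorder the replacement values the same way.
    order = np.argsort(old_val)
    old_sorted = old_val[order]
    new_sorted = new_val[order]

    # Insertion position of each element of a within the sorted keys.
    i = np.searchsorted(old_sorted, a)
    # A position equal to len(old_sorted) is out of range; clip it so the
    # lookups below are safe (assumes old_val is non-empty).
    i = np.clip(i, 0, len(old_sorted) - 1)

    # An element has a mapping only if it equals the key at its position.
    found = old_sorted[i] == a
    return np.where(found, new_sorted[i], a)

a = np.array([2, 3, 2, 5, 4, 4, 1, 2, 7])
# Keys deliberately unsorted; 7 has no mapping and is kept as-is.
print(remap_searchsorted(a, [5, 1, 2, 3, 4], [1, 2, 3, 4, 5]))
# [3 4 3 1 5 5 2 3 7]
```

The extra equality check costs one more pass over the array, which is usually small compared with the search itself.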

50k numbers? No problem!

```
In [22]: import time

In [23]: def timed():
    t0 = time.time()
    i = np.searchsorted(old_val, a)
    a_replaced = new_val[i]
    t1 = time.time()
    print('%s Seconds'%(t1-t0))

In [24]: a = np.random.choice(old_val, 50000)

In [25]: timed()
0.00288081169128 Seconds
```

500k? You won't notice the difference!

```
In [26]: a = np.random.choice(old_val, 500000)

In [27]: timed()
0.019248008728 Seconds
```


萌比男神i

2021-02-15 04:52
In vanilla Python, without the speed of numpy or pandas, this is one way:
```
a = [2, 3, 2, 5, 4, 4, 1, 2]
val_old = [1, 2, 3, 4, 5]
val_new = [2, 3, 4, 5, 1]
expected_a_new = [3, 4, 3, 1, 5, 5, 2, 3]
d = dict(zip(val_old, val_new))
a_new = [d.get(e, e) for e in a]
print(a_new)  # [3, 4, 3, 1, 5, 5, 2, 3]
print(a_new == expected_a_new)  # True
```
The average time complexity for this algorithm is O(M + N) where M is the length of your "translation list" and N is the length of list a.
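
To see why the dict matters, here is a rough comparison sketch against the `list.index()` approach, which rescans the translation list for every element and is therefore O(M * N). The data sizes and the helper names `with_index` and `with_dict` are assumptions chosen for illustration, and the timings will vary by machine, so no figures are quoted.

```python
import random
import timeit

# Hypothetical test data: 500 mappings, 5000 values to translate.
val_old = list(range(500))
val_new = [v + 1 for v in val_old]
random.seed(0)
a = [random.choice(val_old) for _ in range(5000)]

def with_index():
    # list.index scans val_old for every element of a: O(M * N) overall.
    return [val_new[val_old.index(x)] for x in a]

def with_dict():
    # Build the dict once (O(M)), then one average O(1) lookup per element (O(N)).
    d = dict(zip(val_old, val_new))
    return [d.get(x, x) for x in a]

# Both approaches must agree before their timings are worth comparing.
assert with_index() == with_dict()

print("list.index:", timeit.timeit(with_index, number=3))
print("dict:      ", timeit.timeit(with_dict, number=3))
```

The gap should widen as the translation list grows, since only the `list.index()` version pays for M on every element.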

一整个雨季

2021-02-15 04:55

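```
>>> # a, val_old and val_new are NumPy arrays of non-negative ints
>>> arr = np.empty(a.max() + 1, dtype=val_new.dtype)  # lookup table indexed by old value
>>> arr[val_old] = val_new  # store each replacement at the position of its old value
>>> arr[a]  # look up every element of a at once
array([3, 4, 3, 1, 5, 5, 2, 3])
```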
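
The snippet above builds a lookup table indexed by the old values. Because `np.empty` leaves unassigned slots uninitialized, any value of `a` without a mapping would come back as garbage. A minimal variation, sketched here under the assumption that all values are small non-negative integers, starts from an identity table so that unmapped values translate to themselves; the names `table` and `result` are chosen for illustration only.

```python
import numpy as np

a = np.array([2, 3, 2, 5, 4, 4, 1, 2, 7])
val_old = np.array([1, 2, 3, 4, 5])
val_new = np.array([2, 3, 4, 5, 1])

# Identity table: position v holds v, so unmapped values map to themselves.
# It must be large enough to be indexed by every value in a and in val_old.
table = np.arange(max(a.max(), val_old.max()) + 1)
table[val_old] = val_new  # overwrite only the entries that have a mapping
result = table[a]         # integer-array indexing does all lookups at once
print(result)
# [3 4 3 1 5 5 2 3 7]
```

The table has one slot per possible value up to the maximum, so this only pays off when the values are reasonably small; for large or sparse values a searchsorted-based or dict-based approach is likely the better fit.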
