import pandas as pd

df = pd.DataFrame({'name': ['A(上海)AAA', 'BB(上海)BB', 'CCC', 'DDD']})
print(df)
# Output:
name
0 A(上海)AAA
1 BB(上海)BB
2 CCC
3 DDD
a = ['A(上海)AAA', 'BB(上海)BB', 'CCC']
dd = df[df.name.str.contains('|'.join(a))]
print(dd)
# Output:
UserWarning: This pattern has match groups. To actually get the groups, use str.extract.
return func(self, *args, **kwargs)
name
2 CCC
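Analysis:
str.contains() treats its pattern as a regular expression by default, so the parentheses are read as a capture group instead of literal characters. The pattern A(上海)AAA therefore matches the text A上海AAA, not A(上海)AAA. That is why only the CCC row is returned, and why pandas warns that the pattern has match groups.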
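The group behaviour can be checked without pandas. The following is a minimal sketch using Python's standard re module; the variable names pattern, literal_hit and group_hit are only illustrative.

```python
import re

# Unescaped pattern: the parentheses form a capture group around 上海.
pattern = 'A(上海)AAA'

# The text with literal parentheses does not match, because the regex
# expects 上海 to follow "A" directly.
literal_hit = re.search(pattern, 'A(上海)AAA')

# The text without parentheses does match, and the group captures 上海.
group_hit = re.search(pattern, 'A上海AAA')

print(literal_hit is None)
print(group_hit.group(1))
```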
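Solution:
a = [r'A\(上海\)AAA', r'BB\(上海\)BB', 'CCC']  # escape the parentheses with "\"; raw strings keep the backslash as-is
# or
a = ['A.上海.AAA', 'BB.上海.BB', 'CCC']  # "." matches any single character, so this is looser than escaping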
dd = df[df.name.str.contains('|'.join(a))]
print(dd)
# Output:
name # no UserWarning this time
0 A(上海)AAA
1 BB(上海)BB
2 CCC
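When the list of names comes from data rather than being typed by hand, escaping each item manually is error-prone. A minimal sketch of one alternative, assuming the goal is still a substring match: re.escape() from the standard library escapes every regex metacharacter, and the regex=False parameter of str.contains() treats a single pattern as plain text. The names pattern and single are only illustrative.

```python
import re

import pandas as pd

df = pd.DataFrame({'name': ['A(上海)AAA', 'BB(上海)BB', 'CCC', 'DDD']})
a = ['A(上海)AAA', 'BB(上海)BB', 'CCC']

# Escape each item so that "(" and ")" are matched literally,
# then join the escaped items with "|" as before.
pattern = '|'.join(re.escape(s) for s in a)
dd = df[df.name.str.contains(pattern)]
print(dd)

# For a single literal substring, regex=False skips regex parsing entirely.
# It cannot be combined with "|", which would then be literal text too.
single = df[df.name.str.contains('(上海)', regex=False)]
print(single)
```

Escaping keeps the match exact, whereas the "." variant would also accept any other character in place of each parenthesis.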
Source: https://blog.csdn.net/htuhxf/article/details/100980448