Question
df=pd.DataFrame({"A":["one","two","three"],"B":["fopur","give","six"]})
When I run:
df.B.str.contains("six").any()
Out[2]: True
But when I run:
df.B.str.contains("six)").any()
I get the error below:
C:\ProgramData\Anaconda3\lib\sre_parse.py in parse(str, flags, pattern)
868 if source.next is not None:
869 assert source.next == ")"
--> 870 raise source.error("unbalanced parenthesis")
871
872 if flags & SRE_FLAG_DEBUG:
error: unbalanced parenthesis at position 3
Please help!
Answer 1:
You can set regex=False in pandas.Series.str.contains, so the pattern is treated as a literal string rather than a regular expression:
df.B.str.contains("six)", regex=False).any()
If you want to match irrespective of case, also pass case=False:
df.B.str.contains("Six)", case=False, regex=False).any()
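Out[]: True (when column B contains the literal text "six)"; with the DataFrame from the question, which only contains "six", the result is False)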
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.contains.html
Info:
Parentheses are special characters in regular expressions (they delimit groups), so they need to be "escaped" to be matched literally.
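The failure can be reproduced without pandas, since with the default regex=True the pattern is handed to Python's re module. Below is a minimal sketch using only the standard library; the variable names and the sample sentence are illustrative and not part of the question.

```python
import re

pattern = "six)"

# Compiling the raw text as a regex fails: ")" closes a group that was never opened.
try:
    re.compile(pattern)
    compiled_ok = True
except re.error as exc:
    compiled_ok = False
    print("regex error:", exc)

# A plain substring test treats every character literally, so nothing needs escaping.
literal_hit = pattern in "five, six) and seven"
print(compiled_ok, literal_hit)
```

The second check is roughly what a literal (non-regex) search amounts to, which is why regex=False avoids the error.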
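Answer 2:
You need to escape ) with a backslash (\), because it is a special regex character. Use a raw string so that Python does not treat \) as an invalid string escape:
df.B.str.contains(r"six\)").any()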
More generally, let re.escape do the escaping for you:
import re
df.B.str.contains(re.escape("six)")).any()
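To compare the approaches from both answers side by side, here is a small sketch. It assumes sample data in which one value really contains "six)", unlike the frame in the question, so that a literal match can succeed; the Series s is hypothetical.

```python
import re
import pandas as pd

# Assumed sample data: the last value contains the closing parenthesis.
s = pd.Series(["fopur", "give", "six)"])

# 1. Treat the pattern as plain text.
literal = s.str.contains("six)", regex=False).any()

# 2. Escape the parenthesis by hand (raw string keeps the backslash intact).
escaped_by_hand = s.str.contains(r"six\)").any()

# 3. Let re.escape add the backslashes.
escaped_by_re = s.str.contains(re.escape("six)")).any()

print(literal, escaped_by_hand, escaped_by_re)
```

All three calls search for the same literal text. regex=False is the simplest choice when no regex features are needed, while re.escape is handy when a literal fragment has to be embedded in a larger pattern.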
Source: https://stackoverflow.com/questions/48699907/error-unbalanced-parenthesis-while-checking-if-an-item-presents-in-a-pandas-d