Question
Suppose I have this dataset:
Name Value
0 K Ieatapple
1 Y bananaisdelicious
2 B orangelikesomething
3 Q bluegrape
4 C appleislike
and I have a keyword list like
[apple, banana]
I want to match the 'Value' column of this dataset against the keyword list.
*By "matching" I mean checking whether a keyword from the list appears in 'Value'.
I would like to see how well the keywords in the list match the column, so I want to find the match rate.
Ultimately, what I want is the match rate between the keywords and the column, as a percentage, and, if possible, the filtered DataFrame.
Thank you.
Edit
In my real dataset, the keywords are embedded in sentences,
e.g.
Ilikeapplethanbananaandorange
so one-to-one (keyword-to-keyword) matching doesn't work.
Answer 1:
Use str.contains to match words to your sentences:
keywords = ['apple', 'banana']
df['Value'].str.contains("|".join(keywords)).sum() / len(df)
# 0.6
Or if you want to keep the rows:
df[df['Value'].str.contains("|".join(keywords))]
Name Value
0 K Ieatapple
1 Y bananaisdelicious
4 C appleislike
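For anyone who wants to run the two snippets above end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand; the variable names `mask`, `match_rate`, `filtered`, and `per_keyword` are only illustrative, and the per-keyword breakdown is an extra that goes beyond the answer, on the assumption that a rate per keyword is also of interest.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["K", "Y", "B", "Q", "C"],
    "Value": ["Ieatapple", "bananaisdelicious", "orangelikesomething",
              "bluegrape", "appleislike"],
})
keywords = ["apple", "banana"]

# Boolean mask: True where 'Value' contains at least one keyword.
mask = df["Value"].str.contains("|".join(keywords))

# The mean of a boolean Series is the fraction of True values,
# which is the same as mask.sum() / len(df).
match_rate = mask.mean()

# Keep only the matching rows.
filtered = df[mask]

# Assumed extra: match rate for each keyword on its own.
# regex=False treats the keyword as a plain substring.
per_keyword = {
    kw: df["Value"].str.contains(kw, regex=False).mean()
    for kw in keywords
}

print(f"{match_rate:.0%}")
print(filtered)
print(per_keyword)
```

Multiply `match_rate` by 100, or format it with `:.0%` as shown, to get a percentage.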
More details
The pipe | is the "or" operator in regular expressions.
So we join our list of words with a pipe to match any of these words:
>>> keywords = ['apple', 'banana']
>>> "|".join(keywords)
'apple|banana'
So the regular expression now states:
match rows where the sentence contains "apple" OR "banana"
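One caveat with joining raw keywords: str.contains treats the pattern as a regular expression by default, so a keyword containing characters such as + or . would be interpreted as regex syntax. A hedged sketch of one way around this is to escape each keyword with re.escape before joining. The sample values and keywords below are made up for illustration and are not from the question; case=False and na=False are optional choices, assuming case-insensitive matching is wanted and missing values should count as non-matches.

```python
import re

import pandas as pd

# Hypothetical data, chosen to include a keyword with regex metacharacters.
values = pd.Series(["C++isfast", "Ccompiler", "ILIKEAPPLE"])
keywords = ["c++", "apple"]

# Escape each keyword so it is matched literally, then join with the OR pipe.
pattern = "|".join(re.escape(kw) for kw in keywords)

# case=False ignores letter case; na=False makes missing values count as no match.
mask = values.str.contains(pattern, case=False, na=False)

print(mask.tolist())
```

Without the escaping step, the unescaped pattern "c++|apple" is not a valid regular expression and the call would raise an error.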
Source: https://stackoverflow.com/questions/60032032/python-keyword-matchingkeyword-list-column