Trying to count words in a string

I'm trying to analyze the contents of a string. If a word has punctuation mixed in, I want to replace the punctuation with spaces.

For example, If Johnny.Appleseed!is:a*good

7 Answers

情话喂你

2021-02-06 03:48

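def count_words(strings):  # hypothetical wrapper so that `return` is valid
    for ltr in ('!', '.', ...):  # placeholder: insert rest of punctuation
        strings = strings.replace(ltr, ' ')  # reassign so each replacement accumulates
    return len(strings.split())  # split() with no argument ignores runs of spaces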
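
For a version that runs as-is, one option is to let `string.punctuation` stand in for the elided list of punctuation characters. The following is only a sketch under that assumption: `count_words_punct` is a hypothetical name, and `str.translate` is swapped in for the explicit loop above.

```python
import string

def count_words_punct(text):
    # Assumption: "punctuation" means the ASCII set in string.punctuation.
    # Build a table that maps every punctuation character to a space.
    table = str.maketrans(string.punctuation, ' ' * len(string.punctuation))
    # split() with no argument collapses runs of whitespace,
    # so adjacent punctuation does not produce empty "words".
    return len(text.translate(table).split())

print(count_words_punct("If Johnny.Appleseed!is:a*good&farmer"))  # 7
```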

遥遥无期

2021-02-06 03:54

I know this is an old question, but how about this?

string = "If Johnny.Appleseed!is:a*good&farmer"

a = ["*",":",".","!",",","&"," "]
new_string = ""

for i in string:
   if i not in a:
      new_string += i
   else:
      new_string = new_string  + " "

print(len(new_string.split()))  # split() with no argument skips empty strings from adjacent separators

醉话见心

2021-02-06 03:54

Simple loop based solution:

strs = "Johnny.Appleseed!is:a*good&farmer"
lis = []
for c in strs:
    if c.isalnum() or c.isspace():
        lis.append(c)
    else:
        lis.append(' ')

new_strs = "".join(lis)
print(new_strs)          # prints: Johnny Appleseed is a good farmer
print(new_strs.split())  # prints: ['Johnny', 'Appleseed', 'is', 'a', 'good', 'farmer']

Better solution:

Using regex:

>>> import re
>>> from string import punctuation
>>> strs = "Johnny.Appleseed!is:a*good&farmer"
>>> r = re.compile(r'[{}]'.format(re.escape(punctuation)))  # escape so ], \ and ^ are taken literally
>>> new_strs = r.sub(' ',strs)
>>> len(new_strs.split())
6
#using `re.split`:
>>> strs = "Johnny.Appleseed!is:a*good&farmer"
>>> re.split(r'[^0-9A-Za-z]+',strs)
['Johnny', 'Appleseed', 'is', 'a', 'good', 'farmer']
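
One caveat with `re.split`: when the string starts or ends with a separator, the result contains empty strings, which would inflate a word count. A minimal sketch of filtering them out (`split_words` is a hypothetical helper name, not part of the answer above):

```python
import re

def split_words(text):
    # re.split yields '' at either end when the text starts or ends
    # with a separator, so drop empty pieces before counting.
    return [w for w in re.split(r'[^0-9A-Za-z]+', text) if w]

print(re.split(r'[^0-9A-Za-z]+', "Johnny.Appleseed!"))  # ['Johnny', 'Appleseed', '']
print(split_words("Johnny.Appleseed!"))                 # ['Johnny', 'Appleseed']
```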

逝去的感伤

2021-02-06 03:56
How about using `Counter` from `collections`?
```
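import re
from collections import Counter

string = "If Johnny.Appleseed!is:a*good&farmer"  # sample input
words = re.findall(r'\w+', string)  # \w+ matches runs of letters, digits, and underscores
print(Counter(words))  # maps each word to its number of occurrences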
```
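
Since the question asks for a word count rather than per-word frequencies, the total can be read off the `Counter` as well. A small sketch, where the sample sentence is a made-up input chosen so that some words repeat:

```python
import re
from collections import Counter

text = "a good farmer is a good neighbor"  # hypothetical sample input
counts = Counter(re.findall(r'\w+', text))

print(sum(counts.values()))  # total number of words: 7
print(len(counts))           # number of distinct words: 5
print(counts['good'])        # occurrences of one word: 2
```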
伪装坚强ぢ

2021-02-06 04:01
Try this: it extracts the words with `re.findall`, then builds a dictionary that maps each word to its number of appearances.
```
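import re
string = "If Johnny.Appleseed!is:a*good&farmer"  # sample input
word_list = re.findall(r"[\w']+", string)  # the ' in the class keeps contractions in one piece
print({word: word_list.count(word) for word in word_list})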
```
长情又很酷

2021-02-06 04:05
Here's a one-line solution that doesn't require importing any libraries.
It replaces non-alphanumeric characters (like punctuation) with spaces, and then splits the string.

Inspired by "Python strings split with multiple separators"
```
>>> s = 'Johnny.Appleseed!is:a*good&farmer'
>>> words = ''.join(c if c.isalnum() else ' ' for c in s).split()
>>> words
['Johnny', 'Appleseed', 'is', 'a', 'good', 'farmer']
>>> len(words)
6
```
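
One thing to watch with the `isalnum` approach: an apostrophe counts as punctuation, so a contraction is split into two words. If that matters, a possible variation, still without imports, is sketched below. `split_words` and its `keep` parameter are hypothetical names introduced here for illustration.

```python
def split_words(s, keep="'"):
    # Treat alphanumerics, plus any character in `keep`, as part of a word;
    # everything else becomes a space before splitting.
    return ''.join(c if c.isalnum() or c in keep else ' ' for c in s).split()

print(split_words("Johnny.doesn't!care"))           # ['Johnny', "doesn't", 'care']
print(split_words("Johnny.doesn't!care", keep=""))  # ['Johnny', 'doesn', 't', 'care']
```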