Question
I am trying to convert the following to a list comprehension, but I am struggling:
lorem_ipsum = """Lorem ipsum dolor sit amet, consectetur adipiscing elit."""
def word_count2(str):
    counts = dict()
    words = str.split()
    for word in words:
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts
print(word_count2(lorem_ipsum))
So far I have tried a few variations on this:
aString = lorem_ipsum
counts = dict()
words = aString.split
[counts[word] += 1 if word in counts else counts[word] = 1 for word in words]
Unfortunately, I have been at this for some hours now, and nothing I have tried seems to work.
Answer 1:
Warning! You are trying to use a side effect inside of a list comprehension:
[counts[word] += 1 if word in counts else counts[word] = 1 for word in words]
tries to update counts for every word. List comprehensions are not meant to be used like that.
The class collections.Counter is designed to solve your problem, and you can also use a dict comprehension that counts every element (see the other answers). However, that dict comprehension has O(n^2) complexity: for every element of the list, it scans the full list to count that element. If you want something functional, use a fold:
>>> lorem_ipsum = """Lorem ipsum dolor sit amet, consectetur adipiscing elit."""
>>> import functools
>>> functools.reduce(lambda d, w: {**d, w: d.get(w, 0)+1}, lorem_ipsum.split(), {})
{'Lorem': 1, 'ipsum': 1, 'dolor': 1, 'sit': 1, 'amet,': 1, 'consectetur': 1, 'adipiscing': 1, 'elit.': 1}
For every word w, we update the current dictionary: d[w] is replaced by d[w]+1 (or 0+1 if w was not in d).
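To see how the fold progresses, the intermediate dictionaries can be listed with itertools.accumulate. This is only an illustrative sketch: the short sample string and the name step are made up here, and the initial keyword assumes Python 3.8 or later.

```python
from itertools import accumulate

# Hypothetical short input, chosen so that one word repeats.
words = "a b a".split()

def step(d, w):
    # Same expression as the lambda in the fold: build a new dict
    # in which the count for w is one higher than before.
    return {**d, w: d.get(w, 0) + 1}

# accumulate yields the accumulator after every step, starting with {}.
for state in accumulate(words, step, initial={}):
    print(state)
# {}
# {'a': 1}
# {'a': 1, 'b': 1}
# {'a': 2, 'b': 1}
```

Each step copies the previous dictionary, so this form favors clarity over speed.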
That gives a hint on how you could have written your list comprehension:
>>> counts = {}
>>> [counts.update({word: counts.get(word, 0) + 1}) for word in lorem_ipsum.split()]
[None, None, None, None, None, None, None, None]
>>> counts
{'Lorem': 1, 'ipsum': 1, 'dolor': 1, 'sit': 1, 'amet,': 1, 'consectetur': 1, 'adipiscing': 1, 'elit.': 1}
As you can see, [None, None, None, None, None, None, None, None] is the real return value of the list comprehension. The dictionary counts was updated, but do not do this! Do not use a list comprehension unless you use the result.
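If the result of the comprehension is not needed, a plain loop says the same thing without building a throwaway list. A minimal sketch, assuming collections.defaultdict is acceptable here; the function name count_words and the sample sentence are made up for this example:

```python
from collections import defaultdict

def count_words(text):
    # Missing keys start at int() == 0, so no membership test is needed.
    counts = defaultdict(int)
    for word in text.split():
        counts[word] += 1
    return dict(counts)  # convert back to a plain dict for the caller

print(count_words("to be or not to be"))
# {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```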
Answer 2:
Comprehensions aren't the right tool for this job. A collections.Counter is:
>>> from collections import Counter
>>> counts = Counter(lorem_ipsum.split())
>>> print(counts)
Counter({'Lorem': 1, 'ipsum': 1, 'dolor': 1, 'sit': 1, 'amet,': 1, 'consectetur': 1, 'adipiscing': 1, 'elit.': 1})
>>> counts['Lorem']
1
>>> counts['foo']
0
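Since every word in the sample sentence occurs once, the counts above are not very interesting. A sketch with a made-up sentence that contains repeats shows Counter.most_common, which returns the highest counts first:

```python
from collections import Counter

# Hypothetical sentence with repeated words (not from the question).
sentence = "the cat and the hat and the bat"
counts = Counter(sentence.split())

print(counts["the"])           # 3
print(counts.most_common(2))   # [('the', 3), ('and', 2)]
print(counts["dog"])           # 0, missing keys do not raise KeyError
```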
Answer 3:
For this problem, you don't even need any list/dict comprehensions. Just use a collections.Counter.
from collections import Counter
counts = Counter(lorem_ipsum.split())
# >>> print(counts)
# Counter({'ipsum': 1, 'amet,': 1, 'sit': 1, 'elit.': 1, 'consectetur': 1, 'adipiscing': 1, 'dolor': 1, 'Lorem': 1})
If you really want to do it the old-fashioned way, you could do something like:
words = lorem_ipsum.split()
counts = { word: words.count(word) for word in words }
# >>> print(counts)
# {'ipsum': 1, 'amet,': 1, 'sit': 1, 'elit.': 1, 'consectetur': 1, 'adipiscing': 1, 'dolor': 1, 'Lorem': 1}
Also, don't use str as a variable name. It shadows the built-in str type, which makes the built-in unusable within that scope and can lead to hard-to-debug errors.
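A small sketch of how the shadowing can bite. The function names and the choice of text as a replacement parameter name are only illustrative:

```python
def describe_shadowed(str):
    # Inside this function, `str` is the argument (a string),
    # so calling it like the built-in type fails.
    try:
        return "words: " + str(len(str.split()))
    except TypeError as exc:
        return "TypeError: " + exc.args[0]

def describe(text):
    # With a different parameter name, the built-in str is available again.
    return "words: " + str(len(text.split()))

print(describe_shadowed("Lorem ipsum dolor"))  # TypeError: 'str' object is not callable
print(describe("Lorem ipsum dolor"))           # words: 3
```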
Answer 4:
What you're really asking for is a dictionary comprehension, not a list comprehension. They're similar, but the syntax is a little different:
# list comprehension
[foo for foo in stuff]
# dict comprehension
{key: val for key, val in some_tuple}
The trouble is, this won't work for the problem you're trying to solve.
Comprehensions work as either a map, where they make a new collection with each element transformed somehow, or a filter, where they make a new collection with some elements possibly removed. These are stateless operations.
Word counting involves keeping track of things you've already seen. This is a reduce operation, where you keep the state in some other data structure, counts in your case.
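The three kinds of operation can be put side by side. This is a sketch with made-up sample data; add_word is a hypothetical helper name:

```python
from functools import reduce

words = "one three five one seven".split()

# map: every element is transformed independently of the others
lengths = [len(w) for w in words]

# filter: every element is kept or dropped independently of the others
long_words = [w for w in words if len(w) > 3]

# reduce: each step depends on the state left behind by earlier steps
def add_word(counts, word):
    counts[word] = counts.get(word, 0) + 1
    return counts

totals = reduce(add_word, words, {})

print(lengths)     # [3, 5, 4, 3, 5]
print(long_words)  # ['three', 'five', 'seven']
print(totals)      # {'one': 2, 'three': 1, 'five': 1, 'seven': 1}
```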
As other answers have said, collections.Counter is the "right" way to solve this problem.
With all that said, here's how to use a list comprehension to count words. Please don't try this at home (or at work... especially not at work):
>>> lorem_ipsum = """
... Lorem ipsum dolor sit amet, consectetur adipiscing elit.
... """ * 2
>>> result = {}
>>> words = lorem_ipsum.split()
>>> [result.update({word: result.get(word, 0) + 1}) for word in words]
[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
>>> result
{'consectetur': 2, 'ipsum': 2, 'amet,': 2, 'adipiscing': 2, 'sit': 2, 'elit.': 2, 'dolor': 2, 'Lorem': 2}
This works because a comprehension is basically a for loop behind the scenes, but you're still updating the state variable and just ignoring the actual list that gets created. In this case, you pay for a throwaway list of None values and get less readable code in return, which is not a great trade.
Answer 5:
You can use count for this.
lorem_ipsum = """
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
"""
words = lorem_ipsum.split()
counts = {word: words.count(word) for word in words}
print(counts)
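Note that calling count for every word rescans the list each time, and repeated words are counted more than once. One way to trim the repeated work, sketched here with a made-up sentence, is to iterate over the distinct words only:

```python
sentence = "to be or not to be"
words = sentence.split()          # split once and reuse the list

# set(words) holds each distinct word once, so count() runs once per distinct word
counts = {word: words.count(word) for word in set(words)}

print(counts == {"to": 2, "be": 2, "or": 1, "not": 1})  # True
```

This is still slower than collections.Counter on large inputs, because each count() call scans the whole list.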
Source: https://stackoverflow.com/questions/56011367/convert-forloop-to-list-comprehension