Question
So I want to create a histogram. Here is my code:
def histogram(s):
    d = dict()
    for c in s:
        if c not in d:
            d[c] = 1
        else:
            d[c] += 1
    return d
def print_hist(h):
    for c in h:
        print c, h[c]
It gives me this:
>>> h = histogram('parrot')
>>> print_hist(h)
a 1
p 1
r 2
t 1
o 1
But I want this:
a: 1
o: 1
p: 1
r: 2
t: 1
So how can I get my histogram in alphabetical order, make it case insensitive (so "a" and "A" are counted as the same letter), and list the whole alphabet (so letters that are not in the string just get a zero)?
Answer 1:
Use an ordered dictionary, which stores keys in the order they were inserted.
from collections import OrderedDict
import string
def count(s):
    histogram = OrderedDict((c,0) for c in string.lowercase)
    for c in s:
        if c in string.letters:
            histogram[c.lower()] += 1
    return histogram
for letter, c in count('parrot').iteritems():
    print '{}:{}'.format(letter, c)
Result:
a:1
b:0
c:0
d:0
e:0
f:0
g:0
h:0
i:0
j:0
k:0
l:0
m:0
n:0
o:1
p:1
q:0
r:2
s:0
t:1
u:0
v:0
w:0
x:0
y:0
z:0
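The code above targets Python 2. In Python 3, string.lowercase and string.letters no longer exist (string.ascii_lowercase and string.ascii_letters are the closest replacements) and iteritems() became items(). A minimal sketch of the same approach under that assumption; the function name count_letters is made up for this example:

```python
from collections import OrderedDict
import string

def count_letters(s):
    # Start every ASCII letter at zero, in alphabetical order.
    histogram = OrderedDict((c, 0) for c in string.ascii_lowercase)
    for c in s:
        # Skip anything that is not an ASCII letter (digits, spaces, ...).
        if c in string.ascii_letters:
            histogram[c.lower()] += 1
    return histogram

for letter, n in count_letters('Parrot').items():
    print('{}: {}'.format(letter, n))
```

Because the dictionary is pre-filled in alphabetical order, no sorting is needed when printing.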
Answer 2:
Just use collections.Counter for this, unless you really want your own:
>>> import collections
>>> c = collections.Counter('parrot')
>>> sorted(c.items(), key=lambda c: c[0])
[('a', 1), ('o', 1), ('p', 1), ('r', 2), ('t', 1)]
EDIT: As commenters pointed out, your last sentence indicates you want data on all the letters of the alphabet that do not occur in your word. Counter is good for this also since, as the docs indicate:
Counter objects have a dictionary interface except that they return a zero count for missing items instead of raising a KeyError.
So you can just iterate through something like string.ascii_lowercase:
>>> import string
>>> for letter in string.ascii_lowercase:
... print('{}: {}'.format(letter, c[letter]))
...
a: 1
b: 0
c: 0
d: 0
e: 0
f: 0
g: 0
h: 0
i: 0
j: 0
k: 0
l: 0
m: 0
n: 0
o: 1
p: 1
q: 0
r: 2
s: 0
t: 1
u: 0
v: 0
w: 0
x: 0
y: 0
z: 0
Finally, rather than implementing something complicated to merge the results of upper- and lowercase letters, just normalize your input first:
c = collections.Counter('PaRrOt'.lower())
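Putting the pieces of this answer together might look like the sketch below. The helper name letter_histogram is hypothetical, and dropping non-letters before counting is an assumption about what the asker wants rather than something stated in the question.

```python
import collections
import string

def letter_histogram(s):
    # Normalize case first, then count only ASCII letters.
    counts = collections.Counter(
        ch for ch in s.lower() if ch in string.ascii_lowercase
    )
    # Counter returns 0 for missing keys, so every letter gets a row.
    return [(letter, counts[letter]) for letter in string.ascii_lowercase]

for letter, n in letter_histogram('PaRrOt'):
    print('{}: {}'.format(letter, n))
```

Returning a list of pairs keeps the alphabetical order explicit without relying on dictionary ordering.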
Answer 3:
A trivial answer would be (with s being the input string):
import string
for letter in string.ascii_lowercase:
    print letter, ': ', s.lower().count(letter)
(inefficient, since it scans the string 26 times)
You can also use a Counter:
from collections import Counter
import string
cnt = Counter(s.lower())
for letter in string.ascii_lowercase:
    print letter, ': ', cnt[letter]
Much neater.
Answer 4:
If you want the result ordered, you can use an OrderedDict.
You will also need to add the letters to the dictionary in sorted order. The question is not entirely clear on this, but I think you want a case-insensitive result, so we need to convert all letters to one case.
from collections import OrderedDict as od
import string
def histogram(s):
First we need to create the dictionary holding all of the lowercase letters. The string module we imported provides them, but string.lowercase is locale-dependent and may contain more than the 26 ASCII letters, so we use only the first 26 characters of string.lowercase.
    d = od()
    for each_letter in string.lowercase[0:26]:
        d[each_letter] = 0
Once the dictionary is created, we just need to iterate through the word after it has been lowercased. Please note that a plain d[c] += 1 will raise a KeyError for any word that contains a number or a space. You may or may not want to test for numbers and spaces or add them to your dictionary. One way to keep it from failing is to try to increment the value and, if the character is not in the dictionary, just ignore it.
    for c in s.lower():
        try:
            d[c] += 1
        except KeyError:
            pass
    return d
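Incrementing a key that is missing from a dictionary raises KeyError, and that is the exception the try/except approach has to catch in order to skip non-letters. A small sketch in Python 3, using string.ascii_lowercase (always exactly the 26 ASCII letters); the skipped list exists only to make the behaviour visible:

```python
from collections import OrderedDict
import string

d = OrderedDict((letter, 0) for letter in string.ascii_lowercase)

skipped = []
for c in 'Parrot 42'.lower():
    try:
        d[c] += 1
    except KeyError:
        # Spaces and digits are not keys, so they end up here.
        skipped.append(c)

print(d['r'], skipped)
```

An explicit membership test (if c in d) would work just as well and avoids using exceptions for control flow.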
Answer 5:
Try this function to print your output in sorted order:
def print_hist(h):
    for c in sorted(h):
        print c, h[c]
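Iterating over sorted(h) visits the dictionary's keys in alphabetical order. A sketch of the same idea in Python 3 syntax, printing the "a: 1" format from the question; the histogram function from the question is reproduced here only so the example runs on its own:

```python
def histogram(s):
    # Same counting logic as in the question.
    d = dict()
    for c in s:
        if c not in d:
            d[c] = 1
        else:
            d[c] += 1
    return d

def print_hist(h):
    # sorted(h) yields the keys in alphabetical order.
    for c in sorted(h):
        print('{}: {}'.format(c, h[c]))

print_hist(histogram('parrot'))
```

This lists only the letters that actually occur, and uppercase letters sort before lowercase ones unless the input is lowercased first.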
Answer 6:
If you want to list the whole (Latin-only) alphabet anyway, you could use a list of length 26:
hist = [0] * 26
for c in s.lower():
    hist[ord(c) - ord('a')] += 1
To get the desired output:
for x in range(26):
    print chr(x + ord('a')), ":", hist[x]
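A self-contained sketch of this list-based approach in Python 3. The function name list_histogram and the guard that skips characters outside 'a' to 'z' are additions for illustration; without the guard, a space or digit would raise an IndexError or update the wrong slot.

```python
def list_histogram(s):
    hist = [0] * 26
    for c in s.lower():
        # Only count the 26 ASCII letters.
        if 'a' <= c <= 'z':
            hist[ord(c) - ord('a')] += 1
    return hist

hist = list_histogram('Parrot 42')
for x in range(26):
    # Map the index back to its letter.
    print('{}: {}'.format(chr(x + ord('a')), hist[x]))
```

The index arithmetic works because the ASCII lowercase letters have consecutive code points.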
Source: https://stackoverflow.com/questions/22819315/creating-a-letter-a-histogram