How to make a histogram from a list of strings in Python?

说谎 2020-12-03 04:29

I have a list of strings:

a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']

I w

8 answers
  • 2020-12-03 05:01

    Check out matplotlib.pyplot.bar. There is also numpy.histogram which is more flexible if you want wider bins.
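
As a minimal sketch of the matplotlib.pyplot.bar suggestion, assuming the list `a` from the question and a tally made with collections.Counter. The helper name `draw_bar_chart` is made up for illustration and is not part of matplotlib:

```python
from collections import Counter

a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']

# Tally each distinct string, then split into parallel label/height lists.
counts = Counter(a)
labels = sorted(counts)
heights = [counts[label] for label in labels]


def draw_bar_chart(labels, heights):
    """Hypothetical helper: draw one bar per distinct string."""
    # Imported here so the tallying above runs without matplotlib installed.
    import matplotlib.pyplot as plt

    positions = range(len(labels))
    plt.bar(positions, heights, align='center')
    plt.xticks(positions, labels)
    plt.show()
```

Calling `draw_bar_chart(labels, heights)` should open a chart with five bars, one per letter.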

  • 2020-12-03 05:04

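    Here's a concise all-pandas approach:

    import pandas as pd
    a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']
    # Pass the plot type by keyword; recent pandas rejects positional arguments here.
    pd.Series(a).value_counts().plot(kind='bar')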
    

  • 2020-12-03 05:04

    Using numpy 1.9 or greater:

    import numpy as np
    a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']
    labels, counts = np.unique(a,return_counts=True)
    

    This can be plotted using:

    import matplotlib.pyplot as plt
    ticks = range(len(counts))  # one x position per distinct label
    plt.bar(ticks, counts, align='center')
    plt.xticks(ticks, labels)   # show the strings as tick labels
    plt.show()
    

  • 2020-12-03 05:07

    A simple and effective way to make a character histogram in Python:

    import numpy as np
    import matplotlib.pyplot as plt
    from collections import Counter

    filename = input("Enter file name: ")
    num = Counter()
    with open(filename, 'r') as f:
        for line in f:
            # Count every character on the line, newline included.
            num.update(line)

    x = list(num.values())  # counts
    y = list(num.keys())    # characters

    x_coordinates = np.arange(len(num))
    plt.bar(x_coordinates, x)
    plt.xticks(x_coordinates, y)
    plt.show()
    print(x, y)

  • 2020-12-03 05:10

    Very easy with Pandas.

    import pandas
    from collections import Counter
    a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']
    letter_counts = Counter(a)
    df = pandas.DataFrame.from_dict(letter_counts, orient='index')
    df.plot(kind='bar')
    

    Notice that Counter is making a frequency count, so our plot type is 'bar' not 'hist'.

    [Image: histogram of letter counts]

  • 2020-12-03 05:12

    Rather than use groupby() (which requires your input to be sorted), use collections.Counter(); this doesn't have to create intermediary lists just to count inputs:

    from collections import Counter
    
    counts = Counter(a)
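
To illustrate the sorting point, here is a small sketch comparing itertools.groupby() with Counter on deliberately unsorted input. The sample list and variable names are made up for the example:

```python
from collections import Counter
from itertools import groupby

unsorted = ['b', 'a', 'b', 'a', 'a']

# groupby() only merges *adjacent* equal items, so unsorted input fragments.
fragmented = [(key, len(list(group))) for key, group in groupby(unsorted)]

# Sorting first fixes that, at the cost of a sort plus temporary lists.
grouped = [(key, len(list(group))) for key, group in groupby(sorted(unsorted))]

# Counter tallies in a single pass, whatever the input order.
counted = Counter(unsorted)

print(fragmented)  # [('b', 1), ('a', 1), ('b', 1), ('a', 2)]
print(grouped)     # [('a', 3), ('b', 2)]
print(counted)     # Counter({'a': 3, 'b': 2})
```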
    

    You haven't really specified what you consider to be a 'histogram'. Let's assume you wanted to do this on the terminal:

    width = 120  # Adjust to desired width
    longest_key = max(len(key) for key in counts)
    graph_width = width - longest_key - 2
    widest = counts.most_common(1)[0][1]
    scale = graph_width / float(widest)
    
    for key, size in sorted(counts.items()):
        print('{}: {}'.format(key, int(size * scale) * '*'))
    

    Demo:

    >>> from collections import Counter
    >>> a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']
    >>> counts = Counter(a)
    >>> width = 120  # Adjust to desired width
    >>> longest_key = max(len(key) for key in counts)
    >>> graph_width = width - longest_key - 2
    >>> widest = counts.most_common(1)[0][1]
    >>> scale = graph_width / float(widest)
    >>> for key, size in sorted(counts.items()):
    ...     print('{}: {}'.format(key, int(size * scale) * '*'))
    ... 
    a: *********************************************************************************************
    b: **********************************************
    c: **********************************************************************
    d: ***********************
    e: *********************************************************************************************************************
    

    More sophisticated tools are found in the numpy.histogram() and matplotlib.pyplot.hist() functions. These do the tallying for you, with matplotlib.pyplot.hist() also providing you with graph output.
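
One caveat worth a sketch: numpy.histogram() is designed for numeric data, so a list of strings has to be mapped to integer codes first. The following assumes that mapping. The names `names`, `code_of`, `codes` and `tally_with_numpy` are illustrative only:

```python
a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']

# Map each distinct string to an integer code: 'a' -> 0, 'b' -> 1, ...
names = sorted(set(a))
code_of = {name: i for i, name in enumerate(names)}
codes = [code_of[item] for item in a]


def tally_with_numpy(codes, n_labels):
    """Hypothetical helper: let numpy.histogram do the counting."""
    # Imported here so the encoding step above works without numpy installed.
    import numpy as np

    # One unit-wide bin per code: edges 0, 1, ..., n_labels.
    counts, _edges = np.histogram(codes, bins=np.arange(n_labels + 1))
    return counts
```

Calling `tally_with_numpy(codes, len(names))` should return one count per entry in `names`. Passing fewer, wider bin edges would merge adjacent labels into a single bar.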
