In matplotlib.pyplot, how to plot two datasets using interleaved bar graphs?

一向 2021-01-02 02:07

I am going through Think Stats and I would like to compare multiple data sets visually. I can see from the book examples that it is possible to generate an interleaved bar g

4 answers
  • 2021-01-02 02:24

    Call the bar function multiple times, once for each series. You can control the horizontal position of the bars with the first parameter (x in current matplotlib, left in older versions), and you can use this to prevent overlap.

    Entirely untested code:

    import numpy
    from matplotlib import pyplot
    pyplot.bar( numpy.arange(10) * 2, data1, color = 'red' )
    pyplot.bar( numpy.arange(10) * 2 + 1, data2, color = 'blue' )
    

    The data2 bars will be drawn shifted one unit to the right of the corresponding data1 bars.
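A slightly fuller sketch of the same idea, wrapped in a function so it can be reused. The function name `interleaved_bars`, the sample values, and the tick placement are assumptions made for illustration; only the even/odd slot arithmetic comes from the answer above.

```python
import numpy as np
import matplotlib.pyplot as plt


def interleaved_bars(ax, data1, data2, width=0.8):
    """Draw two equal-length series as alternating bars on `ax`."""
    n = len(data1)
    # Even slots (0, 2, 4, ...) hold the first series, odd slots the second.
    x1 = np.arange(n) * 2
    x2 = x1 + 1
    bars1 = ax.bar(x1, data1, width=width, color="red", label="data1")
    bars2 = ax.bar(x2, data2, width=width, color="blue", label="data2")
    # Put one tick halfway between each pair of bars.
    ax.set_xticks(x1 + 0.5)
    ax.set_xticklabels([str(i) for i in range(n)])
    ax.legend()
    return bars1, bars2


if __name__ == "__main__":
    # Made-up sample values, purely for illustration.
    data1 = [3, 5, 2, 7, 4]
    data2 = [4, 1, 6, 2, 5]
    fig, ax = plt.subplots()
    interleaved_bars(ax, data1, data2)
    plt.show()
```

With the default width of 0.8 and a spacing of 1 between slots, neighbouring bars do not overlap; a smaller width leaves a larger gap between them.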

  • 2021-01-02 02:30

    There's a brilliant example/demo provided in the documentation:

    http://matplotlib.sourceforge.net/examples/api/barchart_demo.html

  • 2021-01-02 02:34

    Matplotlib's example code for interleaved bar charts works nicely for arbitrary real-valued x coordinates (as mentioned by @db42).

    However, if your x coordinates are categorical values (like in the case of dictionaries in the linked question), the conversion from categorical x coordinates to real x coordinates is cumbersome and unnecessary.

    You can plot two dictionaries side by side directly using matplotlib's API. The trick for offsetting two bar charts against each other is to set align='edge' and pass a positive width (+width) for one chart and a negative width (-width) for the other, so that the bars extend to the right and to the left of each x position respectively.

    The example code modified for plotting two dictionaries looks like the following then:

    """
    ========
    Barchart
    ========
    
    A bar plot with errorbars and height labels on individual bars
    """
    import matplotlib.pyplot as plt
    
    # Uncomment the following line if you use ipython notebook
    # %matplotlib inline
    
    width = 0.35       # the width of the bars
    
    men_means = {'G1': 20, 'G2': 35, 'G3': 30, 'G4': 35, 'G5': 27}
    men_std = {'G1': 2, 'G2': 3, 'G3': 4, 'G4': 1, 'G5': 2}
    
    # Wrap the dict views in list() so that bar() and yerr receive plain sequences
    rects1 = plt.bar(list(men_means.keys()), list(men_means.values()), -width, align='edge',
                    yerr=list(men_std.values()), color='r', label='Men')
    
    women_means = {'G1': 25, 'G2': 32, 'G3': 34, 'G4': 20, 'G5': 25}
    women_std = {'G1': 3, 'G2': 5, 'G3': 2, 'G4': 3, 'G5': 3}
    
    # Positive width: these bars extend to the right of each group position
    rects2 = plt.bar(list(women_means.keys()), list(women_means.values()), +width, align='edge',
                    yerr=list(women_std.values()), color='y', label='Women')
    
    # add some text for labels, title and axes ticks
    plt.xlabel('Groups')
    plt.ylabel('Scores')
    plt.title('Scores by group and gender')
    plt.legend()
    
    def autolabel(rects):
        """
        Attach a text label above each bar displaying its height
        """
        for rect in rects:
            height = rect.get_height()
            plt.text(rect.get_x() + rect.get_width()/2., 1.05*height,
                    '%d' % int(height),
                    ha='center', va='bottom')
    
    autolabel(rects1)
    autolabel(rects2)
    
    plt.show()
    

    The result is a chart in which the Men and Women bars sit side by side at each group label.

  • 2021-01-02 02:37

    I came across this problem a while ago and created a wrapper function that takes a 2D array and automatically creates a multi-barchart from it:

    Multi-category bar chart

    The code:

    import matplotlib.pyplot as plt
    import matplotlib.cm as cm
    import operator as o
    
    import numpy as np
    
    dpoints = np.array([['rosetta', '1mfq', 9.97],
               ['rosetta', '1gid', 27.31],
               ['rosetta', '1y26', 5.77],
               ['rnacomposer', '1mfq', 5.55],
               ['rnacomposer', '1gid', 37.74],
               ['rnacomposer', '1y26', 5.77],
               ['random', '1mfq', 10.32],
               ['random', '1gid', 31.46],
               ['random', '1y26', 18.16]])
    
    fig = plt.figure()
    ax = fig.add_subplot(111)
    
    def barplot(ax, dpoints):
        '''
        Create a barchart for data across different categories with
        multiple conditions for each category.
    
        @param ax: The plotting axes from matplotlib.
        @param dpoints: The data set as an (n, 3) numpy array
        '''
    
        # Aggregate the conditions and the categories according to their
        # mean values
        conditions = [(c, np.mean(dpoints[dpoints[:,0] == c][:,2].astype(float))) 
                      for c in np.unique(dpoints[:,0])]
        categories = [(c, np.mean(dpoints[dpoints[:,1] == c][:,2].astype(float))) 
                      for c in np.unique(dpoints[:,1])]
    
        # sort the conditions, categories and data so that the bars in
        # the plot will be ordered by category and condition
        conditions = [c[0] for c in sorted(conditions, key=o.itemgetter(1))]
        categories = [c[0] for c in sorted(categories, key=o.itemgetter(1))]
    
        dpoints = np.array(sorted(dpoints, key=lambda x: categories.index(x[1])))
    
        # the space between each set of bars
        space = 0.3
        n = len(conditions)
        width = (1 - space) / (len(conditions))
    
        # Create a set of bars at each position
        indices = list(range(1, len(categories) + 1))
        for i, cond in enumerate(conditions):
            # np.float was removed from NumPy; the builtin float is equivalent
            vals = dpoints[dpoints[:,0] == cond][:,2].astype(float)
            pos = [j - (1 - space) / 2. + i * width for j in indices]
            # pos holds left edges, so request edge alignment explicitly
            ax.bar(pos, vals, width=width, label=cond, align='edge',
                   color=cm.Accent(float(i) / n))
    
        # Set the x-axis tick labels to be equal to the categories
        ax.set_xticks(indices)
        ax.set_xticklabels(categories)
        plt.setp(plt.xticks()[1], rotation=90)
    
        # Add the axis labels
        ax.set_ylabel("RMSD")
        ax.set_xlabel("Structure")
    
        # Add a legend
        handles, labels = ax.get_legend_handles_labels()
        ax.legend(handles[::-1], labels[::-1], loc='upper left')
    
    barplot(ax, dpoints)
    plt.show()
    

    If you're interested in what this function does and the logic behind it, here's a (shamelessly self-promoting) link to the blog post describing it.
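The position arithmetic inside `barplot` can be looked at in isolation. The following is a minimal sketch, assuming the same `space` convention as the function above; the helper name `grouped_bar_edges` is hypothetical and not part of the original answer.

```python
def grouped_bar_edges(n_categories, n_conditions, space=0.3):
    """Return (width, edges) for a grouped bar chart.

    edges[i][j] is the left edge of the bar for condition i in category j.
    Categories are centred on the integers 1..n_categories, and each group
    of bars occupies a total width of (1 - space).
    """
    width = (1 - space) / n_conditions
    edges = [
        [j - (1 - space) / 2 + i * width for j in range(1, n_categories + 1)]
        for i in range(n_conditions)
    ]
    return width, edges


if __name__ == "__main__":
    width, edges = grouped_bar_edges(n_categories=3, n_conditions=3)
    for i, row in enumerate(edges):
        print("condition", i, "left edges:", row)
```

Because the returned values are left edges, they are meant to be passed to `ax.bar` together with `align='edge'`; with the default centre alignment every group would sit half a bar width to the left of its tick.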
