Histogram using Excel FREQUENCY function

伪装坚强ぢ 2021-01-22 00:37

In Excel 2010, I have a list of values in column A and a bin size is specified in B1. This allows me to create histograms with N bins using this formul

2 answers
  •  太阳男子
    2021-01-22 01:05

    (This is fairly different in approach to the macro-driven dynamic range-resizing thing, so I'm using a separate answer...)

    A dynamic histogram chart can be built by remembering that "named ranges" are actually named formulas, so their values may be dynamic, extremely so in some cases.

    Let's start with the assumption that we have an arbitrary set of values in column A, starting at row 1 and also that we have another cell that contains the number of bins we want in our histogram. In my workbook that happens to be E2. So we fire up the Name Manager (on the "Formulas" tab) and create

    num_bins             =Sheet1!$E$2
    

    I've gone for defining a number of bins, rather than a bin size (which we'll define later) because the latter makes it tricky to know exactly how to set our bin boundaries: are we happy with the idea that the first and last bins may cover different-sized parts of the range of values, for example?*

    We can also set up dynamic formulas to describe our data:

    data_count           =COUNT(Sheet1!$A:$A)
    data_vals            =OFFSET(Sheet1!$A$1,0,0,data_count,1)
    max_val              =MAX(data_vals)
    min_val              =MIN(data_vals)
    

    With those defined, we can get fancy. How big should each bin be? Make another named formula:

    bin_size             =(max_val-min_val)/(num_bins)
    

    And here comes the science: these formulas make the dynamic arrays:

    bin_array            =min_val+ROW(OFFSET(Sheet1!$A$1,0,0,num_bins-1,1))*bin_size
    bin_labels           =min_val+ROW(OFFSET(Sheet1!$A$1,0,0,num_bins,1))*bin_size        
    freq_array           =FREQUENCY(data_vals,bin_array)
    

    The first one is the trickiest: it uses the row numbers of a range that is num_bins minus one rows tall to generate multiples of bin_size. It doesn't start the array at min_val because the FREQUENCY() function counts items up to and including each bin value. It has one fewer entry than the number of bins desired because FREQUENCY() returns an array one larger than its bin array, where the final entry counts the points above the highest bin value. So we make a separate bin_labels array for presentation purposes.
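To make the off-by-one reasoning concrete, here is a minimal Python sketch of the same calculation. It is not part of the workbook: `excel_frequency` and `histogram` are hypothetical helper names, and the sketch assumes FREQUENCY() counts values less than or equal to each bin value, with one extra trailing slot for anything above the highest bin.

```python
from bisect import bisect_left


def excel_frequency(data_vals, bin_array):
    """Rough stand-in for Excel's FREQUENCY(): returns len(bin_array) + 1 counts.

    counts[i] holds the values that are <= bin_array[i] and greater than the
    previous bin value; the final slot holds values above the highest bin.
    bin_array is assumed to be sorted in ascending order.
    """
    counts = [0] * (len(bin_array) + 1)
    for v in data_vals:
        # bisect_left returns the index of the first bin value >= v,
        # or len(bin_array) when v is above every bin value.
        counts[bisect_left(bin_array, v)] += 1
    return counts


def histogram(data_vals, num_bins):
    """Mirror the named formulas: bin_size, bin_array, bin_labels, freq_array."""
    min_val, max_val = min(data_vals), max(data_vals)
    bin_size = (max_val - min_val) / num_bins
    # ROW(OFFSET(...)) yields 1, 2, ..., n, so the boundaries start one
    # bin_size above min_val rather than at min_val itself.
    bin_array = [min_val + k * bin_size for k in range(1, num_bins)]
    bin_labels = [min_val + k * bin_size for k in range(1, num_bins + 1)]
    freq_array = excel_frequency(data_vals, bin_array)
    return bin_labels, freq_array
```

With num_bins - 1 boundaries the function returns exactly num_bins counts, which is why bin_labels needs one more entry than bin_array.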

    Now we can make a chart. Insert (say) a 2-D column chart and open the "Select Data" dialog (either from the ribbon or by right-clicking the chart). Add a new series, setting Series values to =Sheet1!freq_array. It's necessary to include either the sheet name or the workbook name to get this to work. Add a series name if you like and click "OK". Now click "Edit" for "Horizontal (Category) Axis Labels" and set the range to =Sheet1!bin_labels.

    Here are 2000 cells with =RAND()*5 and 5 bins (I listed the names and their formulas, with values where they don't produce arrays):

    2000 =RAND()*5 results into 5 bins

    And the same sheet after changing num_bins to 10. (The RAND() formulas recalculated, so pairs of bins may not add up to exactly the values in the 5-bin chart.)

    After changing num_bins to 10

    * (If you must have a user-defined bin size, you'll need to make bin_size the sheet reference and calculate num_bins with a named formula.)
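For the user-defined bin size case, one way to derive the bin count is to round the range up to a whole number of bins. This is only a sketch of the arithmetic, assuming rounding up is acceptable: `num_bins_from_size` is a hypothetical name, and an Excel named formula along the lines of =ROUNDUP((max_val-min_val)/bin_size,0) is my guess at an equivalent, not something given in the answer.

```python
import math


def num_bins_from_size(min_val, max_val, bin_size):
    """Smallest whole number of bins of width bin_size covering min_val..max_val."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    # Round up so the bins always reach max_val; keep at least one bin
    # even when every value is identical.
    return max(1, math.ceil((max_val - min_val) / bin_size))
```

When the range is not an exact multiple of bin_size, the last bin extends past max_val, which is the uneven-coverage trade-off mentioned earlier.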
