Histogram using Excel FREQUENCY function

伪装坚强ぢ 2021-01-22 00:37

In Excel 2010, I have a list of values in column A and a bin size is specified in B1. This allows me to create histograms with N bins using this formul

2 answers
  • 2021-01-22 00:59

    The only answer I can think of is to use a macro to resize the output range of your formula.

    Here's a simple snippet that illustrates the idea.

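    ' fmla (the formula text) and rng (the first cell of the output range)
    ' are assumed to have been set up by the calling code.
    Dim result As Variant
    Dim targetCols As Long
    
    ' Evaluate the formula in memory to see how large its array result is.
    result = Evaluate(fmla)
    With rng
      ' Count the elements along the first dimension of the result.
      targetCols = UBound(result, 1) - LBound(result, 1) + 1
      ' Re-enter the formula as an array formula across exactly that many columns.
      .Resize(1, targetCols).FormulaArray = fmla
    End With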
    

    I wrote about a more complete implementation last year: more error-tolerant, handles 2-dimensional outputs, etc.

    EDIT: However... The formula you're using won't work with this approach: it relies on its output range size being known at input. Here's an alternative suggestion that can be automatically resized:

    We can create a set of workable bins with something like this:

    ={(ROW(OFFSET(A1,0,0,CEILING((MAX(A:A)-MIN(A:A))/B1,1)+1,1))-1)*B1}
    

    where, as before, the number of bins is

    CEILING((MAX(A:A)-MIN(A:A))/B1,1)+1
    

    which we then use to create a range using OFFSET() (the target doesn't matter since we're not using its values). Then we take the ROW() of each cell in the range (subtracting 1 to get a set of values starting with zero) and multiply by our bin size. You may want to shift the range of values (by adding MIN(A:A) for example).

    The big difference is that this formula doesn't need to be input across a range for the Evaluate() VBA function to be able to produce a range output.

    To get the histogram, either plug the output from the bin formula into FREQUENCY() or drop in the whole formula. The auto-resizing should work either way.

    If you were particularly opposed to running the macro by hand (I have it available via a custom Ribbon button and a hotkey combination), you could use a Worksheet_Change event to watch for opportunities to apply it. I can't say for sure whether that would have any unpleasant side effects.
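
    A minimal sketch of that Worksheet_Change idea, assuming the data is in column A and the bin size in B1 as in the question. ResizeArrayFormula is a hypothetical name standing in for whatever resizing macro you use. Note that Worksheet_Change fires on edits, not on recalculation, so volatile formulas such as RAND() will not trigger it.

    ```vba
    ' Goes in the code module of the worksheet that holds the data.
    Private Sub Worksheet_Change(ByVal Target As Range)
        ' Only react when column A (the data) or B1 (the bin size) was edited.
        If Intersect(Target, Union(Me.Columns(1), Me.Range("B1"))) Is Nothing Then Exit Sub

        On Error GoTo CleanUp
        ' Switch events off so the macro's own writes do not re-trigger this handler.
        Application.EnableEvents = False
        ResizeArrayFormula
    CleanUp:
        Application.EnableEvents = True
    End Sub

    ' Hypothetical placeholder: put the resizing code from the snippet above here.
    Private Sub ResizeArrayFormula()
    End Sub
    ```

    Restoring EnableEvents in the CleanUp block matters: if the macro fails while events are off, no further event handlers will run until they are switched back on.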

  • 2021-01-22 01:05

    (This is fairly different in approach to the macro-driven dynamic range-resizing thing, so I'm using a separate answer...)

    A dynamic histogram chart can be built by remembering that "named ranges" are actually named formulas, so their values may be dynamic, extremely so in some cases.

    Let's start with the assumption that we have an arbitrary set of values in column A, starting at row 1 and also that we have another cell that contains the number of bins we want in our histogram. In my workbook that happens to be E2. So we fire up the Name Manager (on the "Formulas" tab) and create

    num_bins             =Sheet1!$E$2
    

    I've gone for defining a number of bins, rather than a bin size (which we'll define later) because the latter makes it tricky to know exactly how to set our bin boundaries: are we happy with the idea that the first and last bins may cover different-sized parts of the range of values, for example?*

    We can also set up dynamic formulas to describe our data:

    data_count           =COUNT(Sheet1!$A:$A)
    data_vals            =OFFSET(Sheet1!$A$1,0,0,data_count,1)
    max_val              =MAX(data_vals)
    min_val              =MIN(data_vals)
    

    With those defined, we can get fancy. How big should each bin be? Make another named formula:

    bin_size             =(max_val-min_val)/(num_bins)
    

    And here comes the science: these formulas make the dynamic arrays:

    bin_array            =min_val+ROW(OFFSET(Sheet1!$A$1,0,0,num_bins-1,1))*bin_size
    bin_labels           =min_val+ROW(OFFSET(Sheet1!$A$1,0,0,num_bins,1))*bin_size        
    freq_array           =FREQUENCY(data_vals,bin_array)
    

    The first one is the trickiest: it uses the row numbers of a range num_bins minus one rows tall to generate multiples of bin_size. It doesn't start the array at min_val because the FREQUENCY() function counts items up to and including each bin value. It has one entry fewer than the number of bins desired because FREQUENCY() returns an array one larger than its bin array, where the final entry counts the points above the highest bin value. That leaves bin_array without an entry for the top bin, so we make a separate bin_labels array for presentation purposes.
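
    To see why num_bins - 1 edges give num_bins counts, here is a small plain-VBA sketch that mimics the counting rule described above. BinEdges and CountIntoBins are hypothetical helper names for illustration only; they are not part of the named-formula setup, and numBins is assumed to be at least 2.

    ```vba
    Option Explicit

    ' Upper bin edges as bin_array builds them: minVal + k * binSize, k = 1 .. numBins - 1.
    Function BinEdges(ByVal minVal As Double, ByVal maxVal As Double, _
                      ByVal numBins As Long) As Double()
        Dim edges() As Double, k As Long, binSize As Double
        binSize = (maxVal - minVal) / numBins
        ReDim edges(1 To numBins - 1)
        For k = 1 To numBins - 1
            edges(k) = minVal + k * binSize
        Next k
        BinEdges = edges
    End Function

    ' Counts values the way FREQUENCY() does: slot k holds values that are <= edges(k)
    ' and above the previous edge; one extra slot holds values above the top edge.
    Function CountIntoBins(vals() As Double, edges() As Double) As Long()
        Dim counts() As Long, i As Long, k As Long
        ReDim counts(LBound(edges) To UBound(edges) + 1)
        For i = LBound(vals) To UBound(vals)
            k = LBound(edges)
            Do While k <= UBound(edges)
                If vals(i) <= edges(k) Then Exit Do
                k = k + 1
            Loop
            counts(k) = counts(k) + 1   ' k = UBound(edges) + 1 means "above all edges"
        Next i
        CountIntoBins = counts
    End Function
    ```

    For the values 0, 1, 2, 3, 4, 5 with 5 bins, the edges are 1, 2, 3, 4 and the counts are 2, 1, 1, 1, 1: four edges, five counts, with the maximum value landing in the extra final slot.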

    Now we can make a chart. Insert (say) a 2-D column chart and open the "Select Data" dialog (either from the ribbon or by right-clicking the chart). Add a new series, setting Series values to =Sheet1!freq_array. It's necessary to include either the sheet name or the workbook name to get this to work. Add a series name if you like and click "OK". Now click "Edit" for "Horizontal (Category) Axis Labels" and set the range to =Sheet1!bin_labels.
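
    If you would rather script those dialog steps, something like the following should be equivalent. It is only a sketch: it assumes the names freq_array and bin_labels already exist and that the data sheet is called Sheet1. AddHistogramChart and the series name "Frequency" are made up for the example.

    ```vba
    Sub AddHistogramChart()
        Dim ws As Worksheet, cht As Chart
        Set ws = ThisWorkbook.Worksheets("Sheet1")

        ' Insert an embedded clustered column chart on the data sheet.
        Set cht = ws.Shapes.AddChart(xlColumnClustered).Chart

        ' The new chart may pick up series from the current selection; start clean.
        Do While cht.SeriesCollection.Count > 0
            cht.SeriesCollection(1).Delete
        Loop

        With cht.SeriesCollection.NewSeries
            .Name = "Frequency"
            ' As in the dialog, the names need a sheet (or workbook) qualifier.
            .Values = "=Sheet1!freq_array"
            .XValues = "=Sheet1!bin_labels"
        End With
    End Sub
    ```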

    Here's 2000 cells with =RAND()*5 and 5 bins (I listed the names and their formulas, with values where they don't produce arrays)

    [Figure: 2000 =RAND()*5 results into 5 bins]

    And the same sheet after changing num_bins to 10. (The RAND() formulas recalculated, so the bins may not add up to exactly the same values)

    [Figure: after changing num_bins to 10]

    * (If you must have a user-defined bin size, you'll need to make bin_size the sheet reference and calculate num_bins with a named formula.)
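
    One possible way to set up that variant, sketched as a macro that redefines the two names. It assumes the bin size is typed into Sheet1!$B$1 (as in the question) and that max_val and min_val are already defined as above. UseFixedBinSize is a hypothetical name.

    ```vba
    Sub UseFixedBinSize()
        With ThisWorkbook.Names
            ' Adding a name that already exists replaces its definition.
            .Add Name:="bin_size", RefersTo:="=Sheet1!$B$1"
            ' Enough whole bins to cover max_val - min_val; the top bin may be only partly used.
            .Add Name:="num_bins", RefersTo:="=CEILING((max_val-min_val)/bin_size,1)"
        End With
    End Sub
    ```

    With this setup the last bin can cover a smaller part of the data range than the others, which is the trade-off mentioned earlier.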