Query to display largest n% of values and group the rest as “other”

前端 未结 1 1713
慢半拍i
慢半拍i 2021-01-17 06:26

I see someone posted something very similar to what I am looking to do in Access 2010.

Grouping of top 80% categories

I saw the response but am confused as

相关标签:
1条回答
  • 2021-01-17 07:03

    When performing calculations like this we need to be mindful of what happens when there is a tie, so let's use the following [VendorData]

    vendors  lbs 
    -------  ----
    A1       2000
    A2       3000
    A3        200
    A4        800
    A5       2000
    A6       2000
    

    We can start by creating the following saved query named [VendorPct] in Access

    SELECT 
        vendors, 
        lbs, 
        lbs_sum,
        lbs / lbs_sum * 100 AS lbs_pct
    FROM
        (
            SELECT vendors, lbs, lbs_sum
            FROM
                VendorData,
                (
                    SELECT Sum(lbs) AS lbs_sum FROM VendorData
                )
        )
    

    It gives us

    vendors  lbs   lbs_sum  lbs_pct
    -------  ----  -------  -------
    A1       2000    10000       20
    A2       3000    10000       30
    A3        200    10000        2
    A4        800    10000        8
    A5       2000    10000       20
    A6       2000    10000       20
    

    Now we can create a saved query in Access named [VendorPctCumulative]

    SELECT 
        vendors, 
        Max(lbs) AS lbs_, 
        Max(lbs_pct) as lbs_pct_, 
        Sum(lbs_pct_other) AS lbs_pct_cumulative_
    FROM
        (
                SELECT 
                    vendors, 
                    lbs, 
                    lbs_pct, 
                    lbs_pct AS lbs_pct_other 
                FROM VendorPct
            UNION ALL
                (
                    SELECT 
                        v1.vendors, 
                        v1.lbs,
                        v1.lbs_pct,
                        v2.lbs_pct AS lbs_pct_other 
                    FROM
                        VendorPct v1
                        INNER JOIN
                        VendorPct v2
                            ON (v2.lbs = v1.lbs AND v2.vendors < v1.vendors)
                                OR v2.lbs > v1.lbs
                )
        )
    GROUP BY vendors
    ORDER BY Sum(lbs_pct_other), vendors
    

    That produces

    vendors  lbs_  lbs_pct_  lbs_pct_cumulative_
    -------  ----  --------  -------------------
    A2       3000        30                   30
    A1       2000        20                   50
    A5       2000        20                   70
    A6       2000        20                   90
    A4        800         8                   98
    A3        200         2                  100
    

    Now one more saved query named [VendorPctCumulativeThreshold] to find the first cumulative percent that meets or exceeds the 80% threshold:

    SELECT TOP 1 lbs_pct_cumulative_ 
    FROM 
        (
            SELECT lbs_pct_cumulative_ 
            FROM VendorPctCumulative
            WHERE lbs_pct_cumulative_ >= 80
            ORDER BY lbs_pct_cumulative_
        )
    

    i.e.,

    lbs_pct_cumulative_
    -------------------
                     90
    

    and we can put it all together with

        SELECT 
            vendors, 
            lbs_ AS lbs, 
            lbs_pct_ AS lbs_pct
        FROM
            VendorPctCumulative vpc
            INNER JOIN
            VendorPctCumulativeThreshold vpct
                ON vpc.lbs_pct_cumulative_ <= vpct.lbs_pct_cumulative_
    UNION ALL
        SELECT 
            "other" AS vendors, 
            Sum(lbs_) AS lbs, 
            Sum(lbs_pct_) AS lbs_pct
        FROM
            VendorPctCumulative vpc
            INNER JOIN
            VendorPctCumulativeThreshold vpct
                ON vpc.lbs_pct_cumulative_ > vpct.lbs_pct_cumulative_
    ORDER BY 3 DESC, 1
    

    producing

    vendors  lbs   lbs_pct
    -------  ----  -------
    A2       3000       30
    A1       2000       20
    A5       2000       20
    A6       2000       20
    other    1000       10
    
    0 讨论(0)
提交回复
热议问题