Summing a Column By Group In a Dataset With Macros

说谎 2021-01-27 11:34

I have a dataset that looks like:

 Month   Cost_Center      Account    Actual    Annual_Budget
 June     53410           Postage       13      234
 June     5342         


        
3 Answers
  •  盖世英雄少女心
    2021-01-27 12:12

    Proc SQL can be very effective for examining aggregate data. Without seeing what the macros do, I would say perform the run rate checks after outputting data set test.

    You don't show rows for other months, but I must presume the annual_budget values are constant across all months -- if so, I don't see a reason to ever sum annual_budget; comparing anything to sum(annual_budget) is probably at the incorrect time scale and not useful.
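
    The presumption that annual_budget is constant can be checked directly. A minimal sketch, assuming the table is named test as above and that the budget should be fixed per cost center | account pair; any rows returned are pairs where the budget changes between months:

    ```sas
    proc sql;
      /* list cost center | account pairs whose annual_budget varies across months */
      select Cost_Center, Account,
             count(distinct Annual_Budget) as n_budget_values
      from test
      group by Cost_Center, Account
      having count(distinct Annual_Budget) > 1;
    quit;
    ```

    An empty result supports treating annual_budget as a per-pair constant rather than something to sum over months.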

    From the data shown it's hard to tell whether you want to know either of these:

    • which months (if any) had a run_rate that exceeded the annual_budget
    • which months (if any) had a run_rate that exceeded the balance of the annual_budget (i.e. the annual_budget less the prior months' expenditure)
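
    The second question needs the months in order, which the text Month column does not give. A rough sketch only, assuming a hypothetical numeric column month_num (1-12) has been added to test, one row per cost center | account | month, and run rate taken as actual * 12; the names runrate_vs_balance, remaining_budget and exceeds_balance_flag are made up for illustration:

    ```sas
    proc sql;
      /* month_num is a hypothetical numeric month (1-12), not present in the shown data */
      create table runrate_vs_balance as
      select
        cur.Cost_Center,
        cur.Account,
        cur.month_num,
        cur.Actual,
        /* annual budget less everything spent in earlier months */
        cur.Annual_Budget - coalesce(sum(prev.Actual), 0) as remaining_budget,
        case
          when cur.Actual * 12 > cur.Annual_Budget - coalesce(sum(prev.Actual), 0) then 1
          else 0
        end as exceeds_balance_flag
      from test as cur
      left join test as prev
        on  prev.Cost_Center = cur.Cost_Center
        and prev.Account     = cur.Account
        and prev.month_num   < cur.month_num
      group by cur.Cost_Center, cur.Account, cur.month_num, cur.Actual, cur.Annual_Budget;
    quit;
    ```

    The self join pairs each month with all earlier months of the same cost center and account, so the first month has no prior rows and coalesce treats its prior spend as 0.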

    Presume each row in test is for a single year/month/costCenter/account -- if not the underlying data would have to be aggregated to that level.
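
    That presumption is easy to test before running anything else. A small sketch, assuming the columns shown in the question (the sample has no year column, so only month, cost center and account are used); any rows returned are combinations that appear more than once and would need aggregating first:

    ```sas
    proc sql;
      /* find month | cost center | account combinations with more than one row */
      select Month, Cost_Center, Account, count(*) as n_rows
      from test
      group by Month, Cost_Center, Account
      having count(*) > 1;
    quit;
    ```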

    Proc SQL;
      * retrieve presumed constant annual_budget values from data;
      * this information might (should) already exist in another table;
    
      * presume constant annual budget value at each cost center | account combination;
      * distinct because there are multiple months with the same info;
    
      create table annual_budgets as
      select distinct Cost_Center, Account, Annual_Budget
      from test;
    
      * total budget per account, summed across its cost centers;
      create table account_budgets as
      select account, sum(annual_budget) as annual_budget 
      from annual_budgets
      group by account;
    
      * flag for some run rate condition;
    
      create table annual_budget_mon_runrate_check as
      select 
        2019 as year,
        t.account,
        sum(t.actual) as yr_actual,  /* across all month/cost center */
        min(b.annual_budget) as account_budget,  /* one budget row per account, so min just picks it */
    
        max (
          case when t.actual * 12 > t.annual_budget then 1 else 0 end
        ) as
          excessive_runrate_flag label="At least one month had a cost center run rate that would exceed its annual_budget"
      from 
        test as t
        left join account_budgets as b
          on b.account = t.account
      group by
        year, t.account;
    quit;
    

    You can add a where clause to restrict the accounts processed.
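
    As a minimal illustration of such a restriction, assuming Account is a character column and using the 'Postage' value from the sample data:

    ```sas
    proc sql;
      /* total actual spend for the selected accounts only */
      select Account, sum(Actual) as yr_actual
      from test
      where Account in ('Postage')
      group by Account;
    quit;
    ```

    The same where clause can go between the from and group by clauses of the run rate query.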

    Changing the max to sum in the flag computation would return the number of cost center months with excessive run rates.
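
    A sketch of that variant on its own, assuming the same test table; the names runrate_month_counts and excessive_runrate_months are made up for illustration:

    ```sas
    proc sql;
      /* count, per account, the cost center months whose annualized spend exceeds the budget */
      create table runrate_month_counts as
      select
        Account,
        sum(case when Actual * 12 > Annual_Budget then 1 else 0 end) as excessive_runrate_months
      from test
      group by Account;
    quit;
    ```

    Because the case expression yields 1 or 0 per row, sum counts the offending rows where max only reports whether any exist.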
