Summing a Column By Group In a Dataset With Macros

前端未结

关注

 3  427

I have a dataset that looks like:

 Month   Cost_Center      Account    Actual    Annual_Budget
 June     53410           Postage       13      234
 June     5342


                      
              相关标签:


      
      
        
          3条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  广开言路        
                
              
                            
                2021-01-27 12:10
              
            
            
                                                                       
SAS DATA step behavior is quite complex ("About DATA Step Execution" in SAS Language Reference: Concepts). The default behavior, that you're seeing, is: at the end of each iteration (i.e. for each input row) the row is written to the output data set, and the PDV - all data step variables - is reset.

You can't expect to write Base SAS "intuitively" without spending a few days learning it first, so I recommend using PROC SQL, unless you have a reason not to.

If you really want to aggregate in data step, you have to use something called BY groups processing: after ensuring the input data set is sorted by the BY vars, you can use something like the following:

data Test (keep = Month Account Sum_Actual Sum_Annual /*...your Run_Rate and Over_Budget_Alarm...*/);
    set Combined; /* the input table */
    by Month Account; /* must be sorted by these */

    retain Sum_Actual Sum_Annual; /* don't clobber for each input row */
    if first.account then do; /* instead do it manually for each group */
        Sum_Actual = 0;
        Sum_Annual = 0;
    end;

    /* accumulate the values from each row */
    Sum_Actual = sum(Sum_Actual, Actual);
    Sum_Annual = sum(Sum_Annual, Annual_Budget);
    /* Note that Sum_Actual = Sum_Actual+Actual; will not work if any of the input values is 'missing'. */

    if last.account then do;
        /* The group has been processed.
           Do any additional processing for the group as a whole, e.g.
           calculate Over_Budget_Alarm. */
        output; /* write one output row per group */
    end;
run;

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  盖世英雄少女心        
                
              
                            
                2021-01-27 12:12
              
            
            
                                                                       
Proc SQL can be very effective for understanding aggregate data examination.  With out seeing what the macros do, I would say perform the run rate checks after outputting data set test.

You don't show rows for other months, but I must presume the annual_budget values are constant across all months -- if so, I don't see a reason to ever sum annual_budget; comparing anything to sum(annual_budget) is probably at the incorrect time scale and not useful.

From the show data its hard to tell if you want to know any of these


which (or if some) months had a run_rate that exceeded the annual_budget
which (or if some) months run_rate exceeded the balance of annual_budget (i.e. the annual_budget less the prior months expenditure)


Presume each row in test is for a single year/month/costCenter/account -- if not the underlying data would have to be aggregated to that level.

Proc SQL;
  * retrieve presumed constant annual_budget values from data;
  * this information might (should) already exist in another table;

  * presume constant annual budget value at each cost center | account combination;
  * distinct because there are multiple months with the same info;

  create table annual_budgets as
  select distinct Cost_Center, Account, Annual_Budget
  from test;

  create table account_budgets as
  select account, sum(annual_budget) as annual_budget 
  from annual_budgets
  group by account;

  * flag for some run rate condition;

  create table annual_budget_mon_runrate_check as
  select 
    2019 as year,
    account,
    sum(actual) as yr_actual,  /* across all month/cost center */
    min (
      select annual_budget from account_budgets as inner
      where inner.account = outer.account
    ) as account_budget,

    max (
      case when actual * 12 > annual_budget then 1 else 0 end
    ) as
      excessive_runrate_flag label="At least one month had a cost center run rate that would exceed its annual_budget")
  from 
    test as outer
  group by
    year, account;


You can add a where clause to restrict the accounts processed.

Changing the max to sum in the flag computation would return the number of cost center months with excessive run rates.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  情深已故        
                
              
                            
                2021-01-27 12:16
              
            
            
                                                                       

Use PROC MEANS to summarize the data
Use a data step and IF/THEN statement to create your flags.

proc means data=have N SUM NWAY STACKODS;
   class account;
   var amount annual_budget;
   ods output summary = summary_stats1;
   output out = summary_stats2 N = SUM= / AUTONAME;
run;

data want;
  set summary_stats;
  if sum_actual > sum_annual_budget then flag=1; 
  else flag=0;
run;


                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复