Compare and keep character observations that are the same

偶尔善良 submitted on 2019-12-12 00:07:19

Question


I have the following data set:

Date    ID    Company        
Jan05   1     Coca-Cola      
Jan05   2     Coca-Cola      
Jan05   3     Coca-Cola          
Jan05   4     Apple          
Jan05   5     Apple          
Jan05   6     Apple
Jan05   7     Microsoft     
Feb05   1     McDonald       
Feb05   2     McDonald       
Feb05   3     McDonald
Feb05   4     McDonald       
Feb05   5     McDonald       
Feb05   6     Microsoft        
 .
 .
 .
Jan06   1     Apple      
Jan06   2     Apple     
Jan06   3     Apple          
Jan06   4     Apple          
Jan06   5     Apple          
Jan06   6     Apple
Jan06   7     Apple     
Feb06   1     McDonald       
Feb06   2     McDonald       
Feb06   3     McDonald
Feb06   4     McDonald       
Feb06   5     McDonald       
Feb06   6     Lenova  
Feb06   7     Lenova       
 .
 .
Jan07   1     Apple      
Jan07   2     Apple     
Jan07   3     Apple          
Jan07   4     Microsoft          
Jan07   5     Lenovo          
Jan07   6     Apple
Jan07   7     Apple     
Feb07   1     TJmax       
Feb07   2     TJMax       
Feb07   3     TJMax
Feb07   4     TJMax       
Feb07   5     TJMax       
Feb07   6     TJMax  
Feb07   7     TJMax          
.
.
.
.
until July15

What I want to do is the following:

1. Compare January 05 with January 06, then January 06 with January 07, then February 05 with February 06, February 06 with February 07, and so on for each month, and compute a median of ID for the companies that are present on both dates.
2. I don't want a new dataset each time I compute a median of ID. I merely want to make sure that the same companies are present on both dates, say Jan05 and Jan06, and then compute the median of ID.

What's the best way to do this in SAS?

My end result will look like this:

Date    Median_ID    
Jan05      2         
Jan06      4

Jan06      4     
Jan07      3

Feb05      3     
Feb06      3

Feb06      0
Feb07      0

As you can see from the result, the only company that matches between Jan05 and Jan06 is Apple. Between Jan06 and Jan07, the only company that matches is again Apple. So we take the median of ID over the rows where the companies match.


Answer 1:


It isn't clear how you've calculated the end results from your sample data - it would be easier to follow your explanation if you included all the intermediate steps for one month, e.g. Jan05. However, this seems like something that you could approach with some SQL similar to the following:

data have;
input Date monyy5. ID Company $32.;
format Date monyy5.;
cards;
Jan05   1     Coca-Cola      
Jan05   2     Coca-Cola      
Jan05   3     Coca-Cola          
Jan05   4     Apple          
Jan05   5     Apple          
Jan05   6     Apple
Jan05   7     Microsoft     
Feb05   1     McDonald       
Feb05   2     McDonald       
Feb05   3     McDonald
Feb05   4     McDonald       
Feb05   5     McDonald       
Feb05   6     Microsoft        
Jan06   1     Apple      
Jan06   2     Apple     
Jan06   3     Apple          
Jan06   4     Apple          
Jan06   5     Apple          
Jan06   6     Apple
Jan06   7     Apple     
Feb06   1     McDonald       
Feb06   2     McDonald       
Feb06   3     McDonald
Feb06   4     McDonald       
Feb06   5     McDonald       
Feb06   6     Lenova  
Feb06   7     Lenova       
Jan07   1     Apple      
Jan07   2     Apple     
Jan07   3     Apple          
Jan07   4     Microsoft          
Jan07   5     Lenovo          
Jan07   6     Apple
Jan07   7     Apple     
Feb07   1     TJmax       
Feb07   2     TJMax       
Feb07   3     TJMax
Feb07   4     TJMax       
Feb07   5     TJMax       
Feb07   6     TJMax  
Feb07   7     TJMax   
;
run;

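proc sql;
    /* Self-join: pair each row with a row from the same month one year
       later that has the same ID and the same company */
    create table want as
        select a.date, median(a.ID) as Median_ID
        from have a inner join have b
            on  month(a.date) = month(b.date)
            and year(a.date)  = year(b.date) - 1
            and a.ID          = b.ID
            and a.company     = b.company
        group by a.date
        ;
quit;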
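
The join above only pairs rows that share both ID and company. If, as the wording of the question suggests, a row should count whenever its company appears anywhere in the same month of the following year, one possible variation is sketched below. This is an assumption about the intended logic, not something the asker confirmed, and it is not claimed to reproduce the Median_ID values shown in the question. The names matched and want_by_company are hypothetical. PROC MEANS computes the median here so that the sketch does not depend on how PROC SQL in a given SAS release treats MEDIAN.

```sas
/* Keep the rows of each month whose company also appears in the same
   month one year later. IDs are deliberately not compared. */
proc sql;
    create table matched as
        select a.date, a.ID, a.company
        from have a
        where exists (
            select * from have b
            where month(b.date) = month(a.date)
              and year(b.date)  = year(a.date) + 1
              and b.company     = a.company
        );
quit;

/* One median of ID per date, computed over the matched rows only */
proc means data=matched nway noprint;
    class date;
    var ID;
    output out=want_by_company (drop=_type_ _freq_) median=Median_ID;
run;
```

The comparison on company is case-sensitive, so values such as TJmax and TJMax in the sample data are treated as different companies. Wrapping both sides in upcase() would be one way to ignore case if that is what is wanted.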


Source: https://stackoverflow.com/questions/31661121/compare-and-keep-character-observations-that-are-the-same
