How to use proc compare to update dataset

Submitted by 只愿长相守 on 2020-01-04 06:03:45

Question


I want to use PROC COMPARE to update a dataset on a daily basis.

work.HAVE1

Date        Key Var1 Var2 
01Aug2013   K1   a    2
01Aug2013   K2   a    3
02Aug2013   K1   b    4

work.HAVE2

Date        Key Var1 Var2 
01Aug2013   K1   a    3
01Aug2013   K2   a    3
02Aug2013   K1   b    4
03Aug2013   K2   c    1

Date and Key together uniquely identify a record. How can I use the two tables above to construct the following?

work.WANT

Date        Key Var1 Var2 
01Aug2013   K1   a    3
01Aug2013   K2   a    3
02Aug2013   K1   b    4
03Aug2013   K2   c    1

I don't want to delete the previous data and then rebuild it. I want to modify it by appending new records at the bottom and adjusting the values of Var1 or Var2 where they have changed. I've been struggling with PROC COMPARE, but it doesn't return what I want.


Answer 1:


proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.DIFF outnoequal outcomp;
id Date Key;
run;

This gives you the new and changed (unequal) records in a single dataset, WORK.DIFF. You'll have to distinguish new records from changed ones yourself.

However, what you want to achieve is really a MERGE, which inserts new records and overwrites existing ones, although you may not want to re-create the full table (for performance reasons, for example). Both datasets must be sorted by Date and Key for the BY statement to work.

data work.WANT;
    merge work.HAVE1 work.HAVE2;
    by Date Key;
run;

Edit1:

/* outdiff option will produce records with _type_ = 'DIF' for matched keys */
proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.RESULT outnoequal outcomp outdiff;
id Date Key;
run;


data WORK.DIFF_KEYS;  /* keys of changed records */
    set WORK.RESULT;
    where _type_ = 'DIF';
    keep Date Key;
run;

/* split NEW and CHANGED */
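data
    WORK.NEW
    WORK.CHANGED
;
    /* no semicolon after WORK.RESULT: both datasets belong to one MERGE statement */
    merge
        WORK.RESULT (where=( _type_ ne 'DIF'))
        WORK.DIFF_KEYS (in = d)
    ;
    by Date Key;
    /* a key that also appears in DIFF_KEYS is a changed record; anything else is new */
    if d then output WORK.CHANGED;
    else output WORK.NEW;
run;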

Edit2:

Now you can simply APPEND WORK.NEW to the target table.

For WORK.CHANGED, use either a MODIFY or an UPDATE statement to update the records. Depending on the size of the changes, you could also use a PROC SQL DELETE to remove the old records and PROC APPEND to add the new values.
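
As a rough sketch of those two steps, assuming work.HAVE1 is the target table and that WORK.NEW and WORK.CHANGED were produced by the split above: the keep= list is an assumption based on the columns shown in the question, and it is there to leave out the extra columns that PROC COMPARE adds to its OUT= dataset.

/* Append brand-new keys to the bottom of the target table */
proc append base=work.HAVE1 data=work.NEW (keep=Date Key Var1 Var2);
run;

/* Overwrite changed rows in place, matching on the BY variables */
data work.HAVE1;
    modify work.HAVE1 work.CHANGED (keep=Date Key Var1 Var2);
    by Date Key;
    if _iorc_ = 0 then replace;      /* matching row found: rewrite it */
    else do;                         /* no match: report it and reset the error flag */
        put 'WARNING: no match in HAVE1 for ' Date= Key=;
        _error_ = 0;
    end;
run;

MODIFY with a BY statement looks up each key in the master table, so the target does not need to be re-sorted after the append. Note that by default a missing value in WORK.CHANGED does not overwrite an existing value in the target.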




Answer 2:


All PROC COMPARE does is tell you the differences between two datasets. To achieve your goal you need to use an UPDATE statement in a data step. That way, values in HAVE1 are updated with those from HAVE2 where the date and key match, and a new record is inserted where there is no match.

data have1;
input Date :date9. Key $ Var1 $ Var2;
format date date9.;
datalines;
01Aug2013   K1   a    2
01Aug2013   K2   a    3
02Aug2013   K1   b    4
;
run;

data have2;
input Date :date9. Key $ Var1 $ Var2;
format date date9.;
datalines;
01Aug2013   K1   a    3
01Aug2013   K2   a    3
02Aug2013   K1   b    4
03Aug2013   K2   c    1
;
run;

data want;
update have1 have2;
by date key;
run;
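
The UPDATE step above writes a new dataset, WANT. If HAVE1 itself should be changed in place, as the question asks, one possible variant is a MODIFY step that checks the automatic variable _IORC_. This is only a sketch under the assumption that HAVE1 is the table to maintain; it is not part of the original answer.

data have1;
    modify have1 have2;
    by date key;
    select (_iorc_);
        when (%sysrc(_sok)) replace;        /* key exists in HAVE1: overwrite the row */
        when (%sysrc(_dsenmr)) do;          /* key not in HAVE1: append it as a new row */
            _error_ = 0;                    /* clear the error flag set by the failed lookup */
            output;
        end;
        otherwise do;                       /* any other return code: stop and report */
            put 'ERROR: unexpected return code ' _iorc_=;
            stop;
        end;
    end;
run;

As with UPDATE, missing values in HAVE2 do not overwrite existing values in HAVE1 by default; both statements accept UPDATEMODE=NOMISSINGCHECK if that behaviour is not wanted.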


Source: https://stackoverflow.com/questions/26420777/how-to-use-proc-compare-to-update-dataset
