问题
I want to use proc compare
to update dataset on a daily basis.
work.HAVE1
Date Key Var1 Var2
01Aug2013 K1 a 2
01Aug2013 K2 a 3
02Aug2013 K1 b 4
work.HAVE2
Date Key Var1 Var2
01Aug2013 K1 a 3
01Aug2013 K2 a 3
02Aug2013 K1 b 4
03Aug2013 K2 c 1
Date
and Key
are uniquely determine one record.
How can I use the above two tables to construct the following
work.WANT
Date Key Var1 Var2
01Aug2013 K1 a 3
01Aug2013 K2 a 3
02Aug2013 K1 b 4
03Aug2013 K2 c 1
I don't want to delete the previous data and then rebuild it. I want to modify
it via append new records at the bottom and adjust the values in VAR1
or VAR2
.
I'm struggling with proc compare
but it just doesn't return what I want.
回答1:
proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.DIFF outnoequal outcomp;
id Date Key;
run;
This will give you new and changed (unequal records) in single dataset WORK.DIFF. You'll have to distinguish new vs changed yourself.
However, what you want to achieve is actually a MERGE
- inserts new, overwrites existing, though maybe due to performance reasons etc. you don't want to re-create the full table.
data work.WANT;
merge work.HAVE1 work.HAVE2;
by Date Key;
run;
Edit1:
/* outdiff option will produce records with _type_ = 'DIF' for matched keys */
proc compare base=work.HAVE1 compare=work.HAVE2 out=WORK.RESULT outnoequal outcomp outdiff;
id Date Key;
run;
data WORK.DIFF_KEYS; /* keys of changed records */
set WORK.RESULT;
where _type_ = 'DIF';
keep Date Key;
run;
/* split NEW and CHANGED */
data
WORK.NEW
WORK.CHANGED
;
merge
WORK.RESULT (where=( _type_ ne 'DIF'));
WORK.DIFF_KEYS (in = d)
;
by Date Key;
if d then output WORK.CHANGED;
else output WORK.NEW;
run;
Edit2:
Now you can just APPEND
the WORK.NEW to target table.
For WORK.CHANGED - either use MODIFY
or UPDATE
statement to update the records.
Depending on the size of the changes, you can also think about PROC SQL; DELETE
to delete old records and PROC APPEND
to add new values.
回答2:
All a PROC COMPARE will do will tell you the differences between 2 datasets. To achieve your goal you need to use an UPDATE statement in a data step. This way, values in HAVE1 are updated with HAVE2 where the date and key match, or a new record inserted if there are no matches.
data have1;
input Date :date9. Key $ Var1 $ Var2;
format date date9.;
datalines;
01Aug2013 K1 a 2
01Aug2013 K2 a 3
02Aug2013 K1 b 4
;
run;
data have2;
input Date :date9. Key $ Var1 $ Var2;
format date date9.;
datalines;
01Aug2013 K1 a 3
01Aug2013 K2 a 3
02Aug2013 K1 b 4
03Aug2013 K2 c 1
;
run;
data want;
update have1 have2;
by date key;
run;
来源:https://stackoverflow.com/questions/26420777/how-to-use-proc-compare-to-update-dataset