问题
I Have a column with many flags that were parsed from a XML parser. Data looks like this:
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;
I have to create a table with all these column names to capture the flags. Like:
USERKEYED VALMATCH DEVICEVERIFIED EXCEPTION USERREGISTRD ASSOCIATE EXTERNAL GROSSGIVEN UMAPPED
Y N N N N Y N Y N
Y N N N N Y Y Y N
Y N N Y N Y N Y N
How can I capture values dynamically in SAS? Either in a DATA step or a PROC step?
Thanks in advance.
回答1:
Let's start with your example output data.
data expect ;
id+1;
length USERKEYED VALMATCH DEVICEVERIFIED EXCEPTION
USERREGISTRD ASSOCIATE EXTERNAL GROSSGIVEN UMAPPED $1 ;
input USERKEYED -- UMAPPED;
cards4;
Y N N N N Y N Y N
Y N N N N Y Y Y N
Y N N Y N Y N Y N
;;;;
Now we can recreate your example input data:
data have ;
do until (last.id);
set expect ;
by id ;
array flag _character_;
length string $200 ;
do _n_=1 to dim(flag);
string=catx(';',string,catx('=',vname(flag(_n_)),flag(_n_)));
end;
end;
keep id string;
run;
Which will look like this:
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=Y;GROSSGIVEN=Y;UMAPPED=N
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=Y;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N
So to process this we need to parse out the pairs from the variable STRING
into multiple observations with the individual pairs' values split into NAME
and VALUE
variables.
data middle ;
set have ;
do _n_=1 by 1 while(_n_=1 or scan(string,_n_,';')^=' ');
length name $32 ;
name = scan(scan(string,_n_,';'),1,'=');
value = scan(scan(string,_n_,';'),2,'=');
output;
end;
keep id name value ;
run;
Then we can use PROC TRANSPOSE
to convert those observations into variables.
proc transpose data=middle out=want (drop=_name_) ;
by id;
id name ;
var value ;
run;
回答2:
The data that you have is a series of name/value pairs, using a ;
as a delimiter. We can extract each name/value pair one at a time, and then parse those into values:
data tmp;
length my_string next_pair name value $200;
my_string = "USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;";
cnt = 1;
next_pair = scan(my_string,cnt,";");
do while (next_pair ne "");
name = scan(next_pair,1,"=");
value = scan(next_pair,2,"=");
output;
cnt = cnt + 1;
next_pair = scan(my_string,cnt,";");
end;
keep name value;
run;
Gives us:
name value
=================== =====
USERKEYED Y
VALMATCH N
DEVICEVERIFIED N
EXCEPTION N
USERREGISTRD N
ASSOCIATE Y
EXTERNAL N
GROSSGIVEN Y
UMAPPED N
We can then transpose the data so that the name is used for the column names:
proc transpose data=tmp out=want(drop=_name_);
id name;
var value;
run;
Which gives you the desired table.
回答3:
DATA <MY_DATASET>;
SET INPUT_DATASET;
USERKEYED = substr(input_column, find(input_column, 'USERKEYED=')+10,1);
VALMATCH = substr(input_column, find(input_column, 'VALMATCH=')+9,1);
DEVICEVERIFIED = substr(input_column, find(input_column, 'DEVICEVERIFIED=')+15,1);
EXCEPTION = substr(input_column, find(input_column, 'EXCEPTION=')+10,1);
USERREGISTRD = substr(input_column, find(input_column, 'USERREGISTRD=')+13,1);
ASSOCIATE = substr(input_column, find(input_column, 'ASSOCIATE=')+10,1); EXTERNAL = substr(input_column, find(input_column, 'EXTERNAL=')+9,1);
GROSSGIVEN = substr(input_column, find(input_column, 'GROSSGIVEN=')+11,1);
UMAPPED = substr(input_column, find(input_column, UMAPPED=')+8,1);
run;
回答4:
My answer is essentially in the first block of code, the rest is just explanation, one alternative and a nice tip.
Based on the answer you gave, the input data is already in a SAS data set, so that can be read to create a file of SAS code which can then be run using %include
and so proc transpose
is not required:
filename tempcode '<path><file-name.txt>'; /* set this up yourself */
/* write out SAS code to the fileref tempcode */
data _null_;
file tempcode;
set have;
if _n_=1 then
put 'Y="Y"; N="N"; drop Y N;';
put input_column;
put 'output;';
run;
/* %include the code to create the desired output */
data want;
%include tempcode;
run;
As the input data already almost looks like SAS assignment statements, we have taken advantage of that and so the SAS code that has been run from fileref tempcode using %include
should look like:
Y="Y"; N="N"; drop Y N;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;
output;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=Y;GROSSGIVEN=Y;UMAPPED=N;
output;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=Y;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;
output;
As an alternative, fileref tempcode could contain all of the code for data step "data want;
":
/* write out entire SAS data step code to the fileref tempcode */
data _null_;
file tempcode;
set have end=lastrec;
if _n_=1 then
put 'data want;'
/'Y="Y"; N="N"; drop Y N;';
put input_column;
put 'output;';
if lastrec then
put 'run;';
run;
%include tempcode; /* no need for surrounding SAS code */
As a tip, to see the code being processed by %include
in the log you can use the following variation:
%include tempcode / source2;
Hope this all helps (and works!)
Regards, Amir.
来源:https://stackoverflow.com/questions/33698919/sas-dynamically-create-column-names-using-the-values-from-another-column