SAS - Dynamically create column names using the values from another column

半世苍凉 提交于 2019-12-23 16:01:29

问题


I Have a column with many flags that were parsed from a XML parser. Data looks like this:

USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;

I have to create a table with all these column names to capture the flags. Like:

USERKEYED VALMATCH DEVICEVERIFIED EXCEPTION USERREGISTRD ASSOCIATE EXTERNAL GROSSGIVEN UMAPPED 
Y  N N N N Y N Y N 
Y  N N N N Y Y Y N 
Y  N N Y N Y N Y N 

How can I capture values dynamically in SAS? Either in a DATA step or a PROC step?

Thanks in advance.


回答1:


Let's start with your example output data.

data expect ;
  id+1;
  length USERKEYED VALMATCH DEVICEVERIFIED EXCEPTION 
         USERREGISTRD ASSOCIATE EXTERNAL GROSSGIVEN UMAPPED $1 ;
  input USERKEYED -- UMAPPED;
cards4;
Y  N N N N Y N Y N 
Y  N N N N Y Y Y N 
Y  N N Y N Y N Y N 
;;;;

Now we can recreate your example input data:

data have ;
  do until (last.id);
    set expect ;
    by id ;
    array flag _character_;
    length string $200 ;
    do _n_=1 to dim(flag);
      string=catx(';',string,catx('=',vname(flag(_n_)),flag(_n_)));
    end;
  end;
  keep id string;
run;

Which will look like this:

USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N 
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=Y;GROSSGIVEN=Y;UMAPPED=N 
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=Y;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N 

So to process this we need to parse out the pairs from the variable STRING into multiple observations with the individual pairs' values split into NAME and VALUE variables.

data middle ;
  set have ;
  do _n_=1 by 1 while(_n_=1 or scan(string,_n_,';')^=' ');
    length name $32 ;
    name = scan(scan(string,_n_,';'),1,'=');
    value = scan(scan(string,_n_,';'),2,'=');
    output;
  end;
  keep id name value ;
run;

Then we can use PROC TRANSPOSE to convert those observations into variables.

proc transpose data=middle out=want (drop=_name_) ;
  by id;
  id name ;
  var value ;
run;



回答2:


The data that you have is a series of name/value pairs, using a ; as a delimiter. We can extract each name/value pair one at a time, and then parse those into values:

data tmp;
  length my_string next_pair name value $200;
  my_string = "USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;";
  cnt = 1;
  next_pair = scan(my_string,cnt,";");
  do while (next_pair ne "");
    name = scan(next_pair,1,"=");
    value = scan(next_pair,2,"=");
    output;
    cnt = cnt + 1;
    next_pair = scan(my_string,cnt,";");
  end;
  keep name value;
run;

Gives us:

name                value
=================== =====
USERKEYED           Y
VALMATCH            N
DEVICEVERIFIED      N
EXCEPTION           N
USERREGISTRD        N
ASSOCIATE           Y
EXTERNAL            N
GROSSGIVEN          Y
UMAPPED             N

We can then transpose the data so that the name is used for the column names:

proc transpose data=tmp out=want(drop=_name_);
  id name;
  var value;
run;

Which gives you the desired table.




回答3:


DATA <MY_DATASET>;
SET INPUT_DATASET;
USERKEYED = substr(input_column, find(input_column, 'USERKEYED=')+10,1);
VALMATCH = substr(input_column, find(input_column, 'VALMATCH=')+9,1);
DEVICEVERIFIED = substr(input_column, find(input_column, 'DEVICEVERIFIED=')+15,1);
EXCEPTION = substr(input_column, find(input_column, 'EXCEPTION=')+10,1);
USERREGISTRD = substr(input_column, find(input_column, 'USERREGISTRD=')+13,1);
ASSOCIATE = substr(input_column, find(input_column, 'ASSOCIATE=')+10,1); EXTERNAL = substr(input_column, find(input_column, 'EXTERNAL=')+9,1);
GROSSGIVEN = substr(input_column, find(input_column, 'GROSSGIVEN=')+11,1);
UMAPPED = substr(input_column, find(input_column, UMAPPED=')+8,1);
run; 



回答4:


My answer is essentially in the first block of code, the rest is just explanation, one alternative and a nice tip.

Based on the answer you gave, the input data is already in a SAS data set, so that can be read to create a file of SAS code which can then be run using %include and so proc transpose is not required:

filename tempcode '<path><file-name.txt>'; /* set this up yourself */

/* write out SAS code to the fileref tempcode */
data _null_;
  file tempcode;
  set have;
  if _n_=1 then
    put 'Y="Y"; N="N"; drop Y N;';
  put input_column;
  put 'output;';
run;

/* %include the code to create the desired output */
data want;
  %include tempcode;
run;

As the input data already almost looks like SAS assignment statements, we have taken advantage of that and so the SAS code that has been run from fileref tempcode using %include should look like:

Y="Y"; N="N"; drop Y N;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;
output;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=N;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=Y;GROSSGIVEN=Y;UMAPPED=N;
output;
USERKEYED=Y;VALMATCH=N;DEVICEVERIFIED=N;EXCEPTION=Y;USERREGISTRD=N;ASSOCIATE=Y;EXTERNAL=N;GROSSGIVEN=Y;UMAPPED=N;
output;

As an alternative, fileref tempcode could contain all of the code for data step "data want;":

/* write out entire SAS data step code to the fileref tempcode */
data _null_;
  file tempcode;
  set have end=lastrec;
  if _n_=1 then
    put 'data want;'
       /'Y="Y"; N="N"; drop Y N;';

  put input_column;
  put 'output;';

  if lastrec then
    put 'run;';
run;

%include tempcode; /* no need for surrounding SAS code */

As a tip, to see the code being processed by %include in the log you can use the following variation:

%include tempcode / source2;

Hope this all helps (and works!)

Regards, Amir.



来源:https://stackoverflow.com/questions/33698919/sas-dynamically-create-column-names-using-the-values-from-another-column

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!