Question
This question is related to: prior question link
I have a JSON file that looks like:
[
{
"rxnorm_id": "999999999",
"drug_name": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
"plans": [
{
"plan_id_type": "xxxxxxxxxxxxx",
"plan_id": "999999999999999",
"drug_tier": "xxxxxxxxxxxxxxx",
"prior_authorization": false,
"step_therapy": false,
"quantity_limit": false
},
I am able to import every line that has rxnorm_id and drug_name into SAS using this code:
filename data url 'http://stg-oh-medicaid.molinahealthcare.com/JSON/Drugs_Molina_Healthcare.json';
data formularies;
infile data lrecl = 32000 truncover scanover;
input @'"rxnorm_id": "' rxnorm_id $255.
@'"drug_name": "' drug_name $255.
@'"plan_id_type": "' plan_id_type $255.
@'"plan_id": "' plan_id $255.
@'"drug_tier": "' drug_tier $255.
@'"prior_authorization": ' prior_authorization $255.
@'"step_therapy": ' step_therapy $255.
@'"quantity_limit": ' quantity_limit $255.;
rxnorm_id = scan(rxnorm_id,1,'",');
drug_name = scan(drug_name,1,'",');
plan_id_type = scan(plan_id_type,1,'",');
plan_id = scan(plan_id,1,'",');
drug_tier = scan(drug_tier,1,'",');
prior_authorization = scan(prior_authorization,1,'",');
step_therapy = scan(step_therapy,1,'",');
quantity_limit = scan(quantity_limit,1,'",');
run;
But I want to pick up all of the values in the 'plans' nest that sit between successive rxnorm_id and drug_name values. Someone suggested using the OUTPUT statement in SAS to see the missing rows. Does anyone have a good fix for my code to do this?
Thanks
Answer 1:
As of 9.4, the best way to parse JSON in SAS is using PROC GROOVY. That is what I recommend. You can also do it with DS2. If you are adventurous, and on 9.4m3, you can also use PROC LUA. That is what I would try, since it allows you to manipulate SAS datasets easily.
That being said, if you can rely on the simple structure of your example (one field per line), then you can select only the lines that contain fields and output them as field/value pairs using regular expressions in a DATA step:
data want;
infile 'c:/tmp/json_snippet.txt';
length field $20 data $100;
keep field data;
retain re;
input;
if _n_ = 1 then do;
re = prxparse('/"(.*?)": "?(true|false|.*?(?="))/');
end;
if prxmatch(re,_infile_); /* grep only matching lines */
call prxposn(re,1,start,len);
field = substr(_infile_,start,len);
call prxposn(re,2,start,len);
data = substr(_infile_,start,len);
run;
Caveat emptor: A wise person said that when you solve a problem using regular expressions, now you have two problems :). Among the things that can go wrong:
- line breaks
- using ' instead of " for string delimiters
- lengths
- mixed types
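To get one row per plan, as the question asks, the same line-by-line idea can be combined with RETAIN and an explicit OUTPUT. The following is only a minimal sketch, not a tested solution: it assumes one "key": value pair per line, that every plan object contains all six fields, and that "quantity_limit" is always the last field of a plan, as in the example. The regular expression and the dataset name `plans` are my own choices for illustration and are not taken from the code above.

```sas
/* Sketch: one output row per plan, with the drug-level fields carried down.  */
/* Assumptions: one "key": value pair per line; every plan has all six fields; */
/* "quantity_limit" is the last field of each plan object.                     */
data plans;
  infile 'c:/tmp/json_snippet.txt' truncover;
  length field $32 value $255
         rxnorm_id drug_name plan_id_type plan_id drug_tier $255
         prior_authorization step_therapy quantity_limit $5;
  /* retained values survive across input lines until they are overwritten */
  retain re rxnorm_id drug_name plan_id_type plan_id drug_tier
         prior_authorization step_therapy;
  keep rxnorm_id drug_name plan_id_type plan_id drug_tier
       prior_authorization step_therapy quantity_limit;

  /* group 1 = key, group 2 = value without quotes or trailing comma */
  if _n_ = 1 then re = prxparse('/"(\w+)":\s*"?(.*?)"?,?\s*$/');

  input;
  if prxmatch(re, _infile_);            /* skip lines without a key */
  field = prxposn(re, 1, _infile_);
  value = strip(prxposn(re, 2, _infile_));

  select (field);
    when ('rxnorm_id')           rxnorm_id = value;
    when ('drug_name')           drug_name = value;
    when ('plan_id_type')        plan_id_type = value;
    when ('plan_id')             plan_id = value;
    when ('drug_tier')           drug_tier = value;
    when ('prior_authorization') prior_authorization = value;
    when ('step_therapy')        step_therapy = value;
    when ('quantity_limit') do;
      quantity_limit = value;
      output;                           /* last field of a plan: write the row */
    end;
    otherwise;                          /* e.g. the "plans": [ line */
  end;
run;
```

Because rxnorm_id and drug_name are retained, every plan row repeats the values of the drug it belongs to. If a plan can omit a field, the retained value from the previous plan would leak into that row, so the plan-level variables would need to be cleared at the start of each plan.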
Source: https://stackoverflow.com/questions/36444012/parse-json-object-in-sas-macro-part-2-using-output-function-to-handle-nested