Question
I am using a macro to loop through files by name and extract data. This works fine in the majority of cases, but from time to time I get:
ERROR: BY variables are not properly sorted on data set CQ.CQM_20141113.
where CQM_20141113 is the file I am extracting data from. In fact my macro loops through CQ.CQM_2014:
and it works up until 20141113. Because of this single failure the file is then not created.
I use a data step view to "initialize" the data and then read that view in a further data step (code sample with shortened where conditions):
%let taq_ds = CQ.CQM_2014:;
data _v_&tables / view=_v_&tables;
set &taq_ds;
by sym_root date time_m; /* added BY statement */
format sym_root date time_m;
where sym_root = &stock;
run;
data xtemp2_&stockfiname (keep = sym_root year date iprice);
retain sym_root year date iprice;
set _v_&tables;
by sym_root date time_m;
/* some conditions */
run;
When I see the error in the log and rerun the job, it works (sometimes it takes a few attempts).
I was thinking of using proc sort, but how would I do that with a data step view? Please note that the CQM files are very large (which could also be the root of the problem).
Edit: taq_ds is not a single file but runs through several files whose names start with CQM_2014, i.e. CQM_20140101, CQM_20140102, etc.
Answer 1:
Based on the code provided, you could replace your first data step view with a SQL view:
proc sql;
create view _v_&tables as
select * from &taq_ds
where sym_root = &stock
order by sym_root, date, time_m;
quit;
Alternatively, you could prefix your data step view with a similar view. This would enforce the ordering needed for the subsequent by statement.
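One possible reading of the "prefix" suggestion, as a rough sketch rather than a tested solution: to my knowledge PROC SQL does not accept name-prefix lists such as CQ.CQM_2014: in a FROM clause (they are a SET statement feature), so a data step view can do the concatenation without any BY statement, and a SQL view layered on top can supply the ordering. The view names _v_raw and _v_sorted are hypothetical; &taq_ds, &stock and &stockfiname are the macro variables from the question, and &stock is assumed to resolve to a quoted string.

```sas
/* Step 1: concatenate the daily files. There is no BY statement here, */
/* so the individual files do not have to be sorted.                   */
data _v_raw / view=_v_raw;
    set &taq_ds;
    where sym_root = &stock;
run;

/* Step 2: SQL view that orders the rows each time it is read. */
proc sql;
    create view _v_sorted as
        select *
        from _v_raw
        order by sym_root, date, time_m;
quit;

/* Step 3: BY-group processing on the ordered view. */
data xtemp2_&stockfiname (keep = sym_root year date iprice);
    retain sym_root year date iprice;
    set _v_sorted;
    by sym_root date time_m;
    /* some conditions */
run;
```

The sort runs every time _v_sorted is read, so with very large files it helps that the where condition in step 1 reduces the rows before they are ordered.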
Answer 2:
Creating an index on taq_ds corresponding to the by group order would also solve this, e.g.:
proc datasets lib=<library containing taq_ds>;
modify taq_ds;
index create index1=(sym_root date time_m);
run;
quit;
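Because taq_ds in the question expands to many daily files, each file would need its own index. A minimal sketch of one way to loop over them follows, assuming you have write access to the library and that no index named bykey exists yet; the macro name index_all, its parameters and the index name bykey are hypothetical.

```sas
/* Hypothetical helper: create the same composite index on every */
/* data set in &lib whose name starts with &prefix.              */
%macro index_all(lib=CQ, prefix=CQM_2014);
    %local i n;

    /* Collect the matching member names into macro variables ds1, ds2, ... */
    proc sql noprint;
        select memname into :ds1-
        from dictionary.tables
        where libname = upcase("&lib")
          and memtype = 'DATA'
          and memname like cats(upcase("&prefix"), '%');
    quit;
    %let n = &sqlobs;   /* number of data sets found */

    /* One MODIFY run group per data set */
    proc datasets lib=&lib nolist;
        %do i = 1 %to &n;
            modify &&ds&i;
            index create bykey = (sym_root date time_m);
            run;
        %end;
    quit;
%mend index_all;

%index_all(lib=CQ, prefix=CQM_2014)
```

Building the indexes is a one-off cost per file; afterwards a BY statement on sym_root date time_m can use the index instead of requiring the file to be physically sorted.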
Source: https://stackoverflow.com/questions/50089963/sas-data-step-view-error-by-variable-not-sorted-properly