Dropping a table in SAS

夙愿已清 提交于 2019-12-04 01:13:07

If it is reasonable to outsource to the OS, that might be fastest. Otherwise, my unscientific observations seem to suggest that drop table in proc sql is fastest. This surprised me as I expected proc datasets to be fastest.

In the code below, I create 4000 dummy data sets then try deleting them all with different methods. The first is with sql and on my system took about 11 seconds to delete the files.

The next two both use proc datasets. The first creates a delete statement for each data set and then deletes. The second just issues a blanket kill command to delete everything in the work directory. (I had expected this technique to be the fastest). Both proc datasets routines reported about 20 seconds to delete all 4000 files.

%macro create;
proc printto log='null';run;
%do i=1 %to 4000;
data temp&i;
x=1;
y="dummy";
output;run;
%end;
proc printto;run;
%mend;

%macro delsql;
proc sql;
%do i=1 %to 4000;
drop table temp&i;
%end;
quit;
%mend;

%macro deldata1;
proc datasets library=work nolist;
   %do i=1 %to 4000;
   delete temp&i.;
   %end;
run;quit;
%mend;

%macro deldata2;
proc datasets library=work kill;
run;quit;
%mend;

option fullstimer;
%create;
%delsql;

%create;
%deldata1;

%create;
%deldata2;

I tried to fiddle with the OS-delete approach.

Deleting with the X-command can not be recommended. It took forever!

I then tried with the system command in a datastep:

%macro delos;
data _null_;
do i=1 to 9;
delcmd="rm -f "!!trim(left(pathname("WORK","L")))!!"/temp"!!trim(left(put(i,4.)))!!"*.sas7*";
rc=system(delcmd);
end;
run;
%mend;

As you can see, I had to split my deletes into 9 separate delete commands. The reason is, I'm using wildcards, "*", and the underlying operating system (AIX) expands these to a list, which then becomes too large for it to handle...

The program basically constructs a delete command for each of the nine filegroups "temp[1-9]*.sas7*" and issues the command.

Using the create macro function from cmjohns answer to create 4000 data tables, I can delete those in only 5 seconds using this approach.

So a direct operating system delete is the fastest way to mass-delete, as I expected.

We are discussing tables or datasets?

Tables implies database tables. To get rid of these in a fast way, using proc SQL pass-through facility would be the fastest. Specifically if you can connect to the database once and drop all of the tables, then disconnect.

If we are discussing datasets in SAS, I would argue that both proc sql and proc datasets are extremely similar. From an application standpoint, they both go through the same deduction to create a system command that deletes a file. All testing I have seen from SAS users groups or presentations have always suggested that the use of one method over the other is marginal and based on many variables.

If it is imperative that you have the absolute fastest way to drop the datasets / tables, you may just have to test it. Each install and setup of SAS is different enough to warrant testing.

In terms of which is faster, excluding extremely large data, I would wager that there is little difference between them. When handling permanent SAS datasets, however, I like to use PROC DATASETS rather than PROC SQL, simply because I feel better manipulating permanent datasets using the SAS-designed method, and not the SQL implementation

Simple Solution for temporary tables that are named similarly:

If all of your tables start with the same prefix, for example p1_table1 and p1_table2, then the following code will delete any table with that starts with p1

proc datasets; delete p1: ; run;

proc delete is another, albeit undocumented, solution..

http://www.sascommunity.org/wiki/PROC_Delete

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!