How to detect how many observations in a dataset (or if it is empty), in SAS?

慢半拍i 2020-11-29 04:47

I wonder if there is a way of detecting whether a data set is empty, i.e. it has no observations. In other words, how do I get the number of observations in a specific d

7 answers
  • 2020-11-29 05:10

    I guess I am trying to reinvent the wheel here with so many answers already. But I do see some other methods trying to count from the actual dataset - this might take a long time for huge datasets. Here is a more efficient method:

    proc sql;
    /* libname and memname are stored in uppercase, so the literals must be uppercase too */
    select nlobs from sashelp.vtable where libname = "LIBRARY" and memname = "DATASET";
    quit;
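
If the count is needed later in the program rather than in printed output, the same metadata lookup can be stored in a macro variable. This is only a sketch: the macro variable name `n_obs` is an illustrative choice, `sashelp.class` stands in for the real dataset, and the `trimmed` option assumes SAS 9.3 or later.

```sas
/* Default to -1 so the macro variable exists even when no row matches. */
%let n_obs = -1;

proc sql noprint;
  /* dictionary.tables is the table behind the sashelp.vtable view */
  select nlobs into :n_obs trimmed
  from dictionary.tables
  where libname = "SASHELP" and memname = "CLASS";
quit;

%put NOTE: sashelp.class has &n_obs observation(s).;
```

A value of -1 afterwards means no matching table was found, and 0 means the table exists but is empty.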
    
  • 2020-11-29 05:14

    There are lots of different ways. I tend to use a macro function with open() and attrn(). Below is a simple example that works well most of the time. If you are dealing with data views, or with more complex situations such as a data set with records marked for deletion or an active where clause, then you might need more robust logic.

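    %macro nobs(ds);
        %local DSID NOBS RC;                     /* keep helper variables out of the global scope */
        %let DSID=%sysfunc(OPEN(&ds.,IN));       /* open the data set in input mode */
        %let NOBS=%sysfunc(ATTRN(&DSID,NOBS));   /* physical count, includes rows marked for deletion */
        %let RC=%sysfunc(CLOSE(&DSID));
        &NOBS
    %mend;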
    
    /* Here is an example */
    %put %nobs(sashelp.class);
    
  • 2020-11-29 05:24

    A slightly different approach:

    proc contents data=library.dataset out=nobs noprint;
    run;
    
    proc summary data=nobs nway;
    class nobs;
    var delobs;
    output out=nobs_summ sum=;
    run;
    

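    PROC CONTENTS writes one row per variable to the output dataset, each carrying the same nobs value, and PROC SUMMARY collapses those rows. The result is a dataset with one observation in which the variable nobs holds the number of observations in the original dataset, even if that number is 0.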

  • 2020-11-29 05:27

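    PROC SQL can be slow on large datasets. ATTRN is a good method, but the same result is available in a plain DATA step. The solution below returns the number of observations, even for billions of rows, without reading any of them, because the NOBS= value is assigned when the step is compiled:

    data DS1;
    if 0 then set DS nobs=i;  /* never executed; NOBS= is assigned at compile time */
    No_of_obs=i;
    output;                   /* writes one row even when DS is empty */
    stop;
    keep No_of_obs;
    run;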
    
  • 2020-11-29 05:30

    Here's the more complete example that @cmjohns was talking about. It will return 0 if it is empty, -1 if it is missing, and has options to handle deleted observations and where clauses (note that using a where clause can make the macro take a long time on very large datasets).

    Usage Notes:

    This macro returns the number of observations in a dataset. If the dataset does not exist, -1 is returned. I would not recommend using it with ODBC libnames; use it only against SAS tables.

    Parameters:

    • iDs - The libname.dataset that you want to check.
    • iWhereClause (Optional) - A where clause to apply
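    • iNobsType (Optional) - Either NOBS or NLOBSF (the default). See the SAS 9 documentation for descriptions.
    • iVerbose (Optional) - Set to 1 (the default) to write a warning to the log when the dataset cannot be opened, or 0 to suppress it.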

    Macro definition:

    %macro nobs(iDs=, iWhereClause=1, iNobsType=nlobsf, iVerbose=1);
      %local dsid nObs rc;
    
      %if "&iWhereClause" eq "1" %then %do;
        %let dsID = %sysfunc(open(&iDs));
      %end;
      %else %do;
        %let dsID = %sysfunc(open(&iDs(where=(&iWhereClause))));
      %end;
    
      %if &dsID %then %do;
        %let nObs = %sysfunc(attrn(&dsID,&iNobsType));
        %let rc   = %sysfunc(close(&dsID));
      %end;
      %else %do;
        %if &iVerbose %then %do;
          %put WARNING: MACRO.NOBS.SAS: %sysfunc(sysmsg());      
        %end;
        %let nObs  = -1;
      %end;
      &nObs
    %mend;
    

    Example Usage:

    %put %nobs(iDs=sashelp.class);
    %put %nobs(iDs=sashelp.class, iWhereClause=height gt 60);
    %put %nobs(iDs=this_dataset_doesnt_exist);
    

    Results

    19
    12
    -1
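
The three return values can drive a branch. Here is a rough sketch, assuming the keyword-parameter %nobs macro defined above has already been compiled; `check_ds` is a hypothetical helper name that is not part of the original macro.

```sas
/* Assumes the %nobs(iDs=, ...) macro from this answer is already compiled. */
%macro check_ds(ds);   /* hypothetical helper, for illustration only */
  %local n;
  %let n = %nobs(iDs=&ds, iVerbose=0);   /* suppress the built-in warning */
  %if &n = -1 %then %put NOTE: &ds does not exist or could not be opened.;
  %else %if &n = 0 %then %put NOTE: &ds exists but is empty.;
  %else %put NOTE: &ds has &n observation(s).;
%mend check_ds;

%check_ds(sashelp.class);
%check_ds(this_dataset_doesnt_exist);
```

Wrapping the logic in a macro avoids relying on %if in open code, which older SAS releases do not support.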
    

    Installation

    I recommend setting up a SAS autocall library and placing this macro in your autocall location.

  • 2020-11-29 05:33

    The trick is producing an output even when the dataset is empty.

    data CountObs;
    
        i=1;
        set Dataset_to_Evaluate point=i nobs=j; * 'point' avoids review of full dataset*;
        No_of_obs=j;
        output;  * Produces a value before "stop" interrupts processing *;
        stop;   * Needed whenever 'point' is used *;
        keep No_of_obs;
    run;
    
    proc print data=CountObs;
    run;
    

    The above code is the simplest way I've found to produce the number of observations even when the dataset is empty. I've heard NOBS can be tricky, but the above can work for simple applications.
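
When the count is needed as a macro variable rather than as a printed dataset, a common variant never executes the SET statement at all. The following is a minimal sketch: `n_obs` is an illustrative macro variable name and `sashelp.class` stands in for the dataset being evaluated.

```sas
data _null_;
  /* The SET statement is compiled but never executed.           */
  /* NOBS= is assigned at compile time, so no rows are read.     */
  if 0 then set sashelp.class nobs=n;
  call symputx('n_obs', n);   /* store the count in a macro variable */
  stop;
run;

%put NOTE: n_obs=&n_obs;
```

As with the other NOBS-based approaches, this reports the physical observation count and is intended for native SAS datasets, not views.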
