How to remove duplicate records/observations WITHOUT sorting in SAS?

前端未结

I wonder if there is a way to remove duplicate records WITHOUT sorting. Sometimes I want to keep the original order and just remove the duplicates.

8 answers

南旧

2021-02-08 15:02
The two examples given in the original post are not equivalent.
- DISTINCT in PROC SQL removes only rows that are fully identical.
- NODUPKEY in PROC SORT keeps one row per combination of key variables and removes the rest, even if the other variables differ. To remove only fully identical rows, use the NODUPRECS option instead; it compares each row only with the previous one, so sort by all variables (BY _ALL_) if it has to catch every duplicate.
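To make the difference concrete, here is a small sketch. The dataset work.have and its variables (person_id, stay, cost) are made up for illustration and are not taken from the original post.

```sas
/* Hypothetical sample data: key (2, A) occurs three times, twice with cost 10. */
data work.have;
    input person_id stay $ cost;
    datalines;
2 A 10
1 B 20
2 A 10
2 A 30
;
run;

/* DISTINCT drops only the fully identical row: 3 rows remain. */
proc sql;
    create table work.distinct_rows as
    select distinct * from work.have;
quit;

/* NODUPKEY keeps one row per person_id/stay: 2 rows remain, */
/* although cost differs within key (2, A).                  */
proc sort data=work.have out=work.one_per_key nodupkey;
    by person_id stay;
run;
```

Neither step is guaranteed to return the rows in their original order, which is the problem the question is about.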
If you only need to find records that share key variables, another approach I can think of would be to create a dataset containing only the key variable(s), work out which values are duplicated, and then apply a format to the original data to flag the duplicate records. If the dataset has more than one key variable, you would need to create a new variable that concatenates all the key values, converted to character where needed.
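A different technique from the format-based one described above, and one that needs neither a sort nor a concatenated key, is to track the keys already seen in a DATA step hash object. The following is only a minimal sketch under assumptions: the names work.have, work.want, person_id, stay and cost are hypothetical, and all distinct keys must fit in memory.

```sas
/* Hypothetical sample data, deliberately not sorted by key. */
data work.have;
    input person_id stay $ cost;
    datalines;
2 A 10
1 B 20
2 A 30
1 B 20
3 C 40
;
run;

data work.want;
    set work.have;
    /* Build the lookup table once, on the first iteration. */
    if _n_ = 1 then do;
        declare hash seen();
        seen.defineKey('person_id', 'stay');
        seen.defineDone();
    end;
    /* check() returns 0 when the key is already stored, so a  */
    /* nonzero result means this is the first time we see it.  */
    if seen.check() ne 0 then do;
        seen.add();
        output;
    end;
run;
```

With this input, work.want should keep the first occurrence of each key in the original order: (2, A, 10), (1, B, 20), (3, C, 40). To drop only fully identical rows, every variable would have to be listed in defineKey.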

情话喂你

2021-02-08 15:03

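This is the fastest way I can think of. It needs no sort step, but the BY statement assumes the input is already ordered by person_id and stay; the SORTEDBY= option only declares that order and does not sort anything.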

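data output_data_name;
    /* SORTEDBY= only asserts the existing order; it does not sort the data. */
    set input_data_name (
        sortedby = person_id stay
        keep =
            person_id
            stay
            /* ... more variables ... */);
    by person_id stay;
    /* Keep only the first record of each person_id/stay group. */
    if first.stay then output;
run;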
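If the input is grouped by key but the groups are not in ascending order, the BY statement above stops with a "BY variables are not properly sorted" error. One possible variant, sketched here with hypothetical data, adds NOTSORTED to the BY statement. Note that this removes only duplicates that sit next to each other.

```sas
/* Hypothetical sample data: key (2, A) appears in two separate runs. */
data work.have;
    input person_id stay $;
    datalines;
2 A
2 A
1 B
1 B
2 A
;
run;

data work.want;
    set work.have;
    /* NOTSORTED treats each run of equal BY values as its own group. */
    by person_id stay notsorted;
    if first.stay then output;
run;
```

Here work.want should contain three rows, (2, A), (1, B), (2, A): the last (2, A) survives because it is not adjacent to the first run.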
