I have a very large data set (18 million observations) that I would like to transpose by subsetting based on one variable and creating 900 new variables out of those subsets. Ex
I'm not sure what your subsetting would be based on, but if it's only the id, it is rather straightforward.
Using the example from UCLA's page on PROC TRANSPOSE, the transpose works fine with slightly modified input data:
data long;
input id year faminc ;
datalines ;
1 96 40000
1 97 40500
1 98 41000
2 96 45000
2 97 45400
2 98 45800
3 96 75000
3 97 76000
3 98 77000
;
proc sort data=long;
by year;
run;
proc transpose data=long out=wide(drop=_name_) prefix=var;
by year;
var faminc;
run;
year var1 var2 var3
96 40000 45000 75000
97 40500 45400 76000
98 41000 45800 77000
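If the new columns should be tied to the id values rather than to row position within each year, the ID statement of PROC TRANSPOSE is one option. A minimal sketch, assuming the `long` data set above, already sorted by year; the output name `wide_by_id` and the prefix `id_` are made up for illustration:

```sas
proc transpose data=long out=wide_by_id(drop=_name_) prefix=id_;
  by year;
  id id;        /* column names are built as prefix + id value */
  var faminc;
run;
```

With the sample data this should give the columns year, id_1, id_2 and id_3. Unlike the positional var1-var3, a year that lacks one id gets a missing value in that id's column instead of shifting the remaining values to the left. Note that the id values have to be unique within each year for this to work.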
UCLA's SAS help pages are much clearer than SAS's own when it comes to using PROC TRANSPOSE
in both directions. Here are four valuable links:
Long to Wide with Proc Transpose
Wide to Long with Proc Transpose
Long to Wide with Data Step
Wide to Long with Data Step
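For the reverse direction, here is a rough sketch with PROC TRANSPOSE. It assumes the `wide` data set created by the PROC TRANSPOSE example earlier in this thread (one row per year, columns var1-var3); the names `long_again` and `source` are hypothetical:

```sas
proc transpose data=wide out=long_again(rename=(col1=faminc)) name=source;
  by year;
  var var1-var3;   /* each of these becomes one output row per year */
run;
```

The result should have one row per year and variable, with `source` holding the original column name (var1, var2, var3) and `faminc` holding the value.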
Transposes like this are also easy to do in a data step with the ARRAY
statement. For your data it would look something like this:
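data new (keep = year var1-var3);
  set long;                       /* input must be sorted by year */
  by year;
  array vars {3} var1-var3;
  retain var1-var3;               /* keep values across rows within a year */
  if first.year then do;
    i = 1;
    call missing(of var1-var3);   /* clear values left over from the previous year */
  end;
  else i + 1;
  vars{i} = faminc;
  if last.year then output;       /* write one row per year */
run;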
But I don't have SAS to double check this. Also see here.
PROC SQL has no good way to transpose; it is too limited.
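To illustrate the limitation: a PROC SQL version needs one hand-written expression per output column. A sketch, assuming the `long` data set (id, year, faminc) from the earlier answer and a hypothetical output name `wide_sql`:

```sas
proc sql;
  create table wide_sql as
  select year,
         /* one CASE expression per new column */
         max(case when id = 1 then faminc else . end) as var1,
         max(case when id = 2 then faminc else . end) as var2,
         max(case when id = 3 then faminc else . end) as var3
  from long
  group by year;
quit;
```

With 900 new variables that would mean 900 such expressions, or a macro to generate them, which is why PROC TRANSPOSE or a data step is usually the more practical route.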
Here are links to battle-tested macros from the book "Data Preparation for Analytics".
You can download the macro definitions from the SAS community wiki.
Go to http://support.sas.com/kb/32/121.html for transposing from WIDE to LONG and http://support.sas.com/kb/32/122.html for transposing from LONG to WIDE.
More details on "Data Preparation for Analytics" can be found at http://www.sascommunity.org/wiki/Data_Preparation_for_Analytics, where you can download all macros, programs and datasets for the book.
Hope this helps.