I have to dump data from SAS datasets. I found a Python module called sas7bdat.py that says it can read SAS .sas7bdat datasets, and I think it would be simpler and more straigh
As time passes, solutions become easier. I think this one is easiest if you want to work with pandas:
import pandas as pd
df = pd.read_sas('/support/sas/locked_data.sas7bdat')
Note that it is easy to get a NumPy array from the resulting DataFrame by using df.values.
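As a rough sketch of that last step: the frame below is a made-up stand-in for whatever pd.read_sas would return (the column names and values are assumptions, not taken from any real dataset). It shows df.values next to df.to_numpy(), which is available in pandas 0.24 and later.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame that pd.read_sas(...) would return.
df = pd.DataFrame({"height": [1.60, 1.75], "weight": [55.0, 72.5]})

# Both calls give a 2-D NumPy array with one row per observation.
arr_values = df.values
arr_numpy = df.to_numpy()

print(arr_values.shape)
print(arr_numpy.dtype)
```

With only numeric columns the array is numeric; a frame that mixes text and numbers would come back as an object array instead.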
Personally, I think the better approach would be to export the data using SAS and then process the external file as needed using Python.
In SAS, you can do this:
libname datalib "/support/sas";
filename sasdump "/support/textfiles/locked_data.txt";
proc export
    data = datalib.locked_data
    outfile = sasdump
    dbms = tab
    label
    replace;
run;
The downside to this is that while the column labels are used rather than the variable names, the labels are enclosed in double quotes. When processing in Python, you may need to programmatically remove them if they cause a problem. I hope that helps even though it doesn't use Python like you wanted.
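For the quoted labels, one possible way to handle them on the Python side is the standard csv module, which treats double quotes as field quoting and drops them while parsing. This is only a sketch: the sample text is invented to imitate a tab-delimited export, and read_tab_export is a hypothetical helper name, not part of any library.

```python
import csv
import io


def read_tab_export(handle):
    """Return (header, rows) from a tab-delimited export.

    csv.reader uses '"' as its default quote character, so labels
    wrapped in double quotes come back without the quotes.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = list(reader)
    return header, rows


# Made-up sample imitating a PROC EXPORT file with quoted column labels.
sample = '"Patient ID"\t"Visit Date"\n101\t2020-01-05\n102\t2020-02-11\n'

header, rows = read_tab_export(io.StringIO(sample))
print(header)
print(rows)
```

For a real file you would pass an opened file object instead of the io.StringIO wrapper, for example open('/support/textfiles/locked_data.txt', newline='') in Python 3.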
I know I'm late with this answer, but in case someone searches for a similar question, the best option I have found is:
from sas7bdat import SAS7BDAT
foo = SAS7BDAT('/support/sas/locked_data.sas7bdat')
# This converts to dataframe:
ds = foo.to_data_frame()
This is only a partial answer as I've found no [easy to read] concrete documentation.
You can view the source code here
This shows some basic info regarding what arguments the methods require.
I think most of what you are after is stored in the "header" object that is created when you instantiate SAS7BDAT. If you just print that object you'll get a lot of info, but you can also access its attributes directly. I think most of what you may be looking for would be under foo.header.cols. I suspect you would use various header attributes as parameters for the methods you mention.
Maybe something like this will get you closer?
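from sas7bdat import SAS7BDAT
inFile = '/support/sas/locked_data.sas7bdat'  # your file here...
foo = SAS7BDAT(inFile)
# Print the metadata stored for each column in the header
for i in foo.header.cols:
    print('"Attributes" {}'.format(i.attr))
    print('"Labels" {}'.format(i.label))
    print('"Name" {}'.format(i.name))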
Edit: Unrelated to this specific question, but the built-in type() and dir() functions come in handy when trying to figure out what is going on in an unfamiliar class or library.
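To illustrate that kind of poking around, here is a small sketch. The Column class is a hypothetical stand-in for an object from a library you don't know (it is not the real sas7bdat column class), and public_attributes is a made-up helper that filters the output of dir().

```python
class Column(object):
    """Hypothetical stand-in for an object from an unfamiliar library."""

    def __init__(self, name, label):
        self.name = name
        self.label = label


def public_attributes(obj):
    """List attribute names from dir() that do not start with an underscore."""
    return [attr for attr in dir(obj) if not attr.startswith("_")]


col = Column("age", "Age in years")

# type() tells you what kind of object you are holding.
print(type(col).__name__)

# dir() lists what you can access on it; filtering hides the internals.
print(public_attributes(col))
```

The same two calls can be pointed at foo or foo.header from the snippet above to see which attributes your installed version actually exposes.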