I have a Python script that cleans up and performs basic statistical calculations on a large panel dataset (2,000,000+ observations).
I find t
This answer extends @Roberto Ferrer's answer, solving a few issues I ran into.
Stata in system path
For the stata command to run code, Stata must be correctly set up in the system path (on Windows at least). For me, this was not set up automatically when Stata was installed, and I found the simplest fix was to give the full path to the executable (which for me was "C:\Program Files (x86)\Stata12\Stata-64"), written as a raw string so that Python leaves the backslashes alone, i.e.:
cmd = [r"C:\Program Files (x86)\Stata12\Stata-64", "do", dofile]
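If the same script has to run on machines where Stata may or may not be on the path, one option is to look the executable up first and only fall back to the hard-coded location. The following is a minimal sketch, not part of the original answer: the helper name find_stata is hypothetical, and the executable name "Stata-64" is simply taken from the path above and will differ between Stata editions and versions.

```python
import shutil

# Full path quoted in the answer; adjust for your own installation.
FALLBACK_STATA = r"C:\Program Files (x86)\Stata12\Stata-64"


def find_stata(name="Stata-64", fallback=FALLBACK_STATA):
    """Return the executable found on the system path, else the fallback path."""
    found = shutil.which(name)  # None when `name` is not on the path
    return found if found is not None else fallback
```

The result can then be used as the first element of cmd in place of the literal path.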
How to quietly run the code in the background
It is possible to get the code to run quietly in the background (i.e. without opening a Stata window each time) by adding the /e option, i.e.:
cmd = [r"C:\Program Files (x86)\Stata12\Stata-64", "/e", "do", dofile]
Log file storage location
Finally, if you are running quietly in the background, Stata will want to save log files. It writes them to the working directory it was launched in, which by default is the working directory of the calling process. This varies depending on where the code is being run from, but for me, since I was executing Python from Notepad++, it tried to save the log files in C:\Program Files (x86)\Notepad++, which Stata did not have write access to. This can be changed by specifying the working directory (the cwd argument) when the sub-process is called.
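The effect of cwd can be checked without Stata at all. The sketch below is only an illustration under that assumption: it starts a small Python child process instead of Stata, in a temporary directory, and asks the child where it is running. The helper name child_working_dir is hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def child_working_dir(cwd):
    """Run a tiny child process in `cwd` and return the directory it reports."""
    out = subprocess.check_output(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        cwd=cwd,
    )
    return out.decode().strip()


# A temporary directory stands in for the folder where the log files should go.
with tempfile.TemporaryDirectory() as log_dir:
    reported = child_working_dir(log_dir)
    # realpath irons out symlinks and short/long path differences.
    same_dir = os.path.realpath(reported) == os.path.realpath(log_dir)
```

Here same_dir should come out True: the child runs in the directory passed as cwd, not in the directory of whichever program launched Python.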
These modifications to Roberto Ferrer's code lead to:
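import subprocess

def dostata(dofile, *params):
    # Full path to Stata, with /e so that it runs quietly in the background
    cmd = [r"C:\Program Files (x86)\Stata12\Stata-64", "/e", "do", dofile]
    # Pass any extra arguments through to the do-file
    for param in params:
        cmd.append(param)
    # cwd is the directory in which Stata will save its log files
    return subprocess.call(cmd, cwd=r'C:\location_to_save_log_files')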
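One thing to watch when calling this with numeric parameters is that subprocess expects every element of the command list to be a string. A possible variant, sketched here under assumptions, builds the list separately and converts each parameter with str. The name build_cmd, the batch keyword, and the do-file name and arguments in the usage note are all hypothetical; the Stata path is the one from the answer.

```python
# Path from the answer; adjust for your own installation.
STATA_EXE = r"C:\Program Files (x86)\Stata12\Stata-64"


def build_cmd(dofile, *params, batch=True):
    """Build the argument list for Stata without running anything."""
    cmd = [STATA_EXE]
    if batch:
        cmd.append("/e")  # run quietly in the background
    cmd += ["do", dofile]
    # Convert every parameter to a string so that numbers can be passed too.
    cmd.extend(str(p) for p in params)
    return cmd
```

For example, build_cmd("clean_panel.do", 2010, "wide") gives a list ending in "do", "clean_panel.do", "2010", "wide", which can be handed to subprocess.call together with cwd as above. Inside the do-file, the extra arguments are available as the positional macros `1', `2' and so on.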