Convert .CSV files to .DTA files in Python

孤街浪徒 2021-01-24 11:11

I'm looking to automate the conversion of many .CSV files into .DTA files with Python. .DTA is the file type used by the Stata statistics package.

2 answers
  • 2021-01-24 11:36

    (copypasting from my answer to a previous question)

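    pandas DataFrame objects now have a "to_stata" method. The example round-trips a .dta file; for CSV input you would load the data with pd.read_csv instead of pd.read_stata. So you can do, for instance: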

    import pandas as pd
    df = pd.read_stata('my_data_in.dta')
    df.to_stata('my_data_out.dta')
    

    DISCLAIMER: the first step (read_stata) is quite slow (in my test, around 1 minute to read a 51 MB .dta file - also see this question), and the second (to_stata) produces a file that can be much larger than the original (in my test, the size went from 51 MB to 111 MB). Spacedman's answer may look less elegant, but it is probably more efficient.
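Since the question is about CSV input and many files, here is a minimal sketch of a batch conversion with pandas. The function name `convert_csv_dir` and the directory layout are hypothetical, chosen only for illustration. It assumes each CSV has a header row and that the default `read_csv` settings parse it correctly. Column names that are not valid Stata variable names may be altered by pandas on write.

```python
from pathlib import Path

import pandas as pd


def convert_csv_dir(src_dir, dst_dir):
    """Convert every *.csv in src_dir to a .dta file in dst_dir.

    Hypothetical helper, not part of pandas. Returns the list of
    written paths, sorted by file name.
    """
    src_dir = Path(src_dir)
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for csv_path in sorted(src_dir.glob("*.csv")):
        df = pd.read_csv(csv_path)
        dta_path = dst_dir / (csv_path.stem + ".dta")
        # write_index=False keeps the pandas row index out of the Stata file
        df.to_stata(dta_path, write_index=False)
        written.append(dta_path)
    return written
```

Calling `convert_csv_dir("csv_in", "dta_out")` would write one .dta file per CSV found in `csv_in`; both directory names are placeholders.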

  • 2021-01-24 11:38

    You need rpy2 for Python and also the foreign package installed in R. You do that by starting R and typing install.packages("foreign"). You can then quit R and go back to Python.

    Then this:

    import rpy2.robjects as robjects
    robjects.r("require(foreign)")
    robjects.r('x=read.csv("test.csv")')
    robjects.r('write.dta(x,"test.dta")')
    

    You can construct the string passed to robjects.r from Python variables if you want, something like:

    robjects.r('x=read.csv("%s")' % fileName)
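Plain `%s` interpolation breaks if a file name contains a double quote or a backslash (for example a Windows path). The sketch below builds the R commands with escaped string literals. The helpers `r_string` and `build_r_commands` are hypothetical names, and using JSON-style escaping to produce an R string literal is an assumption that should hold for ordinary file paths but is not a documented rpy2 feature.

```python
import json


def r_string(value):
    """Quote a Python string as an R string literal.

    Assumption: JSON escaping (\\", \\\\, \\n, ...) is close enough to
    R's escape rules for ordinary file paths.
    """
    return json.dumps(str(value), ensure_ascii=False)


def build_r_commands(csv_path, dta_path):
    """Return the R commands that convert one CSV file to a .dta file."""
    return [
        "library(foreign)",  # raises an R error if the package is missing
        f"x <- read.csv({r_string(csv_path)})",
        f"write.dta(x, {r_string(dta_path)})",
    ]


# Usage with rpy2 (requires R and the foreign package to be installed):
#
#     import rpy2.robjects as robjects
#     for cmd in build_r_commands("test.csv", "test.dta"):
#         robjects.r(cmd)
```

Keeping the string construction separate from the `robjects.r` calls makes it possible to inspect or log the generated R code before running it.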
    