How to convert OpenDocument spreadsheets to a pandas DataFrame?

前端未结


The Python library pandas can read Excel spreadsheets and convert them to a pandas.DataFrame with the pandas.read_excel(file) command. Under the hood,

11 answers

星月不相逢

2020-12-23 19:58

Here is a quick and dirty hack that uses the ezodf module:

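import pandas as pd
import ezodf  # depends on lxml

def read_ods(filename, sheet_no=0, header=0):
    """Read one sheet of an .ods file; row `header` holds the column names."""
    tab = ezodf.opendoc(filename=filename).sheets[sheet_no]
    # Each sheet column becomes one DataFrame column: the header cell is
    # the column name, the cells below it are the values.
    return pd.DataFrame({col[header].value: [x.value for x in col[header+1:]]
                         for col in tab.columns()})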

Test:

In [92]: df = read_ods(filename='fn.ods')

In [93]: df
Out[93]:
     a    b    c
0  1.0  2.0  3.0
1  4.0  5.0  6.0
2  7.0  8.0  9.0

NOTES:

Other useful read_excel parameters such as skiprows, index_col, and parse_cols are NOT implemented in this function, and header is only supported as a single row index. Feel free to edit this answer if you want to implement them.
ezodf depends on lxml, so make sure you have it installed.


无人共我

2020-12-23 19:59

There is support for reading Excel files in pandas (both xls and xlsx); see the read_excel command. You can use OpenOffice to save the spreadsheet as xlsx. Apparently the conversion can also be done automatically on the command line, using the convert-to command line parameter.

Reading the data from xlsx avoids some of the issues (date formats, number formats, unicode) that you may run into when you convert to CSV first.
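
For what it's worth, the command-line conversion could be wrapped from Python roughly like this. This is only a sketch under assumptions that are not in the answer above: it assumes a LibreOffice install whose soffice binary is on PATH and accepts --headless, --convert-to and --outdir. The helper names build_convert_command and convert_ods_to_xlsx are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(ods_path, out_dir, binary="soffice"):
    """Return the argument list for a headless ODS -> XLSX conversion."""
    # Assumption: LibreOffice's soffice binary and its conversion flags.
    return [
        binary,
        "--headless",
        "--convert-to", "xlsx",
        "--outdir", str(out_dir),
        str(ods_path),
    ]


def convert_ods_to_xlsx(ods_path, out_dir=".", binary="soffice"):
    """Convert one .ods file and return the path of the expected .xlsx file."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary!r} was not found on PATH")
    subprocess.run(build_convert_command(ods_path, out_dir, binary), check=True)
    # The converted file is expected to keep the base name with a new extension.
    return Path(out_dir) / (Path(ods_path).stem + ".xlsx")
```

The returned path can then be passed straight to pd.read_excel, e.g. pd.read_excel(convert_ods_to_xlsx('fn.ods')).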

甜味超标

2020-12-23 20:05

If you only have a few .ods files to read, I would just open them in OpenOffice and save them as Excel files. If you have a lot of files, you could use the unoconv command on Linux to convert the .ods files to .xls programmatically (with bash).

Then it's really easy to read it in with pd.read_excel('filename.xls')
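
If you would rather drive the batch conversion from Python than from bash, something like the following might work. It is a sketch that assumes unoconv is installed and on PATH and that its -f option selects the output format. The function names unoconv_commands and convert_folder are hypothetical.

```python
import subprocess
from pathlib import Path


def unoconv_commands(paths, target_format="xls"):
    """Build one unoconv argument list per .ods path, skipping other files."""
    return [
        ["unoconv", "-f", target_format, str(p)]
        for p in paths
        if Path(p).suffix.lower() == ".ods"
    ]


def convert_folder(folder, target_format="xls"):
    """Run unoconv on every .ods file in `folder` (needs unoconv on PATH)."""
    ods_files = sorted(Path(folder).glob("*.ods"))
    for command in unoconv_commands(ods_files, target_format):
        # check=True raises CalledProcessError if a conversion fails.
        subprocess.run(command, check=True)
```

Building the commands separately from running them makes it easy to print and inspect them before converting a whole directory.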

盖世英雄少女心

2020-12-23 20:05

Some responses have pointed out that odfpy or other external packages are needed to get this functionality, but note that recent versions of pandas (the current one is 1.1, August 2020) support the ODS format in functions like pd.ExcelWriter() and pd.read_excel(). You only need to specify the proper engine, "odf", to work with OpenDocument file formats (.odf, .ods, .odt).
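
As an illustration, the engine could be chosen from the file extension so that one helper reads both Excel and OpenDocument files. This is a minimal sketch: engine_for and read_spreadsheet are hypothetical names, and it assumes that passing engine=None leaves pandas to pick its default engine.

```python
from pathlib import Path

# Extensions mentioned above as handled by the "odf" engine.
ODF_EXTENSIONS = {".odf", ".ods", ".odt"}


def engine_for(path):
    """Return "odf" for OpenDocument files, else None (pandas default)."""
    return "odf" if Path(path).suffix.lower() in ODF_EXTENSIONS else None


def read_spreadsheet(path, **kwargs):
    """Read a spreadsheet with pd.read_excel, picking the engine by extension."""
    import pandas as pd  # imported here so engine_for works without pandas

    return pd.read_excel(path, engine=engine_for(path), **kwargs)
```

Extra keyword arguments such as sheet_name are passed through to pd.read_excel unchanged.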

日久生厌

2020-12-23 20:08
This is available natively in pandas 0.25. As long as you have odfpy installed (conda install odfpy or pip install odfpy), you can do:
```
pd.read_excel("the_document.ods", engine="odf")
```