Spreadsheet to Python dictionary conversion

Submitted by 拟墨画扇 on 2019-11-30 17:57:00

Some available options:

  • pyexcel-ods: "A wrapper library to read, manipulate and write data in ods format." Can be installed via: pip install pyexcel-ods. I personally recommend this package as I've used it and it is being actively maintained.

  • py-odftools: "... a collection of tools for analyzing, converting and creating files in the ISO standard OpenDocument format." This project hasn't been updated since late 2007. It looks abandoned.

  • ezodf: "A Python package to create/manipulate OpenDocumentFormat files." Installable via pip install ezodf. See caveat in the comments below about a serious issue with this package.
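As a rough sketch of the first option: pyexcel-ods provides get_data, which returns a mapping of sheet name to a list of rows. The helper names rows_to_records and ods_to_dict, and the file name in the comment, are my own illustration, and the sketch assumes the first row of each sheet holds the column headers.

```python
def rows_to_records(rows):
    """Turn [[header...], [values...], ...] into a list of dicts keyed by header."""
    if not rows:
        return []
    header, *body = rows
    records = []
    for row in body:
        # Rows can be shorter than the header; pad them so every key is present.
        padded = list(row) + [None] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


def ods_to_dict(path):
    """Read every sheet of an .ods file into {sheet_name: [record, ...]}."""
    # Third-party dependency: pip install pyexcel-ods
    from pyexcel_ods import get_data

    sheets = get_data(path)  # mapping: sheet name -> list of rows
    return {name: rows_to_records(rows) for name, rows in sheets.items()}


# Usage (hypothetical file name):
# data = ods_to_dict("filename.ods")
```

Keeping the row-to-dict step separate from the file reading means the same helper can be reused with any reader that yields lists of rows.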

As you probably know, you could ask your users to export the file themselves via File > Save As, but that might not be practical in your situation.

It's probably easier to use LibreOffice or OpenOffice itself. It can run completely headless on a server, without X11 installed or running, and it gives you a clean native conversion.

libreoffice --without-x --convert-to csv  filename.ods

Check libreoffice --help (or openoffice --help) for details. This could also be wrapped in os.system(), subprocess.*(), etc. (Note: use -convert-to on Windows.) Also note that no other instance of LibreOffice, OpenOffice, or StarOffice may already be running, including the quickstarter.

Update: some versions of LibreOffice use --headless instead of --without-x. Check the --help output to see which flag your version accepts.
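One way to wrap that command with subprocess might look like the sketch below. The function names are hypothetical, and the default of --headless is an assumption on my part; pass headless_flag="--without-x" if that is what your build expects. It also uses the --outdir option to control where the CSV lands.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(ods_path, out_dir=".", binary="libreoffice",
                          headless_flag="--headless"):
    """Return the argument list for a headless ODS -> CSV conversion."""
    return [binary, headless_flag, "--convert-to", "csv",
            "--outdir", str(out_dir), str(ods_path)]


def convert_ods_to_csv(ods_path, out_dir=".", **kwargs):
    """Run the conversion and return the path where the CSV is expected."""
    cmd = build_convert_command(ods_path, out_dir, **kwargs)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(cmd, check=True)
    # The output file keeps the input's base name with a .csv extension.
    return Path(out_dir) / (Path(ods_path).stem + ".csv")


# Usage (hypothetical file name):
# csv_path = convert_ods_to_csv("filename.ods", out_dir="converted")
```

It is worth checking that the returned path actually exists afterwards, since a zero exit code alone may not guarantee that a file was written.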

Can you convert the .ods file to CSV first? Parsing CSV in Python is then pretty easy with the csv module.
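For the CSV route, a minimal sketch with the standard library could look like this. It assumes the first line of the CSV is a header row; the helper names and the "name" key in the usage comment are made up for illustration. Note that csv.DictReader leaves every value as a string.

```python
import csv
import io


def csv_to_records(fileobj):
    """Read CSV text with a header row into a list of dicts (values stay strings)."""
    return list(csv.DictReader(fileobj))


def index_by(records, key):
    """Build {row[key]: row}; a later row replaces an earlier one with the same key."""
    return {row[key]: row for row in records}


# Usage (hypothetical file and column names):
# with open("filename.csv", newline="", encoding="utf-8") as fh:
#     by_name = index_by(csv_to_records(fh), "name")
```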

Check py-odftools.

There's a great article in Linux Journal on how to read ODS files in Python. An ODS file is just a zip archive containing XML files; the cell data lives in content.xml. You can then parse that XML to read all the cells.

http://www.linuxjournal.com/article/9347?page=0,2
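The zip-plus-XML idea can be sketched with the standard library alone. This is not the code from the article, just a simplified illustration: read_ods_cells is a hypothetical name, every cell is returned as its displayed text, trailing empty cells and fully empty rows are dropped, and merged (covered) cells are ignored.

```python
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces used in content.xml
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}
TABLE = "{%s}table" % NS["table"]
ROW = "{%s}table-row" % NS["table"]
NAME = "{%s}name" % NS["table"]
REPEAT = "{%s}number-columns-repeated" % NS["table"]


def read_ods_cells(source):
    """Return {sheet_name: [[cell_text, ...], ...]} from an .ods path or file object."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    sheets = {}
    for table in root.iter(TABLE):
        rows = []
        for row in table.iter(ROW):
            cells = []
            for cell in row.findall("table:table-cell", NS):
                # A cell may hold several paragraphs; join them with newlines.
                text = "\n".join("".join(p.itertext())
                                 for p in cell.findall("text:p", NS))
                # Identical adjacent cells are stored once with a repeat count.
                cells.extend([text] * int(cell.get(REPEAT, "1")))
            while cells and cells[-1] == "":
                cells.pop()  # drop trailing padding cells
            if cells:
                rows.append(cells)
        sheets[table.get(NAME)] = rows
    return sheets
```

Because empty rows are skipped, row positions in the result do not necessarily match row numbers in the spreadsheet.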

The approach from the project linked below works very well for me for loading *.ods files into a pandas DataFrame. You can load a sheet by index or by name.

I took my solution from this project: https://pypi.org/project/pandas-ods-reader/

You might first need to install these dependencies: ezodf, lxml, and pandas.

pip install pandas_ods_reader

from pandas_ods_reader import read_ods

Then:

filepath = "path/to/your/file.ods"

Load a sheet by index (1-based):

sheet_idx = 1
df = read_ods(filepath, sheet_idx)

Load a sheet by name:

sheet_name = "sales_year_1"

df = read_ods(filepath, sheet_name)

Done.
