python-docx to extract table from Word docx

Submitted by 孤者浪人 on 2019-12-19 02:08:39

Question


I know this question has been asked before, but the existing answers do not work for me. I have a Word file that contains one table, and I want that table as the output of my Python program. I'm using Python 3.6 and have installed python-docx. Here is my code for the data extraction:

from docx.api import Document

document = Document('test_word.docx')
table = document.tables[0]

data = []

keys = None
for i, row in enumerate(table.rows):
    text = (cell.text for cell in row.cells)

    if i == 0:
        keys = tuple(text)
        continue
    row_data = dict(zip(keys, text))
    data.append(row_data)
    print (data)

I want the output to look exactly like the table in the Word docx file. Thanks in advance.


Answer 1:


Your code works fine for me. How about loading the result into a pandas DataFrame?

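import pandas as pd
from docx.api import Document

document = Document('test_word.docx')
table = document.tables[0]  # first table in the document

data = []

keys = None
for i, row in enumerate(table.rows):
    # cell.text is always a string, even for numeric-looking cells
    text = (cell.text for cell in row.cells)

    # the first row holds the column headers
    if i == 0:
        keys = tuple(text)
        continue
    # map each header to the matching cell of this row
    row_data = dict(zip(keys, text))
    data.append(row_data)

# print once, after all rows have been collected
print(data)

df = pd.DataFrame(data)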

How can I display a particular row or column of that table? You can extract rows and columns by position with iloc:

# iloc[row, column]; values are strings because cell.text returns str
df.iloc[0, :].tolist()   # ['5', '6', '7', '8']  - row index 0
df.iloc[:, 0].tolist()   # ['5', '9', '13', '17']  - column index 0
df.iloc[0, 0]            # '5'  - cell (0, 0)
df.iloc[1:, 2].tolist()  # ['11', '15', '19']  - column index 2, skipping the first row

and so on...
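
One thing to keep in mind when indexing: python-docx returns every cell as text, so the extracted values are strings even when the table holds numbers. Below is a minimal sketch of converting them with the standard library only; `to_number` is a hypothetical helper name and the sample records are made up for illustration.

```python
def to_number(text):
    """Return an int or float when the text parses as one, else the original string."""
    cleaned = text.strip()
    for cast in (int, float):
        try:
            return cast(cleaned)
        except ValueError:
            continue
    return text


# Sample records in the same shape as the list of dicts built from the table.
records = [{"1": "5", "2": "6"}, {"1": "9", "2": "n/a"}]
converted = [
    {key: to_number(value) for key, value in row.items()} for row in records
]
print(converted)
```

If the data is already in a DataFrame, applying `pd.to_numeric` column by column is an alternative worth trying.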

However, if your columns have names (here the header cells are the strings "1" to "4"), you can select a column by name:

# df["name"].tolist()
df["1"].tolist()  # ['5', '9', '13', '17'] - column named "1"

print(df)

prints the following, which is how the table looks in my sample document.

    1   2   3   4
0   5   6   7   8
1   9   10  11  12
2   13  14  15  16
3   17  18  19  20
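
If the goal is only to print the table roughly as it appears in Word, pandas is not strictly required. Here is a minimal sketch using plain Python: `format_table` is a hypothetical helper, and it assumes every row has the same number of cells. The `rows` list stands in for what `[[cell.text for cell in row.cells] for row in table.rows]` would produce.

```python
def format_table(rows, sep="  "):
    """Left-align each column to its widest cell and join rows with newlines."""
    if not rows:
        return ""
    # Width of each column is the length of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    return "\n".join(
        sep.join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    )


# Stand-in for the cell texts extracted from the Word table.
rows = [["1", "2", "3", "4"], ["5", "6", "7", "8"], ["9", "10", "11", "12"]]
text = format_table(rows)
print(text)
```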


Source: https://stackoverflow.com/questions/46618718/python-docx-to-extract-table-from-word-docx
