Extract table from PowerPoint

Submitted by こ雲淡風輕ζ on 2019-12-23 01:53:11

Question


I am trying to extract a table from a PPT using python-pptx, but I am not sure how to do that using shape.table.

from pptx import Presentation
prs = Presentation(path_to_presentation)
# text_runs will be populated with a list of strings,
# one for each text run in presentation
text_runs = []
for slide in prs.slides:
  for shape in slide.shapes:
    if shape.has_table:
      tbl = shape.table
      rows = tbl.rows.count
      cols = tbl.columns.count

I found a post here, but the accepted solution does not work: it raises an error saying that the count attribute is not available.

How do I modify the above code so that I get the table in a DataFrame?

EDIT

Please see the image of the slide below


Answer 1:


This appears to work for me.


from pptx import Presentation

prs = Presentation(path_to_presentation)
# text_runs will be populated with a list of strings,
# one for each text run found in a table cell
text_runs = []
for slide in prs.slides:
    for shape in slide.shapes:
        # skip shapes that do not contain a table
        if not shape.has_table:
            continue
        tbl = shape.table
        # rows and columns have no count attribute; use len() instead
        row_count = len(tbl.rows)
        col_count = len(tbl.columns)
        for r in range(row_count):
            for c in range(col_count):
                cell = tbl.cell(r, c)
                for paragraph in cell.text_frame.paragraphs:
                    for run in paragraph.runs:
                        text_runs.append(run.text)

print(text_runs)
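
The question asks for a DataFrame, whereas the loop above collects a flat list of run text. Below is a minimal sketch of one way to get there, not a definitive implementation. It assumes pandas is installed and that the first row of each table holds the column headers; neither assumption comes from the original post. The helper names table_to_rows, table_to_dataframe and dataframes_from_presentation are hypothetical, chosen only for illustration.

```python
def table_to_rows(table):
    """Return the table's cell text as a list of rows (lists of strings).

    Relies only on table.rows, row.cells and cell.text, which
    python-pptx table objects provide.
    """
    return [[cell.text for cell in row.cells] for row in table.rows]


def table_to_dataframe(table, header=True):
    """Build a pandas DataFrame from a python-pptx table.

    Assumption: when header is True, the first row holds the column names.
    """
    import pandas as pd  # imported here so table_to_rows works without pandas

    rows = table_to_rows(table)
    if header and rows:
        return pd.DataFrame(rows[1:], columns=rows[0])
    return pd.DataFrame(rows)


def dataframes_from_presentation(path, header=True):
    """Return one DataFrame per table found directly on any slide."""
    from pptx import Presentation

    prs = Presentation(path)
    return [
        table_to_dataframe(shape.table, header=header)
        for slide in prs.slides
        for shape in slide.shapes
        if shape.has_table
    ]
```

With these helpers, dataframes_from_presentation(path_to_presentation) would return a list with one DataFrame per table. Every value comes back as a string, so numeric columns would still need converting, for example with pd.to_numeric.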

Source: https://stackoverflow.com/questions/54419118/extract-table-from-powerpoint
