How to read tables in multiple docx files in the same folder with Python

Submitted by 百般思念 on 2019-12-23 03:42:00

Question


I have one folder called "Test_Plan". It contains multiple docx files, and each docx file has multiple tables. My question is: how can I read all of the docx files and produce output for each one? For example, when I pick a single docx file, the output looks like this:

(e.g.)
Total Number of Tables: 52
Total Number of YES Automations: 6
Total Number of NO Automations: 5

I need to automate this for every file in the "Test_Plan" folder. I hope my question is clear.

My code for reading the tables from a single docx file:

# Module for reading Word documents

from docx import Document
doc = Document("sample2.docx")


# Count the tables in this particular docx (one row starting with "ID" per table)

i = 0
for t in doc.tables:
    for ro in t.rows:
        if ro.cells[0].text=="ID" :
            i=i+1
print("Total Number of Tables: ", i)


# Count the Automation values
# This loop counts how many test cases have automation set to yes

j=0
for table in doc.tables:
    for ro in table.rows:
        if ro.cells[0].text=="Automated Test Case" and (ro.cells[2].text=="yes" or ro.cells[2].text=="Yes"):
            j=j+1
print("Total Number of YES Automations: ", j)


# This loop counts how many test cases have automation set to no

k = 0
for t in doc.tables:
    for ro in t.rows:
        if ro.cells[0].text=="Automated Test Case" and (ro.cells[2].text=="no" or ro.cells[2].text=="No"):
            k=k+1
print("Total Number of NO Automations: ", k)

Output:


Answer 1:


You can use glob to find all your files, e.g.:

import glob
from docx import Document

for name in glob.glob('Test_Plan/*.docx'):
    doc = Document(name)
    ...

glob.glob returns a list of the file names that match the given pattern. You can loop through that list, as the for loop above does, and open each file in turn. After opening a file you can plug in your own code. If you want totals across all files, initialize your counters before the loop; if you want separate counts for each file, reset them at the start of each iteration.
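Putting the pieces together, here is a minimal sketch of one way this could look. It assumes the table layout described in the question (a row whose first cell is "ID" marks a table, and the "Automated Test Case" row holds yes/no in its third cell). The names Counts, count_tables, summarize_folder and print_report are hypothetical helpers, not part of python-docx. Unlike the question's code, it skips rows with fewer than three cells and files whose names start with "~$" (the lock files Word typically leaves while a document is open).

```python
import glob
import os.path
from collections import namedtuple

# Hypothetical container for one file's counts (not part of python-docx).
Counts = namedtuple("Counts", ["tables", "yes", "no"])


def count_tables(tables):
    """Apply the question's row checks to an iterable of table objects."""
    n_tables = n_yes = n_no = 0
    for table in tables:
        for row in table.rows:
            cells = row.cells
            first = cells[0].text
            if first == "ID":
                n_tables += 1
            # Assumption: skip short rows instead of raising IndexError.
            elif first == "Automated Test Case" and len(cells) > 2:
                answer = cells[2].text
                if answer in ("yes", "Yes"):
                    n_yes += 1
                elif answer in ("no", "No"):
                    n_no += 1
    return Counts(n_tables, n_yes, n_no)


def summarize_folder(folder="Test_Plan"):
    """Return {file name: Counts} for every .docx directly inside folder."""
    # Imported here so count_tables can be used without python-docx installed.
    from docx import Document

    results = {}
    for name in sorted(glob.glob(os.path.join(folder, "*.docx"))):
        filename = os.path.basename(name)
        if filename.startswith("~$"):  # assumption: Word lock file, skip it
            continue
        results[filename] = count_tables(Document(name).tables)
    return results


def print_report(results):
    """Print the three totals for each file, in the question's format."""
    for filename, counts in results.items():
        print(filename)
        print("Total Number of Tables: ", counts.tables)
        print("Total Number of YES Automations: ", counts.yes)
        print("Total Number of NO Automations: ", counts.no)
```

Calling print_report(summarize_folder("Test_Plan")) would then print one block of totals per file. Because the counters live inside count_tables, they start from zero for each document, so no manual reset is needed.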

For splitting the file names I would suggest using the following approach, where name is one of the paths returned by glob:

import os.path

path, filename = os.path.split(name)
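To illustrate what the split produces, here is a small sketch using a hypothetical path of the kind glob would return for this folder. os.path.splitext is added as an optional extra step for dropping the extension when labelling the output.

```python
import os.path

# Hypothetical path, shaped like the entries glob.glob('Test_Plan/*.docx') returns.
name = "Test_Plan/sample2.docx"

# Separate the directory part from the file name.
folder, filename = os.path.split(name)

# Optionally separate the base name from its extension.
stem, extension = os.path.splitext(filename)

print(folder)     # Test_Plan
print(filename)   # sample2.docx
print(stem)       # sample2
print(extension)  # .docx
```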


Source: https://stackoverflow.com/questions/46910260/how-to-read-tables-in-multiple-docx-files-in-a-same-folder-by-python
