python-docx

Python-docx: Is it possible to add a new run to paragraph in a specific place (not at the end)

懵懂的女人 submitted on 2019-12-06 16:14:19
I want to apply a style to a corrected word in MS Word text. Since it's not possible to change the text style inside a run, I want to insert a new run with a new style into the existing paragraph:

    for p in document.paragraphs:
        for run in p.runs:
            if 'text' in run.text:
                new_run = Run()
                new_run.text = 'some new text'
                # insert this run into the paragraph
                # something like: p.insert(new_run)

How can I do this? p.add_run() adds the run to the end of the paragraph, doesn't it?

Update: The best option would be to clone a run (and insert it after a certain run). This way we reproduce the original run's style attributes in the new

How to read tables in multiple docx files in a same folder by python

早过忘川 submitted on 2019-12-06 16:10:00
I have one folder called "Test_Plan". It contains multiple docx files, and each docx file has multiple tables. My question is: how can I read all the docx files and produce a summary for each? For example, every docx file has multiple tables; for each docx file I want output like:

    Total Number of Tables: 52
    Total Number of YES Automations: 6
    Total Number of NO Automations: 5

I need to automate this for all the files in the "Test_Plan" folder. I hope my question is clear. My code for reading tables from a single docx file:

    # Module to retrieve the Word documents
    from docx import

Extracting MS Word document formatting elements along with raw text information

一世执手 submitted on 2019-12-06 13:46:44
In this post @mikemaccana describes how to use python-docx to extract raw text data from an MS Word document from within Python. I'd like to go one step further. Instead of simply extracting the raw text, can I also use this module to harvest information about font style (e.g. bold versus italic) or font size (e.g. 12 versus 18pt)? The closest I was able to come was this post asking about using this module to extract highlighted text entries. It seemed a little abstract, and I'm not totally sure what's going on there. Is there a more straightforward way to extract formatting information

Search and Replace in python-docx

陌路散爱 submitted on 2019-12-06 09:15:52
I have a document (template) containing the following string: "Hello, my name is Bob. Bob is a nice name." I would like to open this document with python-docx and use a "find and replace" method (if one exists) to change every occurrence of "Bob" to "Mark". At the end I would like to generate a new document containing the string "Hello, my name is Mark. Mark is a nice name." How can I do that?

    from docx import *

    TEMPLATE_FILE = 'test_template.docx'

    class generate_docx:
        @staticmethod
        def test():
            document = Document(TEMPLATE_FILE)
            body = document.xpath('/w:document/w:body', namespaces=nsprefixes)[0]
            body = replace

how to create a dataframe from a table in a word document (.docx) file using pandas

人盡茶涼 submitted on 2019-12-06 05:47:32
Question: I have a Word file (.docx) containing a table of data, and I am trying to create a pandas DataFrame from that table using the docx and pandas modules, but I could not create the DataFrame.

    from docx import Document
    document = Document('req.docx')
    for table in document.tables:
        for row in table.rows:
            for cell in row.cells:
                print(cell.text)

I also tried to read the table as a DataFrame:

    pd.read_table("path of the file")

I can read the data cell by cell, but I want to read the entire table or a particular column.

How do I copy the contents of a word document?

自古美人都是妖i submitted on 2019-12-06 05:44:19
Question: I want to write a program that copies text from one Word document and pastes it into another. I'm trying to do that using the python-docx library. I was able to do it with the following code, but it copies only the text and does not preserve bold, italic, underlined, or colored parts:

    from docx import Document

    input = Document('SomeDoc.docx')
    paragraphs = []
    for para in input.paragraphs:
        p = para.text
        paragraphs.append(p)

    output = Document()
    for item in paragraphs:
        output.add_paragraph

highlight text using python-docx

爱⌒轻易说出口 submitted on 2019-12-06 04:13:26
I want to highlight text in a docx file and save it to another file. Here is my code:

    from docx import Document

    def highlight_text(filename):
        doc = Document(filename)
        for p in doc.paragraphs:
            if 'vehicle' in p.text:
                inline = p.runs
                # print(inline)
                # Loop added to work with runs (strings with same style)
                for i in range(len(inline)):
                    # print((inline[i].text).encode('ascii'))
                    if 'vehicle' in inline[i].text:
                        x = inline[i].text.split('vehicle')
                        inline[i].clear()
                        for j in range(len(x)-1):
                            inline[i].add_text(x[j])
                            y = inline[i].add_text('vehicle')
                            y.highlight_color = 'YELLOW'
                # print (p.text)
        doc.save('t2.docx')

Cell spanning multiple columns in table using python-docx

不打扰是莪最后的温柔 submitted on 2019-12-06 03:03:18
I'm trying to create a table that looks like this (image not included), using the python-docx module. Working from the example code for creating a table in example-makedocument.py and reading through the code in docx.py, I thought something similar to this would work:

    tbl_rows = [ ['A1'],
                 ['B1', 'B2'],
                 ['C1', 'C2'] ]
    tbl_colw = [ [100],
                 [25, 75],
                 [25, 75] ]
    tbl_cwunit = 'pct'
    body.append(table(tbl_rows, colw=tbl_colw, cwunit=tbl_cwunit))

However, this corrupts the docx document, and when Word recovers the document the table is shown like this (image not included). How can I get a row to properly span multiple columns using python-docx?

How to rotate text in table cells?

家住魔仙堡 submitted on 2019-12-06 00:07:14
I'm trying to make a table like this (image not included). As you can see, the header is vertically oriented. How can I achieve this using python-docx? P.S. Sorry for the non-translated table.

Snippet for those who are too tired to search:

    from docx.oxml import OxmlElement
    from docx.oxml.ns import qn
    from docx.table import _Cell

    def set_vertical_cell_direction(cell: _Cell, direction: str):
        # direction: tbRl -- top to bottom, btLr -- bottom to top
        assert direction in ("tbRl", "btLr")
        tc = cell._tc
        tcPr = tc.get_or_add_tcPr()
        textDirection = OxmlElement('w:textDirection')
        textDirection.set(qn('w:val'), direction)  # btLr

Using docx python library, how to apply color and font size simultaneously

落花浮王杯 submitted on 2019-12-04 21:29:34
I am writing to a .docx file using the python-docx library. I want to specify the font size and color of a particular sentence in advance. My problem is that I am not able to set both at the same time. Let me illustrate:

    from docx import Document
    from docx.shared import Pt  # Helps to specify font size
    from docx.shared import RGBColor  # Helps to specify font color
    document = Document()  # Instantiation
    p = document.add_heading(level=0)
    p.add_run('I want this sentence colored red with fontsize=22').font.size = Pt(22)  # Specifies font size 22
    p.add_run('This line gets colored red').font.color.rgb = RGBColor(255,0,0)