I have to create Word documents dynamically using python-docx. I build them by adding table rows one at a time, and there is no way to know in advance how many records fit on a page because that depends on the data.
I need to know when a newly added element (table row or paragraph) starts a new page, so that I can record in the database which information each page contains.
This is the code that generates the Word document with python-docx:
    def get_invoice_word_report(self, request, invoices_controllers):
        import unicodedata
        from django.conf import settings  # provides MEDIA_DIR and MEDIA_URL used below
        from django.core.files import File
        from docx import Document
        from docx.shared import Inches, Pt
        from docx.enum.text import WD_ALIGN_PARAGRAPH, WD_BREAK
        from docx.enum.table import WD_ALIGN_VERTICAL
        from docx.enum.table import WD_TABLE_ALIGNMENT

        document = Document()
        section = document.sections[-1]
        section.left_margin = Inches(0.5)
        section.right_margin = Inches(0.5)
        style = document.styles['Normal']
        font = style.font
        font.name = 'Arial'
        font.size = Pt(8)

        first = last = None
        i = 0
        for invoices_controller in invoices_controllers:
            context = invoices_controller.get_context()
            if i > 0:
                # Assumed from context (the line is missing from the post):
                # every invoice after the first starts on a new page.
                run.add_break(WD_BREAK.PAGE)
            if i == len(invoices_controllers) - 1:
                last = context['invoices']['invoice_number']
            if i == 0:
                first = context['invoices']['invoice_number']

            # Invoice header, right-aligned.
            document.add_paragraph("Invoice").alignment = WD_ALIGN_PARAGRAPH.RIGHT
            document.add_paragraph("Folio {}".format(context['invoices']['invoice_number'])).alignment = WD_ALIGN_PARAGRAPH.RIGHT
            date = context['invoices']['period_end_date'].split('-')
            document.add_paragraph("{} {} {}".format(date[2], date[1], date[0])).alignment = WD_ALIGN_PARAGRAPH.RIGHT

            table = document.add_table(rows=1, cols=4)
            hdr_cells = table.rows[0].cells
            hdr_cells[0].width = Inches(0.1)
            hdr_cells[1].width = Inches(10)
            hdr_cells[2].width = Inches(1)
            hdr_cells[3].width = Inches(1)

            # One row per entry, plus an optional row for its free text.
            for entry in context['invoices']['entries']:
                row_cells = table.add_row().cells
                row_cells[0].text = str(entry['amount'])
                row_cells[1].text = entry['line']
                row_cells[2].text = entry['unit_price_label']
                row_cells[2].paragraphs[0].alignment = WD_ALIGN_PARAGRAPH.RIGHT
                row_cells[3].text = entry['subtotal']
                row_cells[3].paragraphs[0].alignment = WD_ALIGN_PARAGRAPH.RIGHT
                if entry['text']:
                    text_cells = table.add_row().cells
                    text_cells[1].text = entry['text']

            # Three summary rows; only the last column carries a value.
            row_cells = table.add_row().cells
            row_cells[0].text = ''
            row_cells[1].text = ''
            row_cells[2].text = ''
            row_cells[3].text = context['total']
            row_cells[3].paragraphs[0].alignment = WD_ALIGN_PARAGRAPH.RIGHT
            row_cells = table.add_row().cells
            row_cells[0].text = ''
            row_cells[1].text = ''
            row_cells[2].text = ''
            row_cells[3].text = '$0.00'
            row_cells[3].paragraphs[0].alignment = WD_ALIGN_PARAGRAPH.RIGHT
            row_cells = table.add_row().cells
            row_cells[0].text = ''
            row_cells[1].text = ''
            row_cells[2].text = ''
            row_cells[3].text = context['total']
            row_cells[3].paragraphs[0].alignment = WD_ALIGN_PARAGRAPH.RIGHT

            # Keep the trailing run so the next invoice can add its page break to it.
            run = document.add_paragraph("Son {}".format(context['total_text'])).add_run()
            i += 1

        current_directory = settings.MEDIA_DIR
        if len(invoices_controllers) > 1:
            file_name = "Invoices {}-{}.docx".format(first, last)
        else:
            file_name = "Invoice {}.docx".format(first)
        document.save(current_directory + file_name)
        return request.get_host() + settings.MEDIA_URL + file_name
Thanks for your help.
Detecting automatic (renderer-generated) page breaks in python-docx
is not possible because those breaks are not reliably recorded in the XML.
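You may be able to find some indication of the last rendered page break, depending on where your .docx files came from: when Word saves a file it can leave <w:lastRenderedPageBreak/> elements marking where page breaks fell the last time it laid the document out. python-docx has no layout engine, so content it generates carries no such markers until the file has been opened and saved in Word.
Otherwise you probably need to drive Word itself through its automation (VBA/COM) interface to gain access to a live renderer, which may be able to provide this information. Note that page break locations can change with the machine Word is running on, depending on factors like font metrics and printer drivers.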
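As a rough sketch of how to check whether a given file carries any recorded page-break hints, the following counts the w:lastRenderedPageBreak markers and the explicit page breaks (w:br with w:type="page") in the main document part. It uses only the standard library rather than python-docx, and the function name count_recorded_page_breaks is made up for this illustration. The rendered count only reflects the last time Word laid the file out, so treat it as a hint, not as ground truth.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_recorded_page_breaks(docx_file):
    """Count page-break markers stored in word/document.xml.

    docx_file may be a path or a binary file-like object.
    (Hypothetical helper, not part of python-docx.)
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    # Hints Word leaves behind when it saves after laying out the document.
    rendered = sum(1 for _ in root.iter("{%s}lastRenderedPageBreak" % W_NS))

    # Breaks the author inserted on purpose; plain line breaks have no type.
    explicit = sum(
        1
        for br in root.iter("{%s}br" % W_NS)
        if br.get("{%s}type" % W_NS) == "page"
    )
    return {"rendered": rendered, "explicit": explicit}
```

Calling count_recorded_page_breaks on a file straight out of the report function above should show zero rendered breaks; a non-zero count means the file has been through Word at least once.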
This has come up in other questions and answers. This one might be a good place to start: Page number python-docx
To see the rest, search on "[python-docx] page break" and you'll see there are quite a few. The square bracketed part limits results to those tagged with "python-docx".
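For the live-renderer route mentioned above, here is a minimal sketch of asking Word itself which page each table row ends on. It assumes Windows with Word installed and the pywin32 package available; the function name table_row_pages and its parameters are invented for this example, and the path must be absolute because Word resolves relative paths against its own working directory. The results are only valid for the machine that produced them, for the font and printer reasons given above.

```python
# Value of Word's wdActiveEndPageNumber constant (WdInformation enumeration).
WD_ACTIVE_END_PAGE_NUMBER = 3


def table_row_pages(docx_path, table_index=1):
    """Return the page number on which each row of one table ends.

    Hypothetical helper: drives a real Word instance over COM.
    table_index is 1-based, as in the Word object model.
    """
    # Imported here so the module still loads where pywin32 is missing.
    import win32com.client

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        # Positional arguments: FileName, ConfirmConversions, ReadOnly.
        doc = word.Documents.Open(docx_path, False, True)
        try:
            table = doc.Tables(table_index)
            return [
                row.Range.Information(WD_ACTIVE_END_PAGE_NUMBER)
                for row in table.Rows
            ]
        finally:
            doc.Close(False)  # close without saving changes
    finally:
        word.Quit()
```

A row whose page number is higher than the previous row's is the first row on a new page, which is the event the question wants to record. Tables with vertically merged cells may not allow row-by-row access through this interface.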