Question
I've been trying to create a PDF file from content that can be English, Persian, digits, or a combination of them.
There are some problems with Persian text such as: "این یک متن فارسی است"
1- The text must be written from right to left.
2- Characters take different shapes depending on their position in the word (that is, a character changes shape according to the characters around it).
3- Because the sentence is read from right to left, the normal textwrap doesn't work correctly.
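Points 1 and 2 can be observed with nothing but Python's standard `unicodedata` module. The sketch below is only an illustration and not part of the original question: it shows that a Persian letter carries a right-to-left bidirectional class, and that Unicode encodes separate positional presentation forms of a single letter (BEH is picked here as an arbitrary example).

```python
import unicodedata

# Bidirectional class: 'AL' means "Arabic letter" (right to left), 'L' means left to right.
print(unicodedata.bidirectional('ف'))  # Persian/Arabic letter FEH
print(unicodedata.bidirectional('a'))  # Latin letter

# The letter BEH (U+0628) has four positional presentation forms.
BEH = '\u0628'
BEH_FORMS = ['\uFE8F', '\uFE90', '\uFE91', '\uFE92']  # isolated, final, initial, medial

for form in BEH_FORMS:
    # Compatibility normalization maps every presentation form back to the base letter.
    base = unicodedata.normalize('NFKC', form)
    print(unicodedata.name(form), '->', unicodedata.name(base))
```

A PDF library that does no shaping draws the base letter in every position, which is why Persian words can come out as disconnected letters.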
Answer 1:
After working with ReportLab for a while, we ran into problems organizing and formatting the document. It took a lot of time and was fairly complicated, so we decided to work with pdfkit and jinja2 instead. This way we can do the layout and formatting in HTML and CSS, and we don't need to reshape the Persian text either.
First, we can design an HTML template file like the one below:
<!DOCTYPE html>
<html>
<head lang="fa-IR">
    <meta charset="UTF-8">
    <title></title>
</head>
<body>
    <p dir="rtl">سوابق کاری</p>
    <ul dir="rtl">
        {% for experience in experiences %}
        <li><a href="{{ experience.url }}">{{ experience.title }}</a></li>
        {% endfor %}
    </ul>
</body>
</html>
Then we use the jinja2 library to render our data into the template, and use pdfkit to create a PDF from the rendered result:
from jinja2 import Template
from pdfkit import pdfkit

sample_data = [{'url': 'http://www.google.com/', 'title': 'گوگل'},
               {'url': 'http://www.yahoo.com/fa/', 'title': 'یاهو'},
               {'url': 'http://www.amazon.com/', 'title': 'آمازون'}]

# Read the template as UTF-8 so the Persian text survives on any platform
with open('template.html', 'r', encoding='utf-8') as template_file:
    template_str = template_file.read()

template = Template(template_str)
resume_str = template.render({'experiences': sample_data})

# 'quiet' suppresses the console output of wkhtmltopdf
options = {'encoding': "UTF-8", 'quiet': ''}
bytes_array = pdfkit.PDFKit(resume_str, 'string', options=options).to_pdf()

with open('result.pdf', 'wb') as output:
    output.write(bytes_array)
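Note that pdfkit is a wrapper around the wkhtmltopdf command-line tool, so the code above only works when that binary is installed. A minimal sketch of checking for it up front, using only the standard library (the helper name `find_wkhtmltopdf` is made up for this example):

```python
import shutil


def find_wkhtmltopdf():
    """Return the full path of the wkhtmltopdf executable, or None if it is not on PATH."""
    return shutil.which('wkhtmltopdf')


path = find_wkhtmltopdf()
if path is None:
    print('wkhtmltopdf was not found on PATH; pdfkit cannot render PDFs without it.')
else:
    # If the binary lives outside PATH, pdfkit.configuration(wkhtmltopdf=...) can point to it.
    print('wkhtmltopdf found at', path)
```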
Answer 2:
I used ReportLab to create the PDF, but unfortunately ReportLab doesn't support the Arabic and Persian alphabets, so I used the 'rtl' library by Vahid Mardani and the 'pybidi' library by Meir Kriheli to make the text look right in the resulting PDF.
First, we need to add a font that supports Persian to ReportLab.
On Ubuntu 14.04:
Copy Bahij-Nazanin-Regular.ttf into the /usr/local/lib/python3.4/dist-packages/reportlab/fonts folder.
Then add the font and styles to ReportLab:
from reportlab.lib.enums import TA_RIGHT
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont

pdfmetrics.registerFont(TTFont('Persian', 'Bahij-Nazanin-Regular.ttf'))
styles = getSampleStyleSheet()
styles.add(ParagraphStyle(name='Right', alignment=TA_RIGHT, fontName='Persian', fontSize=10))
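Copying the font into the installed package is easy to lose on an upgrade. Since `TTFont` also accepts a file path as its second argument, one possible alternative is to keep the font next to the project and look it up at start-up. The sketch below uses only the standard library; the helper `find_font` and the directories in `FONT_DIRS` are assumptions made for the example, not paths from the original answer.

```python
from pathlib import Path


def find_font(filename, search_dirs):
    """Return the first existing path to `filename` in `search_dirs`, or None."""
    for directory in search_dirs:
        candidate = Path(directory).expanduser() / filename
        if candidate.is_file():
            return candidate
    return None


# Hypothetical search locations; adjust them to wherever the font is stored.
FONT_DIRS = ['.', './fonts', '~/.fonts', '/usr/share/fonts/truetype']

font_path = find_font('Bahij-Nazanin-Regular.ttf', FONT_DIRS)
if font_path is None:
    print('Font not found in', FONT_DIRS)
else:
    # With ReportLab installed, the path could then be registered like this:
    # pdfmetrics.registerFont(TTFont('Persian', str(font_path)))
    print('Font found at', font_path)
```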
In the next step, we need to reshape the Persian letters into their correct forms and set the direction of each word to right to left:
from bidi.algorithm import get_display
from rtl import reshaper
import textwrap


def get_farsi_text(text):
    if reshaper.has_arabic_letters(text):
        words = text.split()
        reshaped_words = []
        for word in words:
            if reshaper.has_arabic_letters(word):
                # for reshaping and concating words
                reshaped_text = reshaper.reshape(word)
                # for right to left
                bidi_text = get_display(reshaped_text)
                reshaped_words.append(bidi_text)
            else:
                reshaped_words.append(word)
        reshaped_words.reverse()
        return ' '.join(reshaped_words)
    return text
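To see the reordering logic of this function in isolation, here is a simplified stand-in that needs no third-party packages. It is only a sketch under loud assumptions: `is_rtl_word` replaces `reshaper.has_arabic_letters` with a check on Unicode bidirectional classes, and plain string reversal replaces `get_display`, so no letter shaping happens at all. Both function names are invented for this example.

```python
import unicodedata


def is_rtl_word(word):
    """True if the word contains at least one right-to-left letter."""
    return any(unicodedata.bidirectional(ch) in ('R', 'AL') for ch in word)


def to_visual_order(text):
    """Simplified stand-in for get_farsi_text: reorders only, does no letter shaping."""
    words = text.split()
    if not any(is_rtl_word(w) for w in words):
        return text  # purely left-to-right text is returned unchanged
    visual_words = []
    for word in words:
        # Reverse the characters of RTL words; leave Latin words and numbers as they are.
        visual_words.append(word[::-1] if is_rtl_word(word) else word)
    visual_words.reverse()  # the first logical word must end up at the far right
    return ' '.join(visual_words)


print(to_visual_order('abc def'))
print(to_visual_order('سلام iOS دنیا'))
```

The Latin word keeps its internal letter order while its position among the Persian words is mirrored, which is the same treatment `get_farsi_text` gives to words such as "iOS".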
To add a bullet or wrap the text, we can use the following function:
def get_farsi_bulleted_text(text, wrap_length=None):
    farsi_text = get_farsi_text(text)
    if wrap_length:
        line_list = textwrap.wrap(farsi_text, wrap_length)
        line_list.reverse()
        line_list[0] = '{} •'.format(line_list[0])
        farsi_text = '<br/>'.join(line_list)
        return '<font>%s</font>' % farsi_text
    return '<font>%s •</font>' % farsi_text
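One property of this function is that it wraps the already reversed string, so lines are filled starting from the end of the sentence and any short leftover line ends up at the top after `line_list.reverse()`. A possible alternative, sketched here and not taken from the original answer, is to wrap in reading order first and convert each line afterwards. `wrap_then_convert` and `reverse_words` are hypothetical names, and `reverse_words` is only a toy stand-in for `get_farsi_text`.

```python
import textwrap


def wrap_then_convert(text, wrap_length, to_visual):
    """Wrap in logical (reading) order first, then convert each line to visual order."""
    logical_lines = textwrap.wrap(text, wrap_length)
    return [to_visual(line) for line in logical_lines]


def reverse_words(line):
    """Toy stand-in for get_farsi_text: only reverses the word order of one line."""
    return ' '.join(reversed(line.split()))


# Placeholder tokens w1..w5 stand for five Persian words in reading order.
lines = wrap_then_convert('w1 w2 w3 w4 w5', 8, reverse_words)
print(lines)  # ['w3 w2 w1', 'w5 w4']
```

With real text, `get_farsi_text` would be passed as `to_visual` and the resulting lines joined with `'<br/>'`, so that the short line falls at the end of the paragraph.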
To test the code, we can write:
from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
doc = SimpleDocTemplate("farsi_wrap.pdf", pagesize=letter, rightMargin=72, leftMargin=72, topMargin=72,
                        bottomMargin=18)
Story = []
text = 'شاید هنوز اندروید نوقا برای تمام گوشیهای اندرویدی عرضه نشده باشد، ولی اگر صاحب یکی از گوشیهای نکسوس یا پیک' \
'سل باشید احتمالا تا الان زمان نسبتا زیادی را با آخرین نسخهی اندروید سپری کردهاید. اگر در کار با اندروید نوقا' \
' دچار مشکل شدهاید، با دیجیکالا مگ همراه باشید تا با هم برخی از رایجترین مشکلات گزارش شده و راه حل آنها را' \
' بررسی کنیم. البته از بسیاری از این روشها در سایر نسخههای اندروید هم میتوانید استفاده کنید. اندروید برخلاف iOS ' \
'روی گسترهی وسیعی از گوشیها با پوستهها و اپلیکیشنهای اضافی متنوع نصب میشود. بنابراین تجویز یک نسخهی مشترک برا' \
'ی حل مشکلات آن کار چندان سادهای نیست. با این حال برخی روشهای عمومی وجود دارد که بهتر است پیش از هر چیز آنها را' \
' بیازمایید.'
tw = get_farsi_bulleted_text(text, wrap_length=120)
p = Paragraph(tw, styles['Right'])
Story.append(p)
doc.build(Story)
Answer 3:
In case anyone wants to generate PDFs from HTML templates using Django, this is how it can be done:
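import pdfkit
from django.http import HttpResponse
from django.template.loader import get_template


def download_pdf(request):  # view name chosen for illustration
    template = get_template("app_name/template.html")
    # Templates loaded with get_template() take a plain dict, not a Context, in recent Django versions
    html = template.render({'something': some_variable})
    pdf = pdfkit.from_string(html, False)  # False returns the PDF as bytes instead of writing a file
    response = HttpResponse(pdf, content_type='application/pdf')
    response['Content-Disposition'] = 'attachment; filename=output.pdf'
    return response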
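If the download itself should carry a Persian file name, the plain `filename=` parameter is not enough, because it is meant for ASCII. A minimal sketch of building the header with the `filename*` form from RFC 6266, using only the standard library; the helper name `content_disposition` and the sample file name are assumptions for this example.

```python
from urllib.parse import quote


def content_disposition(filename, fallback='output.pdf'):
    """Build an attachment header with an ASCII fallback and a UTF-8 encoded filename."""
    encoded = quote(filename, safe='')
    return "attachment; filename=\"{}\"; filename*=UTF-8''{}".format(fallback, encoded)


# Hypothetical Persian file name, percent-encoded as UTF-8 in the header.
header = content_disposition('رزومه.pdf')
print(header)
```

In the view above this could be used as `response['Content-Disposition'] = content_disposition('رزومه.pdf')`.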
Source: https://stackoverflow.com/questions/41345450/how-to-create-pdf-containing-persianfarsi-text-with-reportlab-rtl-and-bidi-in