I'm trying to convert an *.xhtml file with Hebrew characters (UTF-8) to PDF using the iText library, but I'm getting all the letters in reverse order. As far as I understand from this question I
Please take a look at the ParseHtml10 example. In this example, we take the file hebrew.html:
<html>
<head>
<title>Hebrew text</title>
</head>
<body style="font-size:12.0pt; font-family:Arial">
<div dir="rtl" style="font-family: Noto Sans Hebrew">שלום עולם</div>
</body>
</html>
And we convert it to PDF using this code:
public void createPdf(String file) throws IOException, DocumentException {
    // step 1: create the document
    Document document = new Document();
    // step 2: create a writer that sends the PDF to the output file
    PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
    // step 3: open the document
    document.open();
    // step 4: parse the HTML
    // Styles: only the explicitly registered font is available
    CSSResolver cssResolver = new StyleAttrCSSResolver();
    XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
    fontProvider.register("resources/fonts/NotoSansHebrew-Regular.ttf");
    CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
    HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
    htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
    // Pipelines: CSS resolver -> HTML processing -> PDF writer
    PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
    HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
    CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
    // XML Worker
    XMLWorker worker = new XMLWorker(css, true);
    XMLParser p = new XMLParser(worker);
    // HTML is a constant holding the path to hebrew.html; the file is read as UTF-8
    p.parse(new FileInputStream(HTML), Charset.forName("UTF-8"));
    // step 5: close the document
    document.close();
}
The result looks like hebrew.pdf:
What are the hurdles you need to clear?
You need to wrap the text in an element such as a <div> or a <td>.
You need to add the attribute dir="rtl" to that element to define the direction.
I can't read Hebrew, but I hope the resulting PDF is correct and that this solves your problem.
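As a side note on why the direction has to be declared: the Hebrew text in the HTML file is stored in logical (reading) order, and something has to decide that it runs right to left when it is laid out. The sketch below is not what XML Worker does internally. It only uses the JDK's java.text.Bidi class to show that a string like the one in hebrew.html is detected as right-to-left. The class name BidiDemo and its direction method are made up for this illustration, and the Hebrew text is written with Unicode escapes so the source compiles regardless of file encoding.

```java
import java.text.Bidi;

class BidiDemo {
    // Classifies a string as "RTL", "LTR" or "MIXED" using the JDK's
    // bidirectional algorithm. The base direction is taken from the first
    // strong character and falls back to left-to-right.
    static String direction(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        if (bidi.isRightToLeft()) {
            return "RTL";
        }
        if (bidi.isLeftToRight()) {
            return "LTR";
        }
        return "MIXED";
    }

    public static void main(String[] args) {
        // The two Hebrew words from hebrew.html, in logical order.
        String hebrew = "\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD";
        System.out.println(direction(hebrew));          // RTL
        System.out.println(direction("Hebrew text"));   // LTR
        System.out.println(direction("abc " + hebrew)); // MIXED
    }
}
```

A check like this can be handy when debugging: if the input is reported as RTL but the PDF shows the letters reversed, the problem is probably in the layout step (for example a missing dir="rtl") rather than in the file itself.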
Important: this solution requires at least iText and XML Worker 5.5.5, because support for the dir attribute was introduced in 5.5.4 and improved in 5.5.5.
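If you are not sure which version ends up on your classpath, a small guard can make the requirement explicit. The following is a minimal sketch, assuming the release is available as a plain dotted numeric string such as "5.5.5". How you obtain that string (build file, JAR manifest, or the library's own version class) is left open here, and the VersionCheck class and its method names are hypothetical.

```java
class VersionCheck {
    // Minimum release the answer recommends for dir="rtl" support.
    static final String MIN_RELEASE = "5.5.5";

    // Compares two dotted numeric versions component by component.
    // Missing components count as 0, so "5.5" equals "5.5.0".
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // True when the given release is at least MIN_RELEASE.
    static boolean isRecentEnough(String release) {
        return compareVersions(release, MIN_RELEASE) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isRecentEnough("5.5.4"));  // false
        System.out.println(isRecentEnough("5.5.5"));  // true
        System.out.println(isRecentEnough("5.5.10")); // true
    }
}
```

Comparing the components as numbers rather than as strings matters: a plain string comparison would rank "5.5.10" below "5.5.5".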