When I convert a docx document to PDF, my national characters turn into "#" marks.
Is there any way to set a font encoding for pdf documents?
I used xdocreport in the past and it can handle that, but I had problems with images, headers and footers.
Docx4j handles images, headers and footers, but not the fonts. After conversion the fonts have ANSI encoding, while I'd like them to have windows-1250. Is there an option to set this?
My problem was that the proper TrueType fonts were missing on the Linux server. Default fonts were substituted instead (without my code pages).
I solved the problem by installing the default MS Windows fonts via ttf-mscorefonts-installer.
On Debian:
apt-get install ttf-mscorefonts-installer
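To confirm that the fonts actually landed on the server after installing the package, a small check like the following may help. It is only a sketch: the class name FontFileCheck and the default directory /usr/share/fonts are assumptions (the directory is typical on Debian but may differ on your system), and it only looks for .ttf files by file name, not at what the converter will actually pick up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FontFileCheck {

    // Returns all .ttf files under root whose file name contains namePart (case-insensitive).
    static List<Path> findTrueTypeFonts(Path root, String namePart) throws IOException {
        String wanted = namePart.toLowerCase(Locale.ROOT);
        try (Stream<Path> files = Files.walk(root)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                        return name.endsWith(".ttf") && name.contains(wanted);
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: system fonts live under /usr/share/fonts; pass another path as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "/usr/share/fonts");
        String namePart = args.length > 1 ? args[1] : "arial";
        if (!Files.isDirectory(root)) {
            System.out.println("Font directory not found: " + root);
            return;
        }
        List<Path> hits = findTrueTypeFonts(root, namePart);
        if (hits.isEmpty()) {
            System.out.println("No TrueType font matching '" + namePart + "' under " + root);
        } else {
            hits.forEach(System.out::println);
        }
    }
}
```

If the list is empty for the font your document uses, the converter is probably still falling back to a default font.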
I had the same problem and found, as you mentioned yourself, that it is a font problem. The font on the system needs to support your encoding.
E.g., for documents using the "Arial" font, German umlaut characters are shown as "?".
I found another solution, which is to override the PDF font encoding as follows:
// Read the template
File docxFile = new File(System.getProperty("user.dir"), "Test.docx");
try (InputStream in = new FileInputStream(docxFile)) {
    // Prepare the document context
    IXDocReport report = XDocReportRegistry.getRegistry().loadReport(in, TemplateEngineKind.Velocity);
    IContext context = report.createContext();
    context.put("name", "Michael Küfner");
    // Generate the PDF output, overriding the PDF font encoding
    Options options = Options.getTo(ConverterTypeTo.PDF).via(ConverterTypeVia.XWPF);
    PdfOptions pdfOptions = PdfOptions.create();
    pdfOptions.fontEncoding("iso-8859-15");
    options.subOptions(pdfOptions);
    // Close the output stream once the conversion is done
    try (OutputStream out = new FileOutputStream(new File(docxFile.getPath() + ".pdf"))) {
        report.convert(context, options, out);
    }
}
Try setting the value passed to pdfOptions.fontEncoding (in my case "iso-8859-15") to suit your needs.
Setting it to "UTF-8", which seems to be the default, resulted in the same problem with special characters.
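When picking a value for fontEncoding, it may help to first check which charsets can represent your text at all. The sketch below uses only the JDK's java.nio.charset API. The class name EncodingCheck and the candidate list are assumptions made for illustration. It says nothing about whether the font itself contains the glyphs, so it can only rule encodings out.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

final class EncodingCheck {

    // True if every character of text can be encoded in the named charset.
    static boolean canEncode(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    // Returns the candidates (in the given order) that are supported by this JVM
    // and can encode the whole text.
    static List<String> usableEncodings(String text, String... candidates) {
        List<String> usable = new ArrayList<>();
        for (String name : candidates) {
            if (Charset.isSupported(name) && canEncode(name, text)) {
                usable.add(name);
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        // Sample text with a German umlaut; pass your own text as the first argument.
        String sample = args.length > 0 ? args[0] : "Michael K\u00fcfner";
        System.out.println(usableEncodings(sample,
                "iso-8859-1", "iso-8859-15", "iso-8859-2", "windows-1250"));
    }
}
```

For example, a Polish "ł" cannot be encoded in iso-8859-15 but can be in iso-8859-2, so for Central European text a Latin-2 style code page such as windows-1250 would be the more likely candidate.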
Another thing I found:
With the "Calibri" font, which is the default in Word 2007/2010, the problem did not occur, even when using UTF-8 encoding. Maybe the embedded Type 1 Arial font in iText, which is used for generating the PDFs, does not support UTF-8 encoding.
Source: https://stackoverflow.com/questions/12327977/how-to-change-font-encoding-when-converting-docx-pdf-with-docx4j