问题
I'm trying to generate a PDF report consisting sentences in multiple languages. For that I'm using Google NOTO fonts but google CJK fonts doesn't support some of the Latin special characters due to that my PDF box is failing to generate a report or sometime shows weird characters.
Any one has any appropriate solution ? I tried multiple things but unable to find a single TTF file that can support all Unicode. I also tried fallback to different font file but that will be too much work.
Languages I support:- Japanese, German, Spanish, Portuguese, English.
Note:- I don't want to use arialuni.ttf file due to Licensing issue.
Can anyone suggest anything.
回答1:
Here is the code that will be in release 2.0.14 in the examples subproject:
/**
* Output a text without knowing which font is the right one. One use case is a worldwide
* address list. Only LTR languages are supported, RTL (e.g. Hebrew, Arabic) are not
* supported so they would appear in the wrong direction.
* Complex scripts (Thai, Arabic, some Indian languages) are also not supported, any output
* will look weird. There is an (unfinished) effort here:
* https://issues.apache.org/jira/browse/PDFBOX-4189
*
* @author Tilman Hausherr
*/
public class EmbeddedMultipleFonts
{
public static void main(String[] args) throws IOException
{
try (PDDocument document = new PDDocument())
{
PDPage page = new PDPage(PDRectangle.A4);
document.addPage(page);
PDFont font1 = PDType1Font.HELVETICA; // always have a simple font as first one
TrueTypeCollection ttc2 = new TrueTypeCollection(new File("c:/windows/fonts/batang.ttc"));
PDType0Font font2 = PDType0Font.load(document, ttc2.getFontByName("Batang"), true); // Korean
TrueTypeCollection ttc3 = new TrueTypeCollection(new File("c:/windows/fonts/mingliu.ttc"));
PDType0Font font3 = PDType0Font.load(document, ttc3.getFontByName("MingLiU"), true); // Chinese
PDType0Font font4 = PDType0Font.load(document, new File("c:/windows/fonts/mangal.ttf")); // Indian
PDType0Font font5 = PDType0Font.load(document, new File("c:/windows/fonts/ArialUni.ttf")); // Fallback
try (PDPageContentStream cs = new PDPageContentStream(document, page))
{
cs.beginText();
List<PDFont> fonts = new ArrayList<>();
fonts.add(font1);
fonts.add(font2);
fonts.add(font3);
fonts.add(font4);
fonts.add(font5);
cs.newLineAtOffset(20, 700);
showTextMultiple(cs, "abc 한국 中国 भारत 日本 abc", fonts, 20);
cs.endText();
}
document.save("example.pdf");
}
}
static void showTextMultiple(PDPageContentStream cs, String text, List<PDFont> fonts, float size)
throws IOException
{
try
{
// first try all at once
fonts.get(0).encode(text);
cs.setFont(fonts.get(0), size);
cs.showText(text);
return;
}
catch (IllegalArgumentException ex)
{
// do nothing
}
// now try separately
int i = 0;
while (i < text.length())
{
boolean found = false;
for (PDFont font : fonts)
{
try
{
String s = text.substring(i, i + 1);
font.encode(s);
// it works! Try more with this font
int j = i + 1;
for (; j < text.length(); ++j)
{
String s2 = text.substring(j, j + 1);
if (isWinAnsiEncoding(s2.codePointAt(0)) && font != fonts.get(0))
{
// Without this segment, the example would have a flaw:
// This code tries to keep the current font, so
// the second "abc" would appear in a different font
// than the first one, which would be weird.
// This segment assumes that the first font has WinAnsiEncoding.
// (all static PDType1Font Times / Helvetica / Courier fonts)
break;
}
try
{
font.encode(s2);
}
catch (IllegalArgumentException ex)
{
// it's over
break;
}
}
s = text.substring(i, j);
cs.setFont(font, size);
cs.showText(s);
i = j;
found = true;
break;
}
catch (IllegalArgumentException ex)
{
// didn't work, will try next font
}
}
if (!found)
{
throw new IllegalArgumentException("Could not show '" + text.substring(i, i + 1) +
"' with the fonts provided");
}
}
}
static boolean isWinAnsiEncoding(int unicode)
{
String name = GlyphList.getAdobeGlyphList().codePointToName(unicode);
if (".notdef".equals(name))
{
return false;
}
return WinAnsiEncoding.INSTANCE.contains(name);
}
}
Alternatives to arialuni can be found here: https://en.wikipedia.org/wiki/Open-source_Unicode_typefaces
来源:https://stackoverflow.com/questions/54248117/pdfbox-not-supporting-multiple-languages