How to check for fully embedded and subset-embedded fonts using PDFBox

Submitted by 你。 on 2019-12-11 08:51:30

Question


Hi, I want to check whether the fonts in a PDF are fully embedded or embedded as subsets, using PDFBox. I have tried the following logic:


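import java.util.Map;
import java.util.Set;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDFontDescriptor;
import org.apache.pdfbox.pdmodel.font.PDFontDescriptorDictionary;

// Returns true only if every listed font carries an embedded font program.
private boolean IsEmbedded(Map<String, PDFont> fontsMap, Set<String> keys) {
    for (String key : keys) {
        PDFont font = fontsMap.get(key);
        PDFontDescriptor fontDescriptor = font.getFontDescriptor();
        // Without a descriptor dictionary there is no font file entry, so the font is not embedded.
        if (!(fontDescriptor instanceof PDFontDescriptorDictionary)) {
            return false;
        }
        PDFontDescriptorDictionary dict = (PDFontDescriptorDictionary) fontDescriptor;
        // FontFile = Type 1, FontFile2 = TrueType, FontFile3 = format given by its Subtype (e.g. Type1C, OpenType).
        if (dict.getFontFile() == null && dict.getFontFile2() == null && dict.getFontFile3() == null) {
            return false;
        }
    }
    return true;
}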

However, I cannot find a way to differentiate between full embedding and subset embedding. Can anyone tell me how to do this?


Answer 1:


To quote the PDF specification ISO 32000-1 on font subsets (section 9.6.4):

PDF documents may include subsets of Type 1 and TrueType fonts. The font and font descriptor that describe a font subset are slightly different from those of ordinary fonts. These differences allow a conforming reader to recognize font subsets and to merge documents containing different subsets of the same font. (For more information on font descriptors, see 9.8, "Font Descriptors".)

For a font subset, the PostScript name of the font — the value of the font’s BaseFont entry and the font descriptor’s FontName entry — shall begin with a tag followed by a plus sign (+). The tag shall consist of exactly six uppercase letters; the choice of letters is arbitrary, but different subsets in the same PDF file shall have different tags.

EXAMPLE EOODIA+Poetica is the name of a subset of Poetica®, a Type 1 font.

In a PDF that complies with this requirement ("shall", so it really is a requirement), you can therefore recognize subset fonts by their name.
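The naming rule quoted above can be sketched as a simple pattern check. This is only a minimal sketch: the class and method names (SubsetFontNames, isSubsetFontName, stripSubsetTag) are hypothetical helpers and not part of PDFBox. It assumes you pass in the font name as a string, for instance the value returned by PDFontDescriptor.getFontName().

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of PDFBox: checks the subset-tag naming rule. */
final class SubsetFontNames {
    // Exactly six uppercase ASCII letters, a plus sign, then a non-empty font name.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+.+");

    private SubsetFontNames() {}

    /** Returns true if the name carries a subset tag such as "EOODIA+". */
    static boolean isSubsetFontName(String fontName) {
        return fontName != null && SUBSET_TAG.matcher(fontName).matches();
    }

    /** Strips the subset tag, if present, to recover the underlying font name. */
    static String stripSubsetTag(String fontName) {
        // The tag is six letters plus the '+' sign, i.e. seven characters.
        return isSubsetFontName(fontName) ? fontName.substring(7) : fontName;
    }
}
```

With the specification's own example, isSubsetFontName("EOODIA+Poetica") returns true and stripSubsetTag yields "Poetica", while a plain name such as "Poetica" is reported as not tagged.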

Keep in mind, though, that outside of PDF you can derive a font from another one by keeping only selected glyphs. This essentially creates a subset font, but PDF-creating software that uses it may not notice that fact and may embed it under a plain name, as if it were a full font. So, in essence, you can never know for sure.
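Because of this caveat, a check based on names can only confirm a subset, never a full embedding. One way to make that explicit is a three-way result, sketched here under assumptions: the enum and class names are hypothetical, and hasFontFile stands for "the descriptor has a FontFile, FontFile2 or FontFile3 entry", which the caller must determine with PDFBox.

```java
/** Hypothetical three-way result; the names are illustrative only. */
enum Embedding { NOT_EMBEDDED, SUBSET, FULL_OR_UNTAGGED_SUBSET }

/** Hypothetical helper, not part of PDFBox. */
final class EmbeddingClassifier {
    private EmbeddingClassifier() {}

    static Embedding classify(boolean hasFontFile, String fontName) {
        if (!hasFontFile) {
            return Embedding.NOT_EMBEDDED;
        }
        // Six uppercase letters followed by '+' mark a declared subset.
        boolean tagged = fontName != null && fontName.matches("[A-Z]{6}\\+.+");
        // An untagged embedded font may still be a subset made outside the PDF tool.
        return tagged ? Embedding.SUBSET : Embedding.FULL_OR_UNTAGGED_SUBSET;
    }
}
```

The third value is deliberately not called FULL, since an untagged font may have been reduced before the PDF software ever saw it.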



Source: https://stackoverflow.com/questions/21392432/how-to-check-fully-embedded-and-subset-embedded-font-using-pdfbox
