When building PDF documents with OpenType fonts in iText, I want to access glyph variants from within the font -- specifically tabular figures. Since OpenType glyph variants do
This does not seem to be possible either in the latest tag 5.5.8 or in the master branch of iText.
As explained in this article and in Microsoft's OpenType font file specification, glyph variants are stored in the Glyph Substitution Table (GSUB) of a font file. Accessing the glyph variants requires reading this table from the file, which is implemented in the class com.itextpdf.text.pdf.fonts.otf.GlyphSubstitutionTableReader, though this class is disabled for now.
The call readGsubTable() in the class com.itextpdf.text.pdf.TrueTypeFontUnicode is commented out:
    void process(byte ttfAfm[], boolean preload) throws DocumentException, IOException {
        super.process(ttfAfm, preload);
        //readGsubTable();
    }
It turns out that this line is disabled for a reason: the code does not work if you re-enable it.
So, unfortunately, there is no way to use glyph variants, as the substitution information is never loaded from the font file.
Update
The original answer was about whether the iText API can be used to access glyph variants out of the box, which is not possible yet. However, the low-level code is in place and can be used, after some hacking, to access the glyph substitution mapping table.
When read() is called, the GlyphSubstitutionTableReader reads the GSUB table and flattens the substitutions of all features into one map, Map<Integer, List<Integer>> rawLigatureSubstitutionMap. The symbolic names of the features are currently discarded by OpenTypeFontTableReader. The rawLigatureSubstitutionMap maps a variant glyphId to a base glyphId, or a ligature glyphId to a sequence of glyphIds, like this:
629 -> 66 // a.feature -> a
715 -> 71, 71, 77 // ffl ligature
This mapping can be reversed to get all variants for a base glyphId. So all extended glyphs with unknown Unicode values can be identified through their connection to a base glyph or a sequence of glyphs.
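To illustrate the reversal, here is a minimal sketch in plain Java that does not touch iText at all. The class name SubstitutionMapReverser and its reverse() method are made up for this example; the input has the same shape as rawLigatureSubstitutionMap and uses the two sample entries from above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of iText.
public class SubstitutionMapReverser {

    /**
     * Reverses a "variant glyphId -> base glyphId sequence" map into
     * "base glyphId sequence -> list of variant glyphIds".
     */
    public static Map<List<Integer>, List<Integer>> reverse(Map<Integer, List<Integer>> raw) {
        Map<List<Integer>, List<Integer>> variantsByBase = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> entry : raw.entrySet()) {
            // Copy the sequence so the key is independent of the source list.
            List<Integer> base = new ArrayList<>(entry.getValue());
            variantsByBase.computeIfAbsent(base, k -> new ArrayList<>()).add(entry.getKey());
        }
        return variantsByBase;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> raw = new HashMap<>();
        raw.put(629, Arrays.asList(66));         // a.feature -> a
        raw.put(715, Arrays.asList(71, 71, 77)); // ffl ligature

        Map<List<Integer>, List<Integer>> reversed = reverse(raw);
        System.out.println(reversed.get(Arrays.asList(66)));         // prints [629]
        System.out.println(reversed.get(Arrays.asList(71, 71, 77))); // prints [715]
    }
}
```

The key is a list rather than a single Integer so that single-glyph variants and ligatures can live in the same reversed map. Since the feature names are discarded, a base glyph with several variants returns all of them without saying which feature each one belongs to.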
Next, to be able to write a glyph to PDF, we need to know the Unicode value for that glyphId. The relationship unicode -> glyphId is stored in the cmap31 field of TrueTypeFont. Reversing the map gives the Unicode value by glyphId.
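A rough sketch of that second reversal, again in plain Java with a stand-in map instead of the real field. It assumes the cmap has the shape Map<Integer, int[]> with the glyphId at index 0 of each value, which is how I read cmap31; the class name CmapReverser, the method name and the width values are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of iText.
public class CmapReverser {

    /**
     * Builds a glyphId -> Unicode code point map from a
     * code point -> {glyphId, ...} map (assumed layout, glyphId at index 0).
     * If several code points share a glyph, the lowest code point is kept.
     */
    public static Map<Integer, Integer> glyphToUnicode(Map<Integer, int[]> cmap) {
        Map<Integer, Integer> result = new HashMap<>();
        for (Map.Entry<Integer, int[]> entry : cmap.entrySet()) {
            int glyphId = entry.getValue()[0];
            result.merge(glyphId, entry.getKey(), Integer::min);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, int[]> cmap = new HashMap<>();
        cmap.put(0x61, new int[] {66, 500});    // 'a' -> glyph 66 (width is made up)
        cmap.put(0xFB04, new int[] {715, 830}); // ffl ligature -> glyph 715 (width is made up)

        Map<Integer, Integer> unicodeByGlyph = glyphToUnicode(cmap);
        System.out.println(Integer.toHexString(unicodeByGlyph.get(66)));  // prints 61
        System.out.println(Integer.toHexString(unicodeByGlyph.get(715))); // prints fb04
    }
}
```

Keeping the lowest code point is an arbitrary tie-break; any deterministic rule works, as long as a glyph that is reachable from more than one code point always resolves to the same one.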
Tweaking
rawLigatureSubstitutionMap cannot be accessed in GlyphSubstitutionTableReader, as it is a private member and has no getter. The simplest hack is to copy-paste the original class and add a getter for the map:
    public class HackedGlyphSubstitutionTableReader extends OpenTypeFontTableReader {

        // copy-pasted code ...

        public Map<Integer, List<Integer>> getRawSubstitutionMap() {
            return rawLigatureSubstitutionMap;
        }
    }
The next problem is that GlyphSubstitutionTableReader needs the offset of the GSUB table, information that is stored in the protected HashMap<String, int[]> tables field of the TrueTypeFont class. A helper class placed in the same package bridges access to the protected members of TrueTypeFont.
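    package com.itextpdf.text.pdf;

    import com.itextpdf.text.pdf.fonts.otf.FontReadingException;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class GsubHelper {

        private Map<Integer, List<Integer>> rawSubstitutionMap;

        public GsubHelper(TrueTypeFont font) {
            // get table offsets from the font instance
            Map<String, int[]> tables = font.tables;
            if (tables.get("GSUB") != null) {
                // Reverse cmap31 (unicode -> glyphId) into glyphId -> character.
                // Assumption: each cmap31 value holds the glyphId at index 0.
                Map<Integer, Character> glyphToCharacterMap = new HashMap<>();
                for (Map.Entry<Integer, int[]> entry : font.cmap31.entrySet()) {
                    glyphToCharacterMap.put(entry.getValue()[0], (char) entry.getKey().intValue());
                }
                HackedGlyphSubstitutionTableReader gsubReader;
                try {
                    gsubReader = new HackedGlyphSubstitutionTableReader(
                            font.rf, tables.get("GSUB")[0], glyphToCharacterMap, font.glyphWidthsByIndex);
                    gsubReader.read();
                } catch (IOException | FontReadingException e) {
                    // keep the original exception as the cause
                    throw new IllegalStateException(e.getMessage(), e);
                }
                rawSubstitutionMap = gsubReader.getRawSubstitutionMap();
            }
        }

        /** Returns the glyphId substitution map, or null if the font has no GSUB table. */
        public Map<Integer, List<Integer>> getRawSubstitutionMap() {
            return rawSubstitutionMap;
        }
    }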
It would be nicer to extend TrueTypeFont, but that would not work with the createFont() factory methods of BaseFont, which rely on hard-coded class names when creating a font.