To CID, Or Not To CID

Dr Ken Lunde
10 min read · Nov 22, 2020

By Dr Ken Lunde, Janitor, Spirits of Christmas Past

Photo by Stéfan on Instagram. Used by permission.

That is the question…

This article, which is intended to convey tidbits from nearly three decades of experience in developing East Asian type (aka fonts), uses parlance that only those in type development are likely to understand—or appreciate.

What Are CIDs?

CID is an abbreviation for Character ID or Character Identifier. A CID refers to a glyph in a font resource by an integer value in the range 0 through 65534. Such font resources are CID-keyed, because their glyphs are associated with CIDs. The original CID-keyed fonts were resources installed into PostScript printers, where they were used with compatible CMap resources that map character codes, such as Shift-JIS or Unicode, to CIDs in CID-keyed fonts.

In terms of modern font formats, CID-keyed fonts are deployed as OpenType/CFF fonts, which means that such fonts include a CFF (Compact Font Format) table whose glyphs are CID-keyed. I refer to such fonts as CID-keyed OpenType/CFF. The other side of the coin is name-keyed fonts, whose glyphs are associated with names that are composed of letters, digits, and a limited number of symbols. The AGL Specification is the definitive resource for glyph names in name-keyed fonts.

There is one important characteristic of CID-keyed fonts that needs to be pointed out, because it will become important later in this article:

CIDs need not be contiguous.

In other words, a CID-keyed font that includes CIDs 0, 2 through 10, and 12 is completely valid.

In addition, the only CID that is actually required in a CID-keyed font is CID+0, which serves as the mandatory .notdef (not defined) glyph. The smallest functional CID-keyed font therefore includes only two glyphs: CIDs 0 and 1.

Adobe Blank 2, designed and developed by yours truly over five years ago and released on April 1st of 2015, is an example of a CID-keyed font that includes only CIDs 0 and 1.

In terms of practical examples, when developing Japanese fonts that include only the glyphs for kana (かな), punctuation, and some symbols, it makes sense to use the Adobe-Japan1-7 ROS (Registry, Ordering, and Supplement) so that its resources, such as pre-defined OpenType features and CMap resources, can be leveraged. Thousands of CIDs that correspond to kanji (漢字) and other symbols are explicitly excluded. See the Kana Subset Definitions section of the Adobe-Japan1-7 specification for more information about kana subsets.

What Are GIDs?

GID is an abbreviation for Glyph ID or Glyph Identifier. Name-keyed OpenType/CFF and TrueType fonts use GIDs. CID-keyed OpenType/CFF fonts also use GIDs for all tables except for the CFF one, and the CFF table maintains a mapping between CIDs and GIDs.

Unlike CIDs, which need not be contiguous, GIDs must be contiguous. If the CIDs in a CID-keyed font are completely contiguous, which is very common, CIDs equal GIDs.

CIDs and GIDs are related in that CID-keyed OpenType/CFF fonts reference GIDs in tables other than the CFF one, such as in the cmap, GPOS, and GSUB tables. As stated before, the CFF table maintains a mapping between CIDs and GIDs. To use the example from earlier in this article—a CID-keyed font that includes CIDs 0, 2 through 10, and 12—its equivalent GID range is simply 0 through 10, as shown below:

CID GID
0 0
2 1
3 2
4 3
5 4
6 5
7 6
8 7
9 8
10 9
12 10
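
The table above can be reproduced with a few lines of Python. This is only a sketch: it assumes that glyphs are stored in ascending CID order, as in the table, and the function name cid_to_gid_map is hypothetical rather than part of AFDKO or FontTools.

```python
def cid_to_gid_map(cids):
    """Assign contiguous GIDs, starting at 0, to CIDs in ascending order."""
    ordered = sorted(set(cids))
    if not ordered or ordered[0] != 0:
        # CID 0 serves as the mandatory .notdef glyph
        raise ValueError("CID 0 (.notdef) is required")
    return {cid: gid for gid, cid in enumerate(ordered)}


# CIDs 0, 2 through 10, and 12, as in the example above
mapping = cid_to_gid_map([0, *range(2, 11), 12])
for cid, gid in mapping.items():
    print(cid, gid)
```

When the CIDs are completely contiguous, the resulting mapping is the identity, which is the case in which CIDs equal GIDs.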

Why Develop CID-keyed Fonts?

I am glad that you asked, because not all fonts need to be CID-keyed, though some fonts certainly benefit from being CID-keyed. Below is a short series of bullet points to consider:

  • Mappings from Unicode code points to CIDs are easily controlled via a UTF-32 CMap resource that is specified as the argument of the AFDKO (Adobe Font Development Kit for OpenType) makeotf tool’s “-ch” command-line option. This is arguably more important when multiple Unicode code points map to the same glyph, which is common in East Asian fonts. Of course, the AFDKO GlyphOrderAndAliasDB file serves this function for name-keyed fonts.
  • The order of CIDs is implicit according to their integer values, and is not subject to change unless intentionally done. The order of glyphs in name-keyed fonts, on the other hand, is subject to both intentional and unintentional manipulation.
  • For CID-keyed fonts that adhere to one of Adobe’s public character collections (aka glyph sets)—Adobe-CNS1-7, Adobe-GB1-5, Adobe-Japan1-7, or Adobe-KR-9—existing resources, such as UTF-32 CMap resources, GSUB feature definitions, and UVS (Unicode Variation Sequence) definition files, can be easily leveraged to ease development.
  • Adobe InDesign requires that a font be CID-keyed OpenType/CFF or TrueType in order to support custom Mojikumi (文字組み) settings. This is mainly a concern for Japanese fonts, and to some extent, Chinese fonts as well.
  • Fonts that include glyphs for multiple scripts—such as East Asian ones that typically include a basic set of glyphs for Latin, and often glyphs for Greek and Cyrillic—benefit from multiple FDArray (Font Dictionary Array) elements that can specify different hinting parameters, such as stem widths and alignment zones.
  • Even within a given script, such as Latin, multiple FDArray elements give type developers the flexibility to apply different hinting parameters, such as separate alignment zones for related glyphs whose zones would otherwise be too close together. A good example is standard versus cap-height digits, both of which may be present in a font.

With all that stated, nothing prevents a type developer from making name-keyed fonts, which are the default. In fact, for many purposes, name-keyed fonts are perfectly fine, even East Asian ones.

Developing CID-keyed Fonts

Developing CID-keyed OpenType/CFF fonts from name-keyed font source data is most easily accomplished by using the AFDKO mergefonts and makeotf tools, at least in my experience. Although published nearly 14 years ago, Adobe Tech Note #5900, AFDKO Version 2.0 Tutorial: mergeFonts, rotateFont & autohint, is a helpful resource to understand how the AFDKO mergefonts tool can be used to manipulate and develop CID-keyed fonts.

Converting Name-keyed Fonts to CID-keyed Fonts

Converting existing name-keyed OpenType/CFF fonts into CID-keyed OpenType/CFF fonts is, not surprisingly, most easily accomplished by using the same AFDKO mergefonts tool. The only requirements are that the following two files be supplied:

  1. A cidfontinfo file that is specified as the argument of the “-cid” command-line option.
  2. A file that maps glyph names to CIDs.

Below is a minimal cidfontinfo file whose values come from the infamous Adobe Blank:

FontName (AdobeBlank)
FullName (Adobe Blank)
FamilyName (Adobe Blank)
version (1.045)
Registry (Adobe)
Ordering (Identity)
Supplement 0
isFixedPitch true

Assuming that a name-keyed font maps the glyphs named .notdef, space, and A through C to GIDs 0 through 4, the following is a complete mergefonts mapping file that maps those glyphs to the corresponding CIDs:

mergefonts
0 .notdef
1 space
2 A
3 B
4 C

Note: The first line of an AFDKO mergefonts mapping file must include the keyword “mergefonts”.

The following command line results in a CIDFont resource that can be used by the AFDKO makeotf tool to produce a CID-keyed OpenType/CFF font:

mergefonts -cid cidfontinfo cidfont.ps font.map <name-keyed font>

For those who are comfortable using Python and FontTools, it is very easy to write a Python script—or define functions—that create the necessary files so that the AFDKO mergefonts tool can be used to convert an existing name-keyed OpenType/CFF font into a CID-keyed one that advertises the special-purpose Adobe-Identity-0 ROS. The following methods can be used to create the mergefonts mapping file, which maps glyph names to CIDs that necessarily equal GIDs:

fontTools.ttLib.TTFont.getGlyphOrder()
fontTools.ttLib.TTFont.getGlyphID()

The first method is used to create a list of the glyph names in the font, and the second method is used to map those glyph names to GIDs. The combination is then used to synthesize a mergefonts mapping file.
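
A minimal sketch of that combination might look like the following. The helper names build_mergefonts_map and map_for_font are hypothetical, and the second one assumes that the fonttools package is installed; only getGlyphOrder() and getGlyphID() come from FontTools itself.

```python
def build_mergefonts_map(pairs):
    """Return mergefonts mapping-file text from (GID, glyph name) pairs.

    Each GID is written as the CID, so CIDs necessarily equal GIDs.
    """
    lines = ["mergefonts"]  # the required first line of a mapping file
    for gid, name in sorted(pairs):
        lines.append(f"{gid} {name}")
    return "\n".join(lines) + "\n"


def map_for_font(path):
    """Build the mapping-file text for the font at the given path."""
    from fontTools.ttLib import TTFont  # requires the fonttools package

    font = TTFont(path)
    # getGlyphOrder() lists the glyph names; getGlyphID() maps each to its GID
    pairs = [(font.getGlyphID(name), name) for name in font.getGlyphOrder()]
    return build_mergefonts_map(pairs)


# The five-glyph example from earlier, without needing a font file
example = [(0, ".notdef"), (1, "space"), (2, "A"), (3, "B"), (4, "C")]
print(build_mergefonts_map(example), end="")
```

Writing the result of map_for_font() to a file such as font.map gives the mapping file that the mergefonts command line expects.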

The following method is then used to extract the necessary name table strings, such as name.ID=4, name.ID=6 and name.ID=16, or name.ID=1 if name.ID=16 does not exist, to create a minimal cidfontinfo file that specifies the special-purpose Adobe-Identity-0 ROS:

fontTools.ttLib.TTFont["name"].getDebugName()
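
As a rough sketch, the cidfontinfo file could be synthesized as follows. The helper names are hypothetical, and two choices here are assumptions rather than anything the text above prescribes: the version string is taken from the head table's fontRevision value, and the isFixedPitch line is written only when a value is supplied.

```python
def build_cidfontinfo(font_name, full_name, family_name, version,
                      is_fixed_pitch=None):
    """Return cidfontinfo text that specifies the Adobe-Identity-0 ROS."""
    lines = [
        f"FontName ({font_name})",
        f"FullName ({full_name})",
        f"FamilyName ({family_name})",
        f"version ({version})",
        "Registry (Adobe)",
        "Ordering (Identity)",
        "Supplement 0",
    ]
    if is_fixed_pitch is not None:
        lines.append(f"isFixedPitch {'true' if is_fixed_pitch else 'false'}")
    return "\n".join(lines) + "\n"


def cidfontinfo_for_font(path):
    """Build cidfontinfo text from the name table of the font at path."""
    from fontTools.ttLib import TTFont  # requires the fonttools package

    font = TTFont(path)
    name = font["name"]
    # name.ID=16 is preferred; fall back to name.ID=1 if it does not exist
    family = name.getDebugName(16) or name.getDebugName(1)
    # Assumption: derive the version string from head.fontRevision
    version = f"{font['head'].fontRevision:.3f}"
    return build_cidfontinfo(
        name.getDebugName(6), name.getDebugName(4), family, version
    )


print(build_cidfontinfo("AdobeBlank", "Adobe Blank", "Adobe Blank", "1.045",
                        is_fixed_pitch=True), end="")
```

Called with the Adobe Blank values, build_cidfontinfo() reproduces the minimal cidfontinfo file shown earlier.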

The process involves the following four relatively simple steps:

  1. Execute the AFDKO tx tool to extract the CFF table of a name-keyed OpenType/CFF font as a Type 1 (aka PFA or Printer Font ASCII) font
  2. Execute the AFDKO mergefonts tool to create a CIDFont resource
  3. Execute the AFDKO tx tool to convert the CIDFont resource into a stand-alone CID-keyed CFF resource (and subroutinizing it while we’re at it by specifying the “+S” command-line option)
  4. Execute the AFDKO sfntedit tool to replace the original name-keyed CFF table with the stand-alone CID-keyed CFF resource

Sort of like the following:

tx -t1 <name-keyed OTF> font.pfa
mergefonts -cid cidfontinfo cidfont.ps font.map font.pfa
tx -cff +S cidfont.ps font.cff
sfntedit -a CFF=font.cff <name-keyed OTF>
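
For those who prefer to drive the four steps from Python, the following sketch wraps them with the standard subprocess module. The function names are hypothetical, the intermediate file names follow the steps above, and the third command takes the CIDFont resource produced by the second one as its input. It assumes that the AFDKO tools are on the PATH.

```python
import subprocess


def conversion_commands(otf_path, cidfontinfo="cidfontinfo", mapping="font.map"):
    """Return the four command lines as argument lists, in execution order."""
    return [
        # 1. Extract the CFF table as a Type 1 (PFA) font
        ["tx", "-t1", otf_path, "font.pfa"],
        # 2. Create the CIDFont resource
        ["mergefonts", "-cid", cidfontinfo, "cidfont.ps", mapping, "font.pfa"],
        # 3. Convert the CIDFont resource into a subroutinized CFF resource
        ["tx", "-cff", "+S", "cidfont.ps", "font.cff"],
        # 4. Replace the original CFF table
        ["sfntedit", "-a", "CFF=font.cff", otf_path],
    ]


def convert(otf_path):
    """Run each step, stopping at the first one that fails."""
    for argv in conversion_commands(otf_path):
        subprocess.run(argv, check=True)


for argv in conversion_commands("Example-Regular.otf"):
    print(" ".join(argv))
```

Because the last command names no separate output file, it seems prudent to run convert() on a copy of the font.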

Multiple FDArray Elements

Given that one of the greatest advantages of CID-keyed fonts is the ability to have more than one FDArray element, for greater control over hinting parameters, I will explain how this is easily done using the AFDKO mergefonts tool.

The diagram below shows how multiple mergefonts mapping files can be used, either with a single name-keyed source font or multiple name-keyed source fonts, to create a CID-keyed font with multiple FDArray elements:

Consider the following three AFDKO mergefonts mapping files that specify the desired /FontName of the FDArray element as the argument of “mergefonts” on the first line of each file:

mergefonts Example-Regular-Latin
0 .notdef
1 A
2 B
3 C

mergefonts Example-Regular-Kana
4 uni3042
5 uni3044
6 uni3046

mergefonts Example-Regular-Ideographs
7 uni4E00
8 uni4E09
9 uni4E8C

Note: If a name is not specified as an argument of “mergefonts” on the first line, the /FontName of the source font is used as the /FontName of the resulting FDArray element of the CID-keyed font. Specifying the name explicitly is the best way to control the number of FDArray elements in a CID-keyed font.

The first mapping file maps glyphs named .notdef, A, B, and C to CIDs 0 through 3 in an FDArray element named Example-Regular-Latin, the second one maps glyphs named uni3042 (あ), uni3044 (い), and uni3046 (う) to CIDs 4 through 6 in an FDArray element named Example-Regular-Kana, and the third one maps glyphs for uni4E00 (一), uni4E09 (三), and uni4E8C (二) to CIDs 7 through 9 in an FDArray element named Example-Regular-Ideographs.

If all of the glyphs are in a single name-keyed source font, the following command line would be used, assuming that the AFDKO mergefonts mapping files are named 1.map, 2.map, and 3.map:

mergefonts -cid cidfontinfo cidfont.ps 1.map font.pfa 2.map font.pfa 3.map font.pfa

Of course, all three FDArray elements would inherit the same hinting parameters as specified in the name-keyed source font, font.pfa, which means that adjusting the hinting parameters becomes a post-processing step.

If the glyphs for each intended FDArray element are in separate name-keyed source fonts, named 1.pfa, 2.pfa, and 3.pfa, and with hinting parameters specified as intended, the following command line would be used:

mergefonts -cid cidfontinfo cidfont.ps 1.map 1.pfa 2.map 2.pfa 3.map 3.pfa
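
Both command lines follow the same pattern, namely a fixed prefix followed by alternating mapping files and source fonts, so they are easy to assemble programmatically. A small sketch, with the hypothetical helper name mergefonts_command:

```python
def mergefonts_command(cidfontinfo, output, pairs):
    """Return the mergefonts argument list for (mapping file, source font) pairs."""
    argv = ["mergefonts", "-cid", cidfontinfo, output]
    for mapping_file, source_font in pairs:
        argv += [mapping_file, source_font]
    return argv


# A single source font that is shared by all three mapping files
single = [("1.map", "font.pfa"), ("2.map", "font.pfa"), ("3.map", "font.pfa")]
# One source font per intended FDArray element
separate = [("1.map", "1.pfa"), ("2.map", "2.pfa"), ("3.map", "3.pfa")]

print(" ".join(mergefonts_command("cidfontinfo", "cidfont.ps", single)))
print(" ".join(mergefonts_command("cidfontinfo", "cidfont.ps", separate)))
```

The returned list can be passed directly to subprocess.run() when the AFDKO tools are installed.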

Of course, a UTF-32 CMap resource would need to be created that maps the appropriate Unicode code points to CIDs 1 through 9, which would be used with the AFDKO makeotf tool to build a functional CID-keyed OpenType/CFF font. Shown below is an excerpt of the CMap resource mapping blocks:

6 begincidchar
<00003042> 4
<00003044> 5
<00003046> 6
<00004e00> 7
<00004e09> 8
<00004e8c> 9
endcidchar
1 begincidrange
<00000041> <00000043> 1
endcidrange
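
Mapping blocks like these can be generated from a simple Unicode-to-CID dictionary. The sketch below is not a substitute for a real CMap compiler: the function name cmap_blocks is hypothetical, it produces only the mapping blocks and none of the surrounding CMap resource, and it assumes the usual conventions that a range varies only in its last byte and that a block holds at most 100 entries.

```python
def cmap_blocks(unicode_to_cid):
    """Return cidchar and cidrange mapping blocks for a UTF-32 CMap resource."""
    runs = []  # each run is [first_code, last_code, first_cid]
    for code, cid in sorted(unicode_to_cid.items()):
        if runs:
            first, last, first_cid = runs[-1]
            contiguous = (
                code == last + 1
                and cid == first_cid + (code - first)
                and code >> 8 == first >> 8  # only the last byte may differ
            )
            if contiguous:
                runs[-1][1] = code
                continue
        runs.append([code, code, cid])

    # Single code points become cidchar entries; longer runs become cidrange
    chars = [f"<{a:08x}> {c}" for a, b, c in runs if a == b]
    ranges = [f"<{a:08x}> <{b:08x}> {c}" for a, b, c in runs if a != b]

    lines = []
    for kind, entries in (("cidchar", chars), ("cidrange", ranges)):
        for i in range(0, len(entries), 100):  # at most 100 entries per block
            chunk = entries[i:i + 100]
            lines.append(f"{len(chunk)} begin{kind}")
            lines.extend(chunk)
            lines.append(f"end{kind}")
    return "\n".join(lines)


example = {
    0x41: 1, 0x42: 2, 0x43: 3,
    0x3042: 4, 0x3044: 5, 0x3046: 6,
    0x4E00: 7, 0x4E09: 8, 0x4E8C: 9,
}
print(cmap_blocks(example))
```

For the nine-glyph example font, the printed blocks match the excerpt above.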

Plenty of example CMap resources can be found in the open source CMap Resources project, and the industrial-strength cmap-tool.pl Perl script that I developed for compiling CMap resources can be found in the open source Perl Scripts project.

Variable Fonts & CFF2

Variable Fonts are quickly becoming more prevalent, with more and more environments being able to consume them. The notion of CID-keyed no longer applies to Variable Fonts, because the CFF2 (Compact Font Format Version 2) table uses only GIDs. However, the CFF2 table does support multiple FDArray elements, and I developed the very first examples that are available in the open source Variable Font Collection Test project, either as three six- or twelve-font Variable Font Collections or as twelve individual Variable Fonts.

Still, not all fonts need to be developed as Variable Fonts, so I suspect that CID-keyed OpenType/CFF fonts, along with name-keyed ones, will continue to be developed for the foreseeable future.

In closing, the topic of CID-keyed fonts is quite deep, and to some extent I feel as though I barely scratched the surface…

About the Author

Dr Ken Lunde worked at Adobe for over twenty-eight years — from 1991-07-01 to 2019-10-18 — specializing in CJKV Type Development, meaning that he architected and developed fonts for East Asian typefaces, along with the standards and specifications on which they are based. He architected and developed the Adobe-branded “Source Han” (Source Han Sans, Source Han Serif, and Source Han Mono) and Google-branded “Noto CJK” (Noto Sans CJK and Noto Serif CJK) open source Pan-CJK typeface families that were released in 2014, 2017, and 2019, is the author of CJKV Information Processing Second Edition (O’Reilly Media, 2009), and published over 300 articles on Adobe’s now-static CJK Type Blog. Ken earned BA (1987), MA (1988), and PhD (1994) degrees in linguistics from The University of Wisconsin-Madison, served as Adobe’s representative to the Unicode Consortium since 2006, was Adobe’s primary representative from 2015 until 2019, serves as Unicode’s IVD (Ideographic Variation Database) Registrar, attends UTC and IRG meetings, participates in the Unicode Editorial Committee, became an individual Unicode Life Member in 2018, received the 2018 Unicode Bulldog Award, was a Unicode Technical Director from 2018 to 2020, became a Vice-Chair of the Emoji Subcommittee in 2019, published UTN #43 (Unihan Database Property “kStrange”) in 2020, and became the Chair of the CJK & Unihan Group in 2021. He and his wife, Hitomi, are proud owners of a His & Hers pair of acceleration-boosted 2018 LR AWD Tesla Model 3 EVs.
