The Road to ‘Ideograph Hell’…

Dr Ken Lunde
Aug 10, 2023

is paved with turtles and dragons.

Figuratively, of course.

As we shall see in this article, that road is also paved with a whole bunch of other ideographs. Considering that there are nearly 100K Han ideographs in the Unicode Standard, the topic is effectively a bottomless pit, or infinite tunnel.

I mean, heck (sorry, hell), there are three standard forms of the ideographs that represent those two creatures, along with an emoji for good measure:

Turtle: U+9F9C 龜, U+4E80 亀, U+9F9F 龟, U+1F422 🐢 TURTLE
Dragon: U+9F8D 龍, U+7ADC 竜, U+9F99 龙, U+1F409 🐉 DRAGON

Those are the traditional, Japanese simplified, and Chinese simplified ideographs, respectively.
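For anyone who wants to poke at these code points directly, here is a minimal Python sketch that prints each character in U+ notation. It uses only the standard unicodedata module; the helper name u_plus is my own invention for illustration.

```python
import unicodedata


def u_plus(ch: str) -> str:
    """Format a single character in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


# Order: traditional, Japanese simplified, Chinese simplified, emoji.
# Escapes are used so the source survives any text normalization.
turtles = "\u9f9c\u4e80\u9f9f\U0001F422"
dragons = "\u9f8d\u7adc\u9f99\U0001F409"

for label, chars in (("Turtle", turtles), ("Dragon", dragons)):
    print(label + ":", ", ".join(f"{u_plus(c)} {c}" for c in chars))

# The two emoji have ordinary character names.
print(unicodedata.name("\U0001F422"), unicodedata.name("\U0001F409"))
```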

What makes matters worse is that all six ideographs are also encoded in the Kangxi Radicals or CJK Radicals Supplement blocks at completely separate code points:

Turtle: U+2FD4 ⿔, U+2EF2 ⻲, U+2EF3 ⻳
Dragon: U+2FD3 ⿓, U+2EEF ⻯, U+2EF0 ⻰
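One practical consequence is that a radical and its look-alike ideograph do not compare equal as strings. The sketch below, which assumes nothing beyond Python's standard unicodedata module, shows that NFKC normalization folds the two Kangxi Radicals into their unified ideographs. The function name fold_radical is hypothetical, and I would not assume that every character in the CJK Radicals Supplement block has such a mapping, so check before relying on it.

```python
import unicodedata

# Kangxi Radicals for dragon and turtle, written as escapes.
KANGXI_DRAGON = "\u2fd3"
KANGXI_TURTLE = "\u2fd4"


def fold_radical(ch: str) -> str:
    """Apply NFKC, which replaces a Kangxi radical with its
    compatibility-equivalent unified ideograph."""
    return unicodedata.normalize("NFKC", ch)


for ch in (KANGXI_DRAGON, KANGXI_TURTLE):
    folded = fold_radical(ch)
    print(
        f"U+{ord(ch):04X} ({unicodedata.category(ch)}) -> "
        f"U+{ord(folded):04X} ({unicodedata.category(folded)})"
    )
```

The general category changes as well: the radicals are symbols (So), while the ideographs are letters (Lo).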

We’re not done, because there are also instances of both in the CJK Compatibility Ideographs block:

Turtle: U+F907 龜, U+F908 龜, U+FACE 龜
Dragon: U+F9C4 龍
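These compatibility ideographs are fragile in practice, because they have canonical decompositions and so do not survive normalization. A minimal sketch, again using only the standard unicodedata module (canonical_target is a name I made up for this example):

```python
import unicodedata

# CJK Compatibility Ideographs listed above, as code points.
TURTLE_COMPAT = (0xF907, 0xF908, 0xFACE)
DRAGON_COMPAT = (0xF9C4,)


def canonical_target(cp: int) -> int:
    """Return the code point that NFC yields for a single code point."""
    return ord(unicodedata.normalize("NFC", chr(cp)))


for cp in TURTLE_COMPAT + DRAGON_COMPAT:
    print(f"U+{cp:04X} -> U+{canonical_target(cp):04X}")
```

Any text pipeline that applies NFC (or NFD, NFKC, or NFKD) will therefore quietly turn all three compatibility turtles into U+9F9C and the compatibility dragon into U+9F8D.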

Naturally, it doesn’t stop there, hence the entire point of this somewhat brief article.

Turtles

Case in point, a non-trivial number of turtles are present at the end of the CJK Unified Ideographs Extension B (nine), CJK Unified Ideographs Extension E (one), and CJK Unified Ideographs Extension F (five) blocks, and their code chart excerpts are shown below:

We can even find several turtle specimens in the registered Moji_Joho IVD (Ideographic Variation Database) collection as unifiable variants of U+9F9C 龜:

Dragons

Just like with turtles, and in keeping with the title of this article, there is something comparable for dragons (aka U+9F8D 龍) in the same IVD collection:
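For readers who have not handled IVD sequences in code: an IVS (Ideographic Variation Sequence) is a base ideograph followed by one variation selector from the range U+E0100 through U+E01EF (VS17 through VS256). The sketch below builds and splits such sequences. The function names are hypothetical, and the selector used in the example is a placeholder; it is not a claim about which sequences the Moji_Joho collection actually registers.

```python
# Variation selectors used by IVSes: VS17 (U+E0100) through VS256 (U+E01EF).
VS17, VS256 = 0xE0100, 0xE01EF


def make_ivs(base: str, selector_index: int) -> str:
    """Build base + selector, where index 0 means VS17 (U+E0100)."""
    cp = VS17 + selector_index
    if not VS17 <= cp <= VS256:
        raise ValueError("selector outside U+E0100..U+E01EF")
    return base + chr(cp)


def split_ivs(seq: str):
    """Return (base, selector code point), or (seq, None) when seq is
    not a two-code-point sequence ending in VS17..VS256."""
    if len(seq) == 2 and VS17 <= ord(seq[1]) <= VS256:
        return seq[0], ord(seq[1])
    return seq, None


# Placeholder example: turtle followed by VS17.
sample = make_ivs("\u9f9c", 0)
base, selector = split_ivs(sample)
print(f"U+{ord(base):04X} U+{selector:04X}")
```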

However, whereas turtle has a myriad of encoded variants, dragon (including its Chinese simplified form) exhibits multiplicity, as evidenced by the five ideographs whose code chart excerpts are shown below:

Note that U+2EE5D is in the CJK Unified Ideographs Extension I block, which will be included in Unicode Version 15.1, scheduled for release next month.

The others are in the CJK Unified Ideographs (two), CJK Unified Ideographs Extension B (one), and CJK Unified Ideographs Extension G (one) blocks.
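Keeping track of which block a code point belongs to gets tedious quickly, so a small lookup helps. The sketch below hard-codes the CJK Unified Ideographs block ranges as I understand them for Unicode Version 15.1; the table was typed in by hand, the function name block_of is my own, and the Blocks.txt data file remains the authoritative source.

```python
from typing import Optional

# (first, last, block name), ranges as of Unicode Version 15.1.
CJK_BLOCKS = [
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),
    (0x2A700, 0x2B73F, "CJK Unified Ideographs Extension C"),
    (0x2B740, 0x2B81F, "CJK Unified Ideographs Extension D"),
    (0x2B820, 0x2CEAF, "CJK Unified Ideographs Extension E"),
    (0x2CEB0, 0x2EBEF, "CJK Unified Ideographs Extension F"),
    (0x2EBF0, 0x2EE5F, "CJK Unified Ideographs Extension I"),
    (0x30000, 0x3134F, "CJK Unified Ideographs Extension G"),
    (0x31350, 0x323AF, "CJK Unified Ideographs Extension H"),
]


def block_of(cp: int) -> Optional[str]:
    """Return the CJK Unified Ideographs block containing cp, if any."""
    for first, last, name in CJK_BLOCKS:
        if first <= cp <= last:
            return name
    return None


for cp in (0x9F8D, 0x2EE5D, 0x3106C):
    print(f"U+{cp:04X}: {block_of(cp)}")
```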

In terms of future repertoires, there is even a half dragon in IRG Working Set 2021, specifically at Serial Number 00005. This IRG working set is expected to become the CJK Unified Ideographs Extension J block, and in terms of its standardization timeframe, my best guess is Unicode Version 17.0 (2025).

Clouds, Dragons & Teeth

Knowing that U+3106C in the CJK Unified Ideographs Extension G block has 84 strokes, and consists of three clouds and three dragons, is a wonderful problem to have (hint: the image below is animated):

Did I mention teeth? Interestingly, the greater the simplification, the fewer teeth there appear to be:

Teeth: U+9F52 齒, U+6B6F 歯, U+9F7F 齿

Besides the three standard forms that mirror those for turtle and dragon, I found the following three gems in the CJK Unified Ideographs Extension B (two) and CJK Unified Ideographs Extension F (one) blocks:

I could go on and on, but I won’t. I covered only Radicals 211 through 213, so the topic is genuinely deep.
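The radical numbers map neatly onto the Kangxi Radicals block, which encodes Radicals 1 through 214 in order starting at U+2F00. A minimal sketch of that arithmetic follows; kangxi_radical is a hypothetical helper name, and the character names come from Python's standard unicodedata module.

```python
import unicodedata


def kangxi_radical(number: int) -> str:
    """Return the Kangxi Radicals block character for radical 1..214."""
    if not 1 <= number <= 214:
        raise ValueError("Kangxi radical numbers run from 1 to 214")
    # Radical 1 is U+2F00, so radical n is U+2F00 + (n - 1).
    return chr(0x2F00 + number - 1)


for n in (211, 212, 213):
    ch = kangxi_radical(n)
    print(n, f"U+{ord(ch):04X}", unicodedata.name(ch))
```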

For those who would like to explore this topic further, a good place to start is UTN (Unicode Technical Note) #43, which documents the provisional kStrange property of the Unihan database that was added in Unicode Version 14.0 (2021).

About the Author

Dr Ken Lunde has worked for Apple as a Font Developer since 2021-08-02 (and was in the same role as a contractor from 2020-01-16 through 2021-07-30), is the author of CJKV Information Processing, Second Edition (O’Reilly Media, 2009), and earned BA (1987), MA (1988), and PhD (1994) degrees in linguistics from The University of Wisconsin-Madison. Prior to working at Apple, he worked at Adobe for over twenty-eight years (from 1991-07-01 to 2019-10-18), specializing in CJKV Type Development, meaning that he architected and developed fonts for East Asian typefaces, along with the standards and specifications on which they are based. He architected and developed the Adobe-branded “Source Han” (Source Han Sans, Source Han Serif, and Source Han Mono) and Google-branded “Noto CJK” (Noto Sans CJK and Noto Serif CJK) open source Pan-CJK typeface families that were released in 2014, 2017, and 2019, and published over 300 articles on Adobe’s now-static CJK Type Blog. Ken serves as the Unicode Consortium’s IVD (Ideographic Variation Database) Registrar, attends UTC and IRG meetings, participates in the Unicode Editorial Committee, became an individual Unicode Life Member in 2018, received the 2018 Unicode Bulldog Award, was a Unicode Technical Director from 2018 to 2020, became a Vice-Chair of the Emoji Subcommittee in 2019, published UTN #43 (Unihan Database Property “kStrange”) in 2020, became the Chair of the CJK & Unihan Group in 2021, published UTN #45 (Unihan Property History) in 2022, and published UTN #50 (KP-Source Property Value History) in 2023. He and his wife, Hitomi, are proud owners of a His & Hers pair of acceleration-boosted 2018 LR Dual Motor AWD Tesla Model 3 EVs.

Dr Ken Lunde

Chair, CJK & Unihan Working Group—Almaden Valley—San José—CA—USA—NW Hemisphere—Terra—Sol—Orion-Cygnus Arm—Milky Way—Local Group—Laniakea Supercluster