The Unicode® CJK & Unihan Group

Dr Ken Lunde
May 15, 2022

By Dr Ken Lunde

Something markedly positive resulted from COVID-19, and I am not referring to testing positive. That something, of course, is the title of this article. We will explore why and how this group was formed, and why it will continue to be relevant into the future.

The Past

Unicode Technical Committee (UTC) meetings have traditionally been in-person and lasted an entire week (five days). For the past several years, the first-, second-, and fourth-quarter meetings took place in Silicon Valley, and were hosted by Full Member companies, such as Adobe, Apple, IBM, and Google. The third-quarter meeting took place in Redmond, Washington, and was hosted by Microsoft. For those who could not attend in person—for whatever reason—the preferred way to join a UTC meeting was by speaker phone. That certainly sounds rather 20th century, right? Don’t worry, we will get to that.

At UTC meetings, CJK- and Unihan-related documents and public feedback were discussed during plenary, and on several occasions there was not enough time to go through everything.

And then COVID-19 dropped on our doorstep…

The last time that an in-person UTC meeting took place was UTC #162 in January of 2020, which was hosted by Google in Mountain View, California. (It was a four-day meeting, but I missed the last day due to it being my first day working at Apple as a contractor.)

“Baby Pearl” Tesla Model 3 parked at Googleplex for the UTC #162 meeting (January 15, 2020)

The Present

Normally, there was minimal preparation needed before attending UTC meetings, at least for me. The exceptions were the Emoji Subcommittee and the Script Ad Hoc Group. The former meets twice a week for an hour, and the latter meets once a month for an entire day. Both review incoming documents and public feedback, and produce a report with recommendations for the UTC. COVID-19 didn’t have much of an effect on them.

For the three remaining UTC meetings of 2020, I took it upon myself to form the Unihan Ad Hoc, which was modeled after the Script Ad Hoc Group but focused on CJK- and Unihan-related documents and public feedback. The meetings have taken place from 6 to 9 PM on a Friday in California, which is Saturday morning in East Asia. The idea is simple:

Predigest any CJK- and Unihan-related documents and public feedback that have come in since the previous UTC meeting by discussing them among the experts who attend the quarterly Unihan Ad Hoc meeting, then prepare a report that provides recommendations for the UTC. I have written all of the reports thus far, which pretty much takes the entire weekend. The meeting attendees review one or more drafts of the report before it is submitted for posting to the UTC Document Register.

For the very first meeting, which took place on the evening of 2020-04-10 in preparation for the UTC #163 meeting later that month, I gathered all of the relevant documents and public feedback, then provided them as a ZIP file to the five other people who signed up to attend. This turned out to be a good practice, because a severe server outage took place at about the same time as our meeting, which made the UTC Document Register inaccessible. My idea was to minimize the bandwidth needed during the meeting for opening documents from the UTC Document Register, but it also guarded against having no bandwidth at all, as happened during the Unicode Consortium’s server outage. I have found this to be a prudent thing to do for all subsequent meetings.

We used Zoom to host the Unihan Ad Hoc meetings, and Zoom was also used to host UTC meetings. In fact, the last nine UTC meetings (UTC #163 through UTC #171) took place as two-day virtual Zoom meetings. The preferred two days of the week quickly became Tuesday and Thursday. The one-day gap provided an opportunity to revise documents that were discussed on the first day so that they could be discussed again on the second day.

The Unihan Ad Hoc officially became the CJK & Unihan Group at the beginning of 2021. I had been the effective Chair of the Unihan Ad Hoc, and my colleague, John Jenkins, had been the effective Vice Chair. These roles carried over to the CJK & Unihan Group as official titles.

Preparing to host a CJK & Unihan Group meeting (January 14, 2022)

Preparing for each quarterly meeting takes time, as does preparing the quarterly report for the upcoming UTC meeting, but it usually means that two hours is sufficient for covering CJK- and Unihan-related topics during a UTC meeting. Sometimes it takes less time, but sometimes we need a little extra time. In other words, covering CJK- and Unihan-related topics during UTC meetings has become much more efficient due to the efforts of the CJK & Unihan Group. Although all of the Unihan Ad Hoc and CJK & Unihan Group reports can be found in the UTC Document Register, they can also be conveniently accessed here.

I would like to point out that with each subsequent CJK & Unihan Group report that I prepare, I have become increasingly better at preparing them. The idea is to provide copy/paste-ready text for any consensuses and action items, which are recorded in the UTC meeting minutes.

The Properties & Algorithms Group was also formed as a result of COVID-19, and it has experienced similar efficiencies.

The Future

In-person — or, perhaps more accurately, hybrid — UTC meetings are expected to resume this year, and are likely to be three consecutive days. Keep in mind that some regular UTC meeting attendees may be unable—due to employer policies—or unwilling to attend an in-person meeting, hence the need to host UTC meetings in hybrid fashion for the foreseeable future.

For those who previously called into UTC meetings by speaker phone, the good news is that the video conferencing system that is used by the meeting host is expected to be used instead. Zoom is not an option for hybrid meetings. This is due to the integration of a specific video conferencing system into the meeting room’s audio and video capabilities. I see this as a completely positive change, because remote meeting attendees will finally be able to share documents, and view documents that are being shared.

Furthermore, all of the groups that formed during COVID-19, along with those that already existed, are expected to continue to meet prior to each UTC meeting according to their current schedules, which will involve preparing and submitting a report for the UTC. As Chair of the CJK & Unihan Group, I see no going back to the less efficient ways.

Besides more efficiently dealing with CJK- and Unihan-related documents and public feedback on behalf of the UTC, the CJK & Unihan Group also engages with the IRG (Ideographic Research Group), and effectively serves as a liaison between the UTC and the IRG.

Lastly, in terms of my own attendance at UTC meetings going forward, as long as the location does not involve airline travel, I expect that I will attend in person for all three days.

We shall see…

About the Author

Dr Ken Lunde has worked for Apple as a Font Developer since 2021-08-02 (and was in the same role as a contractor from 2020-01-16 through 2021-07-30), is the author of CJKV Information Processing, Second Edition (O’Reilly Media, 2009), and earned BA (1987), MA (1988), and PhD (1994) degrees in linguistics from The University of Wisconsin-Madison. Prior to working at Apple, he worked at Adobe for over twenty-eight years, from 1991-07-01 to 2019-10-18, specializing in CJKV Type Development, meaning that he architected and developed fonts for East Asian typefaces, along with the standards and specifications on which they are based. He architected and developed the Adobe-branded “Source Han” (Source Han Sans, Source Han Serif, and Source Han Mono) and Google-branded “Noto CJK” (Noto Sans CJK and Noto Serif CJK) open source Pan-CJK typeface families that were released in 2014, 2017, and 2019, and published over 300 articles on Adobe’s now-static CJK Type Blog. Ken serves as the Unicode Consortium’s IVD (Ideographic Variation Database) Registrar, attends UTC and IRG meetings, participates in the Unicode Editorial Committee, became an individual Unicode Life Member in 2018, received the 2018 Unicode Bulldog Award, was a Unicode Technical Director from 2018 to 2020, became a Vice-Chair of the Emoji Subcommittee in 2019, published UTN #43 (Unihan Database Property “kStrange”) in 2020, became the Chair of the CJK & Unihan Group in 2021, and published UTN #45 (Unihan Property History) in 2022. He and his wife, Hitomi, are proud owners of a His & Hers pair of acceleration-boosted 2018 LR Dual Motor AWD Tesla Model 3 EVs.

Dr Ken Lunde

Chair, CJK & Unihan Working Group—Almaden Valley—San José—CA—USA—NW Hemisphere—Terra—Sol—Orion-Cygnus Arm—Milky Way—Local Group—Laniakea Supercluster