“Understanding Japanese Information Processing”

Dr Ken Lunde
8 min read · Sep 15, 2023

By Dr Ken Lunde

The front cover of “Understanding Japanese Information Processing”

Understanding a concept to the point that it becomes second nature requires a lot of focus and determination, which has led me quite far in my professional life. In other words, obsession can be a good thing.

I never imagined that I would author a book, let alone three (thus far; wink, wink). In this article, I am celebrating the 30th anniversary of my first book, Understanding Japanese Information Processing (O’Reilly & Associates, 1993), which was published 30 years ago this month. I received my first author copy on 1993-09-15, which is 30 years ago today. At the time, I had been working for Adobe (then called Adobe Systems Incorporated) for a little over two years. That was early in a career there that ended rather abruptly after more than 28 years, and which became one of those “So Long, and Thanks for All the Fish” opportunities. 🐬

The book’s subtitle, 日本語情報処理 (nihongo jōhō shori), is simply the Japanese translation of Japanese information processing, which is also present on the book’s spine—and on the limited-edition T-shirt, and on the extremely limited-edition white and gray sweatshirts! It was set in the Heisei Kaku Gothic W5 (平成角ゴシックW5) typeface.

The book’s cover animal, which is a blowfish 🐡, was actually my idea. My publisher—or, more specifically, Edie Freedman—usually selects the cover animal for their books, but I was able to make a good case as to why my idea should be used. Instead of repeating all of the details here, you can read all about it in my article below:

The History

Japanese, or more specifically, how Japanese text was processed on computers, was a main focus early on in my professional life, which originally began while I was an undergraduate student at UW-Madison, meaning almost 40 years ago.

This obsession eventually led to the creation and distribution of an electronic information file called JAPAN.INF that was first released sometime in 1989. Version 1.1, which is dated 1991-08-19, is still available online. Version 1.2, which was available in Shift-JIS, EUC-JP, and ISO-2022-JP encodings, was the final version, and was dated 1992-03-20.

One thing led to another, and I eventually pitched a book idea to O’Reilly & Associates sometime in late 1992. Peter Mui, who was my editor, was very helpful and supportive during the entire process. Given the special nature of the book, Tim O’Reilly, the publisher’s founder, was also directly involved and offered his support. My original estimate was 250 pages. The book ended up becoming 470 pages.

I later created a new electronic information file called CJK.INF, which was an extension of JAPAN.INF that added details about Chinese and Korean, hence the name change. Version 1.0 was released on 1995-06-09, and Version 2.1, which was its final version, was released on 1996-07-12. Its WHAT HAPPENED TO JAPAN.INF? section states the following:

Put bluntly, JAPAN.INF died. It first evolved into my first book entitled “Understanding Japanese Information Processing” (this book is now into its second printing, and the Japanese translation was just published). After my book came out, I did attempt to update JAPAN.INF, but the effort felt a bit futile. I decided that something fresh was necessary.

JAPAN.INF also evolved into this document, which breaks the Japanese barrier by providing similar information on Chinese and Korean character sets and encodings. It fills the Chinese and Korean gap, so to speak. My specialty (and hobby, believe it or not) is the field of CJK character sets and encoding systems, so I felt that shifting this document more towards those lines was appropriate use of my (copious) free time (I wish there were more than 24 hours in a day!). Besides, this document now becomes useful to a much broader audience.

Of course, CJK.INF eventually led to my second book, CJKV Information Processing (O’Reilly & Associates, 1999), which was published at the very end of 1998.

The Writing

Information—or data—is an essential key that leads to understanding. How information is organized or presented can make all of the difference in terms of how well it will be understood. This was my very first major writing endeavor, which also turned into an excellent learning experience. All in all, it took approximately nine months to write the book.

In addition to writing the book, I also typeset its 470 pages using an Apple Macintosh IIci computer running Aldus PageMaker 4.0J, and produced camera-ready copy with a Linotronic L300-J imagesetter set at 1270-dpi resolution.

There is an “Advance Praise” page at the very beginning of the book, and I would like to draw attention to the very first quote:

“Adobe Systems understands the importance of providing our Japanese-speaking customers with Japanese-capable software. The issues are complex, but Ken Lunde is able to sort them out for the reader in Understanding Japanese Information Processing. I expect this book to have a great impact in the field of software internationalization and localization. I feel quite fortunate having Ken Lunde on our staff at Adobe Systems.”

— Dr. John Warnock
CEO, Adobe Systems Incorporated

For those who were not aware, John Warnock passed away on 2023-08-19 at the age of 82.

The second quote is from an old friend, best known for his book entitled PostScript by Example, who passed away on 2022-05-27 at the age of 77:

“Creating multilingual software is a challenge in the growing global marketplace, and the task is especially daunting with Japanese, one of the world’s most difficult languages. Understanding Japanese Information Processing is the resource for developers building the bridge between Japanese and Western languages.”

— Henry McGilton
Trilithon Software

That page included two additional quotes, from Dr. Jun Murai and Professor Jim Breen. The same page in the second printing, which spilled over into two pages, included three additional quotes, from Jack Halpern, Dr. Joseph Becker, and Wolfgang Hadamitzky.

Returning to the topic of writing, after my first book was published, I focused my attention on preparing my PhD dissertation, which was entitled Prescriptive Kanji Simplification, and which I successfully defended on 1994-05-18. It had five pages of front matter followed by 70 pages for the dissertation proper. I was 28 years old, the same age that I was when the book was published.

The Code Examples

Processing of Japanese text data is easily done using today’s platforms, mainly thanks to the proliferation of the Unicode Standard, but that wasn’t the case 30 years ago. Everything was very hands-on, and dealing with encoding conversion was the norm. Perl eventually became my preferred programming language, which I used for over 25 years before recently abandoning it in favor of Python. At the time that I wrote my first book, C was self-taught and fresh in my mind, so all of the programming code, which was encapsulated in Chapter 7, Japanese Information Processing Techniques, was written in C.

My later deep dive into Perl led to each of the two subsequent books including an appendix of 20 or so pages entitled Perl Code Examples: Appendix W in CJKV Information Processing, and Appendix C in CJKV Information Processing, Second Edition.
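As a rough illustration of how much simpler encoding conversion has become, here is a minimal Python sketch that converts text among the three legacy Japanese encodings mentioned earlier (Shift-JIS, EUC-JP, and ISO-2022-JP) by passing through Unicode. It is not taken from any of the books, whose code examples were written in C and Perl. The `convert` helper and the sample string are hypothetical names chosen for this example, and the sketch assumes that Python's built-in `shift_jis`, `euc_jp`, and `iso2022_jp` codecs are adequate for the text being converted.

```python
# Hypothetical sample text: the book's Japanese subtitle, whose
# characters can all be represented in the three legacy encodings.
text = "日本語情報処理"


def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one encoding to another by way of Unicode."""
    return data.decode(source).encode(target)


# Start from Shift-JIS bytes, then convert step by step.
sjis = text.encode("shift_jis")
euc = convert(sjis, "shift_jis", "euc_jp")
jis = convert(euc, "euc_jp", "iso2022_jp")

# Decoding the final result should give back the original text.
round_trip = jis.decode("iso2022_jp")
print(round_trip == text)
```

Thirty years ago, each of these conversions had to be written by hand, byte by byte. Here the work is delegated to the codecs that ship with the language.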

In closing, my Better Half™ (aka my wonderful and supportive wife), whom I didn’t know at the time of the book’s publication, once told me that it completely embarrassed the entire Japanese IT industry, because a completely non-Japanese person wrote the book that a Japanese person should have written.

Fate loves irony. Particularly when said irony occurs more than once.

Continuing on the topic of irony, the Japanese translation, which was aptly entitled 『日本語情報処理』(nihongo jōhō shori), was published less than two years later, on 1995-08-25. As you can see from the book’s cover below, its subtitle is the title of the original book, but only the Japanese title is shown on its spine:

The front cover of 『日本語情報処理』

The photo below shows the spines of all of my books and their translated versions. From top to bottom, they are Understanding Japanese Information Processing (1993), its Japanese translation (1995), CJKV Information Processing (1999), its Japanese and (Traditional) Chinese translations (2002), and CJKV Information Processing, Second Edition (2009):

I will be celebrating the 25th anniversary of the publishing of CJKV Information Processing (O’Reilly & Associates, 1999) in December of this year. I received my first author copy on 1998-12-17, so until then…

About the Author

Dr Ken Lunde has worked for Apple as a Font Developer since 2021-08-02 (and was in the same role as a contractor from 2020-01-16 through 2021-07-30), is the author of CJKV Information Processing Second Edition (O’Reilly Media, 2009), and earned BA (1987), MA (1988), and PhD (1994) degrees in linguistics from The University of Wisconsin-Madison. Prior to working at Apple, he worked at Adobe for over 28 years (from 1991-07-01 to 2019-10-18), specializing in CJKV Type Development, meaning that he architected and developed fonts for East Asian typefaces, along with the standards and specifications on which they are based. He architected and developed the Adobe-branded “Source Han” (Source Han Sans, Source Han Serif, and Source Han Mono) and Google-branded “Noto CJK” (Noto Sans CJK and Noto Serif CJK) open source Pan-CJK typeface families that were released in 2014, 2017, and 2019, and published over 300 articles on Adobe’s now-static CJK Type Blog. Ken serves as the Unicode Consortium’s IVD (Ideographic Variation Database) Registrar, attends UTC and IRG meetings, participates in the Unicode Editorial Committee, became an individual Unicode Life Member in 2018, received the 2018 Unicode Bulldog Award, was a Unicode Technical Director from 2018 to 2020, became a Vice-Chair of the Emoji Subcommittee in 2019, published UTN #43 (Unihan Database Property “kStrange”) in 2020, became the Chair of the CJK & Unihan Group in 2021, published UTN #45 (Unihan Property History) in 2022, and published UTN #50 (KP-Source Property Value History) in 2023. He and his wife, Hitomi, are proud owners of a His & Hers pair of acceleration-boosted 2018 LR Dual Motor AWD Tesla Model 3 EVs.

Dr Ken Lunde

Chair, CJK & Unihan Working Group—Almaden Valley—San José—CA—USA—NW Hemisphere—Terra—Sol—Orion-Cygnus Arm—Milky Way—Local Group—Laniakea Supercluster