Common Character Database of 2003

What is it?

This public domain database contains over 90,000 characters covering the major languages of the world. It is intended to be compatible with the ISO/IEC 10646:2003 standard and with most web browsers. The downloadable file common-character-database-of-2003-(created-2019-05-09).zip contains this web page and two versions of the database.

Database version: common-character-database-of-2003.tsv

This tab-separated value file, with MD5 checksum bbd4e5cc26d446e765639ed5295d1340, has the following fields:

Database version: common-character-database-of-2003-without-embedded-glyphs.tsv

This file, with MD5 checksum 2a829abe4734d4687dbae4f9aa57ac29, is identical to the version above except that the first field is omitted.
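
As a rough illustration of how the two files and their checksums might be used, here is a minimal Python sketch. The function names (md5_of_chunks, md5_of_file, matches_published_md5, drop_first_field) are hypothetical and not part of the database; the checksums are copied from this page. It assumes the files have been extracted from the zip archive, and that "without the first field" means everything up to and including the first tab on each line.

```python
import hashlib

# MD5 checksums as published on this page, keyed by file name.
PUBLISHED_MD5 = {
    "common-character-database-of-2003.tsv":
        "bbd4e5cc26d446e765639ed5295d1340",
    "common-character-database-of-2003-without-embedded-glyphs.tsv":
        "2a829abe4734d4687dbae4f9aa57ac29",
}


def md5_of_chunks(chunks):
    """Return the hex MD5 digest of an iterable of bytes objects."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    with open(path, "rb") as handle:
        # iter(callable, sentinel) stops when read() returns b"" at end of file.
        return md5_of_chunks(iter(lambda: handle.read(chunk_size), b""))


def matches_published_md5(path, name):
    """Check a local file against the checksum published for `name`."""
    return md5_of_file(path) == PUBLISHED_MD5[name]


def drop_first_field(line):
    """Remove the first tab-separated field from one line of text.

    Assumption: fields are separated by single tab characters.
    A line without any tab has only one field, so the result is empty.
    """
    return line.partition("\t")[2]
```

For example, matches_published_md5(path, "common-character-database-of-2003.tsv") should return True for an intact copy of the first file. Applying drop_first_field to each line of that file should give something close to the second file, although whether the result is byte-for-byte identical is not stated here. MD5 is enough to detect accidental corruption during download, but it is not a safeguard against deliberate tampering.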

Notes and Exceptions

Why, when, and how was it made?

In 2018 I was unable to locate a character database that covered the major writing systems of the world, was compatible with most web browsers, and had no legal restrictions. Because the facts in books are not usually subject to copyright or other laws, a book seemed to be a good source of data for a new database. In case the European Database Directive applied, the book needed to be at least fifteen years old. So, beginning in 2019, I created this database using a book published in 2003 (ISBN 0321185781). Part of the data was generated with custom programs, and the rest was entered manually. None came from the CD-ROM included with the book.

How has it changed over time?

Questions or Comments

Contact information


Numerous scribes developed the writing systems of the world. I am grateful to them and to the various national, international and commercial groups that have organized and published this information, and to the authors of ISBN 0321185781 (not named to avoid using a trademark) for a very clear and helpful reference resource. Thanks most of all to my father in heaven, the creator of everything, for providing me the ability, resources and motivation to undertake this work.

Public Domain Dedication

I dedicate this version of the "Common Character Database of 2003", created May 9, 2019, to the public domain.  --Scot Doyle