Common Character Database of 2003

What is it?

This public domain database contains over 90000 characters covering the major languages of the world. It is intended to be compatible with the ISO 10646:2003 standard and most web browsers. The downloadable file common-character-database-of-2003-(created-2019-03-05).zip contains this web page and two versions of the database.

Database version: common-character-database-of-2003.tsv

This tab-separated values file contains one character entry per line with these fields and values. The MD5 checksum for this version is 8c972bf302e95f0f46f598071765d0d2.
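As a rough illustration of the "one entry per line, tab-separated" layout, the sketch below splits each line into a list of fields. It is not part of the database. The function name read_entries is hypothetical, and the code assumes the file is UTF-8 text, that lines end with a line feed, and that no field contains a tab or line feed. It deliberately does not assume how many fields there are or what they mean.

```python
import io


def read_entries(stream):
    """Yield each non-empty line of a text stream as a list of fields.

    Assumption: fields are separated by a single tab and never contain
    a tab or line feed themselves.
    """
    for line in stream:
        # Drop the line terminator only; other whitespace may be data.
        line = line.rstrip("\r\n")
        if line:
            yield line.split("\t")


if __name__ == "__main__":
    # Placeholder values only; these are NOT real database entries.
    sample = io.StringIO("field1\tfield2\tfield3\n")
    for fields in read_entries(sample):
        print(len(fields), fields)
```

To read the real file, something like open("common-character-database-of-2003.tsv", encoding="utf-8", newline="\n") could be passed in place of the sample stream. The newline="\n" argument keeps Python from treating a lone carriage return inside a field as a line break. The UTF-8 encoding is an assumption.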

Database version: common-character-database-of-2003-without-embedded-characters.tsv

This file is the same as the version above, but without the first field. The MD5 checksum for this version is 2584b64867bbbc5fbc969f796073c061.
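The two checksums can be checked after unpacking the zip file. The following is a minimal sketch using Python's standard hashlib module. The helper names md5_of_file and verify are hypothetical, and the code assumes both .tsv files sit in the current directory under the names given above.

```python
import hashlib

# Checksums as published on this page.
EXPECTED_MD5 = {
    "common-character-database-of-2003.tsv":
        "8c972bf302e95f0f46f598071765d0d2",
    "common-character-database-of-2003-without-embedded-characters.tsv":
        "2584b64867bbbc5fbc969f796073c061",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in binary chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected_hex.lower()


if __name__ == "__main__":
    for name, expected in EXPECTED_MD5.items():
        try:
            status = "OK" if verify(name, expected) else "MISMATCH"
        except FileNotFoundError:
            status = "file not found"
        print(f"{name}: {status}")
```

A match shows only that the file was not altered or damaged in transit. MD5 is not a safeguard against deliberate tampering.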

Notes and Exceptions

Why, when, and how was it made?

In 2018, while working on another project, I was unable to locate a character database that covered the major writing systems of the world, was compatible with most web browsers, and had no legal restrictions. Since the facts in books are not usually subject to copyright or other laws, a book seemed to be a good source of data for a new database. In case the European Database Directive applied, the book needed to be at least fifteen years old. So, beginning in 2019, I created this database from scratch using a book published in 2003 (ISBN 0321185781). Part of the data was generated with custom programs, while the rest was entered by hand. None came from the CD-ROM included with the book.

Questions or Comments

Contact information


Numerous scribes developed the writing systems of the world. I am grateful to them and to the various national, international and commercial groups that have organized and published this information, and to the authors of ISBN 0321185781 (not named to avoid using a trademark) for a very clear and helpful reference resource. Thanks most of all to my father in heaven, the creator of everything, for providing me the ability, resources and motivation to undertake this work.

Public Domain Dedication

I dedicate this version of the "Common Character Database of 2003" to the public domain on March 5, 2019.  --Scot Doyle