Unicode (UTF) Character Sets

This is my Unicode Testing Area, where you can test and view, in an ordered form, almost all characters defined in Unicode 5.1 (plus some preview characters drafted for the upcoming Unicode 5.2), the universal character encoding standard now used by thousands of Internet sites around the world. To view all characters, you’ll need many different Unicode-compliant fonts spanning almost every defined Unicode 5.1 range; the ones I recommend, at least for Windows, are listed below (feel free to choose your own fonts or mix my favorites with others, as long as you still end up able to see all defined characters). I also highly recommend using one of the latest Internet browsers: Internet Explorer 7.0, Mozilla Firefox, Netscape Browser 8.0 or higher (no longer supported, though, as Netscape officially discontinued support), or Apple Safari, to name a few. IE7 is the only Microsoft browser that fully displays extraplanar characters; Firefox automatically detects which installed fonts can display the characters on screen (no need to specify default fonts); Netscape 8.x combines the best of both IE & Firefox in a single package (actually, it installs only the Firefox engine and relies on Windows’ own installed IE engine whenever necessary).

Unicode Transformations

Unicode involves four distinct transformations: a 7-bit mail-safe one (now deprecated, since most modern mail servers are 8-bit), a standard 8-bit Web-safe one, a 16-bit version (either Little-Endian, Big-Endian, or BOM-determined), and a 32-bit version (also Little-Endian, Big-Endian, or BOM-determined).
The 7-bit UTF-7 passes most US-ASCII characters through unchanged and uses modified-Base64 strings (beginning with byte 0x2B [+] and ending with byte 0x2D [-]) to generate all other characters & controls defined in Unicode.
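As a rough illustration of the +…- framing described above, the following sketch round-trips a few strings through Python's built-in "utf-7" codec. The helper name utf7_roundtrip and the sample strings are my own choices for illustration, not part of the test pages.

```python
def utf7_roundtrip(text: str) -> bytes:
    """Encode text as UTF-7 and check that it decodes back unchanged."""
    encoded = text.encode("utf-7")
    # Every byte of a UTF-7 stream fits in 7 bits (mail-safe).
    assert all(byte < 0x80 for byte in encoded)
    assert encoded.decode("utf-7") == text
    return encoded


if __name__ == "__main__":
    # A plain ASCII letter, a non-ASCII character (a Base64 run between
    # '+' and '-'), and a literal plus sign (escaped as "+-").
    for sample in ["A", "\u20ac", "1 + 1"]:
        print(repr(sample), "->", utf7_roundtrip(sample))
```

The euro sign U+20AC becomes a short Base64 run wrapped in + and -, while the literal plus sign has to be escaped so that it is not mistaken for the start of a Base64 run.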
The 8-bit UTF-8 keeps US-ASCII characters as single bytes (0x00-7F) and encodes everything else as double-byte (16-bit), triple-byte (24-bit), & quadruple-byte (32-bit) characters, using bytes 0xC2-DF as 16-bit headers, 0xE0-EF as 24-bit headers, 0xF0-F4 as 32-bit headers, and 0x80-BF as both 16-bit trailers and 24/32-bit middles & trailers.
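The byte ranges above can be sketched as a small classifier. This is only a minimal illustration: the function names utf8_byte_role and describe are hypothetical, and the sketch labels individual bytes without validating whole sequences.

```python
def utf8_byte_role(byte: int) -> str:
    """Classify one byte of a UTF-8 stream by the ranges listed above."""
    if byte <= 0x7F:
        return "ascii"      # single-byte US-ASCII character
    if 0x80 <= byte <= 0xBF:
        return "trailer"    # middle or trailing byte of a longer sequence
    if 0xC2 <= byte <= 0xDF:
        return "lead-2"     # header of a double-byte character
    if 0xE0 <= byte <= 0xEF:
        return "lead-3"     # header of a triple-byte character
    if 0xF0 <= byte <= 0xF4:
        return "lead-4"     # header of a quadruple-byte character
    return "invalid"        # 0xC0, 0xC1, and 0xF5-0xFF never occur in UTF-8


def describe(text: str) -> list:
    """Return the role of every byte in the UTF-8 encoding of text."""
    return [utf8_byte_role(b) for b in text.encode("utf-8")]


if __name__ == "__main__":
    for sample in ["A", "\u00e9", "\u20ac", "\U0001d11e"]:
        print(repr(sample), describe(sample))
```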
The 16-bit UTF-16 uses 16-bit characters ranging from 0x0000-D7FF & 0xE000-FFFF (yes, two bytes for each of 63,488 possible characters) and 32-bit characters generated by pairing a high surrogate (from range 0xD800-DBFF) with a low surrogate (from range 0xDC00-DFFF); the 32-bit characters are commonly called surrogate pairs or extraplanar characters because they fall outside the ordinary 16-bit plane (now called Plane 0 or Basic Multilingual Plane). For the BOM-determined version, a special control code [0xFEFF] known as the Byte Order Mark or BOM, along with its byte-swapped counterpart [0xFFFE], determines the byte order: if the byte stream begins with 0xFF·0xFE, then it’s Little Endian (LE); if it begins with 0xFE·0xFF, then it’s Big Endian (BE).
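The surrogate pairing described above comes down to a little arithmetic. Here is a minimal sketch, with hypothetical function names, that splits an extraplanar code point into its high and low surrogates and joins them back again.

```python
def to_surrogate_pair(code_point: int) -> tuple:
    """Split a code point beyond Plane 0 into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only extraplanar code points need a surrogate pair")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits -> 0xD800-DBFF
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> 0xDC00-DFFF
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x1D11E)  # a Plane-1 musical symbol
    print(f"U+1D11E -> {high:04X} {low:04X}")
```

The result can be cross-checked against Python's own "utf-16-be" codec, which emits the same two 16-bit units in big-endian byte order.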
And the new 32-bit UTF-32 uses pure 32-bit characters currently ranging from 0x00000000-0010FFFF, divided between 17 planes, each plane containing 65,536 characters (for a total of a whopping 1,114,112 characters!). For the BOM-determined variant, 0xFF·0xFE·0x00·0x00 in the initial byte stream determines LE while 0x00·0x00·0xFE·0xFF determines BE.
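The BOM rules for UTF-16 and UTF-32 can be sketched as a small sniffing routine. This is only an illustration: sniff_bom is a hypothetical helper, and it assumes the UTF-32 marks should be tested first, because the UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark (a UTF-16 LE stream whose first character is U+0000 would therefore be misread).

```python
import codecs

# Longer marks first: FF FE 00 00 would otherwise match the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),   # FF FE 00 00
    (codecs.BOM_UTF32_BE, "UTF-32BE"),   # 00 00 FE FF
    (codecs.BOM_UTF16_LE, "UTF-16LE"),   # FF FE
    (codecs.BOM_UTF16_BE, "UTF-16BE"),   # FE FF
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if absent."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


if __name__ == "__main__":
    print(sniff_bom(b"\xff\xfeA\x00"))                  # UTF-16LE
    print(sniff_bom(b"\x00\x00\xfe\xff\x00\x00\x00A"))  # UTF-32BE
    print(17 * 65536)                                   # 1114112
```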

Surrogate Issues

If you use Windows 2000, you may need to activate surrogates in order to view extraplanar characters at least in Notepad and Microsoft Word. According to Carl W. Brown (thankee!), http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_192r.asp tells how to turn on surrogates for Win2000.
Code2001’s author, James Kass, states on the Code2001 page that Internet Explorer versions up to 6.0 SP1 do not display extraplanars (other than Plane-2) when using UTF-16 surrogate pairs or extraplanar UTF-8 headers. Andrew “Bass” Shcheglov also claims that when using numeric character references (NCRs, which look like &#xhhhh; in hexadecimal or &#dddd; in decimal), Internet Explorer can correctly display such extraplanars, but then that would not be making any use of a Unicode encoding at all.
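For reference, here is a minimal sketch of how numeric character references for extraplanar characters can be produced and read back. The helper to_ncr is hypothetical, and html.unescape from Python's standard library stands in for what a browser does when it parses the reference.

```python
import html


def to_ncr(text: str, hexadecimal: bool = True) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    ASCII characters (including '&') are passed through untouched, so this
    sketch is not a general-purpose HTML escaper.
    """
    parts = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 0x80:
            parts.append(ch)
        elif hexadecimal:
            parts.append(f"&#x{code_point:X};")
        else:
            parts.append(f"&#{code_point};")
    return "".join(parts)


if __name__ == "__main__":
    linear_b = "\U00010000"               # first character of Plane 1
    print(to_ncr(linear_b))               # hexadecimal form
    print(to_ncr(linear_b, False))        # decimal form
    print(html.unescape(to_ncr(linear_b)) == linear_b)
```

The resulting markup is pure ASCII, so it survives any byte encoding the page happens to be served in.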

Unicode Test Pages

As of April 29, 2008, complete test pages for all Unicode 5.1 planes for both UTF-8 and UTF-16 have been uploaded to this site. Individual UTF-7 characters are trickier to display in HTML because some of them may require many bytes to generate, precious Web space could be wasted, and UTF-7’s encoding method is not very intuitive, so don’t expect to see UTF-7 test pages here displaying all individual characters. (Anyway, UTF-7 is no longer officially supported or recommended by the Unicode Consortium, as it is no longer mentioned in the Unicode Standard or any of its amendments.) No UTF-32 test page is available because, apart from each page taking a lot of space (4 bytes for each character, including those used only for the HTML formatting), Windows applications currently do not support creating, editing, or saving UTF-32 text files, so I can’t really tell whether Internet Explorer can handle UTF-32 (Firefox and Netscape claim to support UTF-32, but with no means to save UTF-32 files, I can’t verify such claims).
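To give a rough feel for the space argument above, this sketch compares how many bytes the same text occupies in each transformation format, using Python's built-in codecs. The helper name encoded_sizes and the sample markup are my own assumptions, chosen only for illustration.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of text in each transformation format."""
    formats = ("utf-7", "utf-8", "utf-16-le", "utf-32-le")
    return {name: len(text.encode(name)) for name in formats}


if __name__ == "__main__":
    # HTML formatting is plain ASCII, yet UTF-32 still spends 4 bytes on
    # each of its characters.
    print(encoded_sizes("<p>A</p>"))
    # A single extraplanar character: 4 bytes in UTF-8, UTF-16 (as a
    # surrogate pair), and UTF-32, but more than that in UTF-7.
    print(encoded_sizes("\U00010000"))
```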

Transformation Format Test Page | Major MIME Charset Names | Windows Codepage | Supported Browsers
8-bit (Web-Safe) | ISO-10646-UTF-1, UTF-8 | 65001 | Internet Explorer 5.5+, Netscape 6.0+, Mozilla Firefox, Apple Safari, Opera Browser
16-bit (BOM-determined Little-Endian) | ISO-10646-UCS-2, Unicode (LE), UnicodeFFFE (BE), UTF-16, UTF-16LE, UTF-16BE | 1200 (LE), 1201 (BE) | Internet Explorer 5.5+, Netscape 6.0+, Mozilla Firefox, Apple Safari, Opera Browser

Even though the above versions of Internet Explorer & Netscape Navigator support Unicode, their latest versions, 7.0 & 8.x respectively, have better support for all characters & controls defined in the latest version of Unicode, 5.1, so I recommend these or later versions.

Scripts & Characters Supported

As of April 29, 2008, these pages above can display or handle the following scripts or subsets:

Support for Balinese [U+1B00-1B7F], Sundanese [U+1B80-1BBF], & Lepcha [U+1C00-1C4F] on these pages is not yet available as of the above date, because there are no TrueType/OpenType fonts available out there that contain such ranges.

Unicode 5.2 (whatever couldn’t be approved for 5.1)

As this blog entry from BabelStone states, Unicode 5.2 is now in the works. It won’t be released until Summer 2009, but so far these new scripts have been proposed (those marked with an * have been approved so far):

These extensions to existing scripts have also been proposed (and those marked with * have been accepted):

And additions to the following existing ranges have also been proposed (and again those marked with * have been accepted):

If you have the Sun-ExtB font installed, then you’ll see the complete CJKV Ideograph Extension C repertoire as it was originally to be defined in Unicode 5.1, before the glyph unification/disunification scandal that arose from mistakes/controversies found in Extensions B & C forced it to be sent back for revisions and thus delayed to Unicode 5.2 (in which case, it might not look the same as it looks now in Sun-ExtB).

Private Use Areas

Private Use Areas are used only to allocate logos, proprietary symbols, intermediate glyphs, & other placeholder characters that cannot be allocated under any of the standard areas listed above. They have also been used by private organizations outside Unicode to define fantasy or preliminary scripts as specified by the ConScript Unicode Registry. For that reason, the characters in these areas will look different depending on the font used.
Planes 15 & 16 have been reserved as Supplementary Private Use Areas A & B, respectively, for holding extraplanar private characters. So far, only Code2001 has some private characters in Plane 15.
As of June 30, 2007, all Private Use Area test pages feature a JavaScript snippet that lets you choose the font with which to display the private characters. These test pages can be accessed from the Unicode Test Pages above.
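The three private-use ranges discussed in this section (the Plane-0 area U+E000-F8FF plus the supplementary areas in Planes 15 & 16) can be written down as a simple range check. This is a sketch: is_private_use is a hypothetical helper, and it relies on the last two code points of each plane being noncharacters rather than private-use characters.

```python
import unicodedata

PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),       # Private Use Area in Plane 0
    (0xF0000, 0xFFFFD),     # Supplementary Private Use Area A (Plane 15)
    (0x100000, 0x10FFFD),   # Supplementary Private Use Area B (Plane 16)
]


def is_private_use(code_point: int) -> bool:
    """Return True if code_point lies in one of the private-use ranges."""
    return any(low <= code_point <= high for low, high in PRIVATE_USE_RANGES)


if __name__ == "__main__":
    for cp in (0xE000, 0xF8FF, 0xF0000, 0xFFFFE, 0x10FFFD):
        # Python's character database reports category "Co" for
        # private-use code points.
        print(f"U+{cp:04X}", is_private_use(cp), unicodedata.category(chr(cp)))
```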

Unicode Fonts Most Suitable for Test Pages

These fonts are recommended either because they are cleaner-looking than other fonts out there, or because they are more complete than other fonts in terms of the Unicode ranges they offer. Since my computer is Windows-based and all my experience with Unicode fonts has been on Windows, I can’t provide a lot of help for Mac OS X users for now.
Font Typeface | Latest Version | How to Get It | Additional Comments
Aegean | 1.01 | Unicode Fonts for Ancient Scripts | Covers Latin, Greek, Coptic, Linear B, Aegean Numbers, Ancient Greek Numbers, Ancient Symbols, Phaistos Disk, Lycian, Carian, Old Italic, Ugaritic, Old Persian, Cypriot, Phoenician, Lydian, & Ancient Greek Musical Symbols.
Akkadian | 1.01 | Unicode Fonts for Ancient Scripts | Covers Latin, Greek, Coptic, & Cuneiform. So far the only font that fully covers Cuneiform.
Arabic Typesetting | 5.00 | Included with Windows Vista; older version included as part of Microsoft VOLT Supplemental Files from the Microsoft VOLT User Community | This font contains several Arabic Presentation ligatures not covered by Segoe UI or Times New Roman. Internet Explorer somehow automatically detects the presence of this font when Segoe UI (or Majalla UI, if it does exist) is used as the default Arabic font.
Arial | 5.01 | Included with Windows Vista; also part of Microsoft’s European Union Expansion Font Update for Windows XP & 2003 | 5.01 includes Unicode 5.0 glyphs for Pan-Euro (Latin, Greek, Cyrillic), Hebrew, & Arabic. I use this font as my browser default for displaying Hebrew. You can substitute Times New Roman (version 5.01, also included with Windows Vista & European Union Expansion Font Update) for Arial and still be able to see all Unicode 5.0 glyphs for the scripts mentioned above.
Arial Unicode MS | 1.01 | Included with Microsoft Office 2003, & 2007 | Includes glyphs defined in Unicode 2.1: Pan-Euro (not as complete as Arial, Charis/Doulos SIL, or Segoe UI), Armenian, Hebrew, Arabic (not as complete as Segoe UI or any of the other new Persian/Urdu/Sindhi/etc. fonts out there), Indic scripts (except Sinhala), Thai, Lao, Tibetan, Georgian, Symbols, & CJKV (minus radicals/Extension-A/Yijing). Personally, I think this font has been neglected by Microsoft and is overdue for an update to Unicode 5.0. Code2000 can be substituted for this font and is 5.0-compliant. However, some glyphs are rendered better with Arial Unicode MS than with Code2000. I use this font merely for displaying Georgian capital letters (Asomtavruli, not defined in Sylfaen), some punctuation marks, many symbols which by default are poorly rendered to me with classic CJK fonts like PMingLiU & MS PMincho, a compatibility ideograph missing from other good-looking CJK fonts, and some Arabic presentation ligatures not present in other Arabic fonts.
BabelStone Phags-pa Book | 1.01 | BabelStone Fonts | So far one of only four TrueType fonts containing Phags-pa glyphs as defined in Unicode 5.0, and the most modern-looking and the cleanest-looking one. The other three fonts are also available on the same BabelStone URL.
Baybayin Lopez | - | Baybayin Fonts | One of only five TrueType fonts available with Tagalog glyphs (the other four fonts can be also downloaded from the same Baybayin URL). These five fonts, however, are not strictly Unicode-standard-compliant: although the Tagalog characters are properly located within the Tagalog range [U+1700-171F] defined by Unicode 4.1, duplicates of these characters, along with some proprietary symbols not defined in Unicode, also replace the Basic Latin range [U+0021-007E], and that’s not a proper design conduct for Unicode fonts (you’re not supposed to manipulate Unicode-defined ranges at your own will to insert your own characters there, especially proprietary symbols; that’s why a Private Use Area [U+E000-F8FF] was set up by the Unicode Consortium since version 2.1 to serve proprietary purposes).
Cambria Math | 5.00 | Included with Windows Vista & Microsoft Office 2007 | Includes almost all (but still not all) math-related symbols (letterlike symbols, arrows, math operators, etc.) defined in Unicode 5.0, including several symbols in Plane 1 [U+1D400-1D7FF]. Internet Explorer 7.0 on Windows Vista somehow automatically loads this font whenever math/arrow symbols are encountered in a text portion whose default or specified font doesn’t contain such symbols (exactly the same behavior that occurs when Japanese characters are encountered in a text portion specified to use a non-CJK font, provided that IE has an assigned default font for Japanese). Mozilla Firefox (and Netscape Browser in Firefox mode), on the other hand, has problems displaying this font, thus choosing instead to use alternate fonts to display the intended characters. Code2000 provides some Plane-0 math symbols not defined in Cambria Math, while Code2001 provides additional Plane-1 math symbols missing from Cambria Math. Unicode Symbols, on the other hand, provides all math-related symbols missing from Cambria Math, Code2000, & Code2001.
Charis SIL, Doulos SIL | 4.100 | Charis SIL Font Home; Doulos SIL Font Home | Both fonts cover all of Unicode 5.0’s Latin ranges as of version 4.100. They also cover the Modifier Tone Letters range [U+A700-A71F] overlooked by Windows Vista’s latest core fonts (Arial, Segoe UI most notably).
Code2000 | 1.17 | Shareware, just US$5.00 via PayPal: Code2000.net | Intended to be a fully-fledged Unicode 5.0 Plane-0 font, this font spans Pan-Euro scripts, Armenian, Bi-Directional scripts (not good at Arabic, however), N’Ko, Indic scripts (except Sinhala), Thai, Lao, Myanmar, Georgian, Ethiopic, Cherokee, UCAS, Ogham, Runic, Buhid (a Code2000 exclusive), Khmer, Mongolian, Limbu, Buginese, Ol Chiki (another Code2000 exclusive), Symbols, Dingbats, Braille, Coptic, Tifinagh, CJKV, Yi, Vai, Saurashtra, Kayah Li, Rejang, & Cham (the last five also exclusive to Code2000, for now). It adds all CJK Radicals, Yijing Hexagrams, and some unified & compatibility ideographs missing from Microsoft-branded CJK fonts. It provides all Misc Technical symbols, all Box Drawing & Block Elements, all Geometric Shapes, all Misc Symbols, all Dingbats, and all Math & Arrow Symbols. The big downfall of this font is that it’s not good for small font sizes; actually, it looks illegible at first glance when viewed in Character Map. But at medium or large font sizes, it’s quite OK if you tolerate serif fonts. CJKV ideographs are horrible except at large or very large font sizes. I use Code2000 as my IE default for displaying N’Ko, Ogham, Runic, & Braille, as my recommendation for Limbu, Buginese, Box/Block/Geometric Symbols, Dingbats, Misc Math-B, Supplemental Math Operators, Misc Symbols/Arrows, Coptic, & Tifinagh, and as my only option for Buhid, Ol Chiki, Vai, Saurashtra, Kayah Li, Rejang, & Cham. I also use this font to display all Indic chars missing from Microsoft fonts (except Tibetan).
Code2001 | 0.919 | Code2000.net | Its aim is to cover Plane-1’s Linear B, Aegean Numbers, Phaistos Disk, Old Italic, Gothic, Ugaritic, Old Persian, Deseret, Shavian, Osmanya, Cypriot, Phoenician, Tai Xuan Jing, Counting Rods, & Math Alphanumeric Symbols. It also has some partial support for 77 Musical Symbols (including 33 Byzantine symbols). Only two symbols from the Math Alphanumeric range are missing from this font, making it a far-more-complete font than Cambria Math in terms of Plane-1 support (Unicode Symbols still wins here). It is my recommendation for Linear B, Aegean Numbers, Phaistos Disk, Old Italic, Gothic, Ugaritic, Old Persian, Deseret, Shavian, Osmanya, Cypriot, Phoenician, & Counting Rods.
DokChampa | 5.00 | Included with Windows Vista | Intended mainly for displaying Lao, although it also adds Thai. Definitely better-looking than Arial Unicode MS or Code2000 for displaying Lao & Thai. Comparable to Leelawadee & Tahoma for Thai display. My browser default for Lao.
Estrangelo Edessa | 5.00 | Included with Windows Vista; older versions supplied with Windows XP & 2003 | My default font for Syriac. Code2000 & MPH 2B Damase also display Syriac.
Euphemia, Euphemia UCAS | 5.00 (Euphemia) | Euphemia included with Windows Vista; Euphemia UCAS bundled with Mac OS X | Both fonts are essentially the same fonts with two different names; they provide the complete Canadian Aboriginal (UCAS) range as defined in Unicode 5.0. Code2000 also covers UCAS.
Gautami | 5.00 | Included with Windows Vista; older version supplied with Windows XP & 2003 | My default for Telugu. Much cleaner than Arial Unicode MS’s Telugu range.
Iskoola Pota | 5.00 | Included with Windows Vista | My default for Sinhala. Evolved from Potha, a prototype font that may be still available for download from the Internet.
Kalinga | 5.00 | Included with Windows Vista | My default for Oriya. Much cleaner than Arial Unicode MS’s Oriya range.
Kartika | 5.00 | Included with Windows Vista; older version supplied with Windows XP SP2 & 2003 SP1 | My default for Malayalam. Much cleaner than Arial Unicode MS’s Malayalam range.
Latha | 5.00 | Included with Windows Vista; older versions supplied with Windows 2000, XP, & 2003 | My default for Tamil. Older versions do not cover some additional letters.
Lucida Sans Unicode | 5.00 | Included with Windows Vista; older versions supplied with Windows 2000, XP, & 2003 | This was supposed to be a Unicode 1.0 font; however it only covered Pan-Euro scripts and symbols. My only use for this font is to display Control Pictures (although it doesn’t include two pics which are covered by Code2000 & Unicode Symbols). Segoe UI automatically refers to this font when displaying Control Pics.
Majalla UI | 5.00 | Probably included with Arabic-based-language versions of Windows Vista | This font is often mentioned in discussions about Segoe UI (indeed, if you type “Majalla UI” at a Google search box, most of the results mainly refer to Segoe UI). I dare to state my own hypothesis here: originally during development of Windows Vista, Segoe UI and Majalla UI were to be separate fonts, with Segoe covering solely Pan-Euro and Majalla covering solely Pan-Arabic. But then, at least for American- & European-language versions of Vista, it was decided to merge both fonts into a single one bearing the facename “Segoe UI”. I think users speaking Arabic-based languages would prefer an Arabic facename for their localized Vista versions’ main UI font, and that leads me to believe that they got Majalla UI instead of Segoe UI for Pan-Arabic script. Anyway, if such font does indeed exist, then it must cover at least the same Pan-Arabic characters covered by Segoe UI in America & Europe (and practically the same glyph appearance).
Malgun Gothic (맑은 고딕) | 5.00 | Included with Windows Vista | Microsoft’s latest Korean font (and now my browser default), it still doesn’t cover four enclosed Hangul letters; otherwise it’s more complete and better-looking than classic Korean fonts like Gulim (굴림) or Dotum (돋움), which are supplied with Windows 2000 & higher. Please note that, unlike classic Korean fonts, this font does not include the CJKV ideographs traditionally used in Korean (not even the KS C 5601 compatibility ideographs).
Mangal | 5.00 | Included with Windows Vista; older versions supplied with Windows 2000, XP, & 2003 | My default for Devanagari. Older versions do not cover some additional letters.
Meiryo (メイリオ) | 5.00 | Included with Windows Vista | Definitely the best Japanese font ever released by Microsoft (and now my browser default). Please note that it only covers those CJKV ideographs from Plane 0 specifically used in Japan (either today or historically), and that those ideographs shared with Chinese & Vietnamese will appear in the preferred Japanese style. It also misses some now-rarely-used CJK compatibility symbols covered by Arial Unicode MS and classic Japanese fonts like MS PGothic (MS Pゴシック)/MS PMincho (MS P明朝) supplied with Windows 2000 & higher (in Vista, version 5.00 of all Japanese fonts adds CJK Extension-A ideographs used in Japan).
Microsoft Himalaya | 5.00 | Included with Windows Vista | My IE default for Tibetan. Still, eight Unicode 5.0 Tibetan characters are not yet covered.
Microsoft JhengHei (微軟正黑體) | 5.00 | Included with Windows Vista | Stylistically intended for Traditional Chinese (TC), its nearly-complete Plane-0 CJKV-ideograph repertoire (both CJK Extension-A & original CJK Unified) also makes it somewhat appropriate for Simplified Chinese, Japanese, and ideographic Vietnamese [that is, ignoring glyph/stylistic differences between these languages]. It doesn’t, however, cover two Bopomofo letters (one supplied by Arial Unicode MS, the other one supplied by Code2000), the CJK Strokes included in version 6.03 of classic TC fonts like PMingLiU, most of the supplementary radicals completely covered by Code2000 & Sun-ExtA, the HKSCS additions [U+9FA6-9FB3] to the CJK Unified range (covered by version 5.00 of classic TC fonts), and the GB18030 additions [U+9FB4-9FBB] to CJK Unified (covered by Microsoft YaHei & version 5.00 of classic SC font SimSun). For Plane-2 CJKV ideographs (the CJK Unified Ideograph Extension B), you’ll still need PMingLiU-ExtB (with TC glyphs for unified ideographs), also supplied with Windows Vista; these fonts still don’t cover all Plane-2 compatibility ideographs (the CJK Compatibility Ideographs Supplement, which is fully covered by Sun-ExtB).
Microsoft YaHei (微软雅黑) | 5.00 | Included with Windows Vista | Stylistically intended for Simplified Chinese (SC), its nearly-complete Plane-0 CJKV-ideograph repertoire also makes it somewhat appropriate for Traditional Chinese, Japanese, & ideographic Vietnamese [that is, ignoring glyph/stylistic differences between these languages]. It doesn’t, however, cover most of the supplementary radicals completely covered by Code2000, the HKSCS additions [U+9FA6-9FB3] to the CJK Unified range (covered by version 6.03 of classic TC fonts like PMingLiU), and a GB18030-originated ideograph [U+9FBA] covered by version 5.00 of classic SC font SimSun. For Plane-2 CJKV ideographs (the CJK Unified Ideograph Extension B), you’ll still need SimSun-ExtB (with SC glyphs for unified ideographs), also supplied with Windows Vista; these fonts still don’t cover all Plane-2 compatibility ideographs (the CJK Compatibility Ideographs Supplement, which is fully covered by Sun-ExtB).
Microsoft Yi Baiti | 5.00 | Included with Windows Vista | My IE default for Yi. Code2000 somewhat comparable.
Mongolian Baiti | 5.00 | Included with Windows Vista | Fully covers Mongolian script (except for one letter). For Windows 2000, XP, & 2003, Code2000 would be the best option.
MoolBoran | 5.00 | Included with Windows Vista | Although it’s (along with DaunPenh) the most complete Khmer font for Windows as far as I know, its glyphs are rather too small compared to other Khmer fonts out there. If glyph size is your concern, then Code2000 is your best option (and, like MoolBoran, also covers Khmer Symbols).
MPH 2B Damase | 2.000 | WAZU JAPAN’s Gallery of Unicode Fonts | This font, like Code2000, covers Pan-Euro, Armenian, Hebrew, Arabic, Thaana, Georgian (capitals only), Cherokee, Limbu, Buginese, Coptic, & Tifinagh. Unlike Code2000, it also features Tai Le, some symbols missing from other fonts listed here, Glagolitic, & the Georgian Supplement (Khutsuri). Its latest version also introduces Hanunóo & Syloti Nagri as Damase exclusives. It also covers Old Italic, Gothic, Ugaritic, Old Persian, Deseret, Shavian, Osmanya, Cypriot, Phoenician, Kharoshthi, and portions of Linear B (Code2001 still wins here) on Plane 1. It is my recommendation for Tai Le, Glagolitic, Georgian Supplement, & Kharoshthi and the only Windows option for Hanunóo & Syloti Nagri.
Musical Symbols | 1.01 | Unicode Fonts for Ancient Scripts | Covers Latin, Greek, Coptic, & Musical Symbols (Byzantine, Western, & Ancient Greek). So far the only font that fully covers Musical Symbols.
MV Boli | 5.00 | Included with Windows Vista; older version supplied with Windows XP & 2003 | My IE default for Thaana. Code2000 also works as well.
Nyala | 5.00 | Included with Windows Vista | My IE default for Ethiopic; it also includes Ethiopic Supplement. Code2000 also works as well.
Padauk | 2.2 | Padauk Graphite Font Home | So far the best-looking Myanmar font that complies with the encoding model as completely revised on Unicode 5.1.
PakType Tehreer | 1.3 | PakType - Pakistani Typography | This font contains Arabic presentation super-ligatures – those found on U+FDF0-FDFD. PakType Naqsh, also downloadable from the same site, also does the same as Tehreer.
Plantagenet Cherokee | 5.00 | Included with Windows Vista; older version included with Mac OS X | My IE default for Cherokee. Code2000 & MPH 2B Damase also cover Cherokee.
PMingLiU (新細明體) | 6.02 | Included with Windows Vista; older versions supplied with Windows 2000, XP, & 2003 | Though not the best-looking Plane-0 Traditional Chinese font, it is required to display the HKSCS-specific unified ideographs [U+9FA6-9FB3] and the CJK Strokes [U+31C0-31CF], all missing from Microsoft JhengHei & Arial Unicode MS. The same missing characters can also be found on MingLiU_HKSCS (細明體_HKSCS), a similar font also included with Windows Vista (except that it’s more specific to Hong Kong styling and adds HKSCS-compatibility ideographs, strokes, letters, kana, & symbols to Private Use Area, specifically ranges U+E000-EEB7 & U+F303-F848).
PMingLiU-ExtB (新細明體-ExtB) | 5.00 | Included with Windows Vista | This font covers the Plane-2 CJKV unified ideographs (the CJK Unified Ideograph Extension B, with TC-specific glyphs for unified ideographs) not covered by modern-style Microsoft TC font Microsoft JhengHei. It doesn’t, however, fully cover the Plane-2 compatibility ideographs (the CJK Compatibility Ideographs Supplement), which are fully covered by Sun-ExtB.
Raavi | 5.00 | Included with Windows Vista; older version supplied with Windows XP & 2003 | My default for Gurmukhi.
Segoe UI | 5.00 | Included with Windows Vista & Microsoft Office 2007 | One of the most complete Pan-Euro fonts out there in the Windows community, and also [for English-language versions of Windows] one that features the complete [Unicode 5.0] Pan-Arabic repertoire (minus presentation ligatures). It’s my default for Latin, Greek, Cyrillic, & Arabic.
Shruti | 5.00 | Included with Windows Vista; older version supplied with Windows XP & 2003 | My default for Gujarati.
SimSun (宋体) | 5.00 | Included with Windows Vista; older versions supplied with Windows 2000, XP, & 2003 | Though not the best-looking Simplified Chinese font, it is required to display a GB18030-specific ideograph [U+9FBA] missing from Microsoft YaHei.
SimSun-ExtB | 5.00 | Included with Windows Vista | This font covers the Plane-2 CJKV unified ideographs (the CJK Unified Ideograph Extension B, with SC-specific glyphs for unified ideographs) not covered by modern-style Microsoft SC font Microsoft YaHei. It doesn’t, however, fully cover the Plane-2 compatibility ideographs (the CJK Compatibility Ideographs Supplement), which are fully covered by Sun-ExtB.
Sun-ExtA | 5.01 | 海峰五笔・超大字符集・标准通用版・免费下载 | A nearly-complete CJKV font, also covering Yi. Apart from providing all Plane-0 CJKV ideographs (Extension-A & Unified, including HKSCS & GB 18030 additions), this font provides all CJK Radicals, all CJK Strokes (including Unicode 5.1 additions), Yijing Hexagrams, and all Plane-0 CJK Ideographs (minus Unicode 5.1 additions & U+F907, practically a duplicate of U+F908), including all KPS 10721 compatibility additions. It is my font recommendation for Supplementary Radicals, CJK Strokes, and Yijing (including the digrams & trigrams featured in the Miscellaneous Symbols range).
Sun-ExtB | 5.01 | 海峰五笔・超大字符集・标准通用版・免费下载 | Apart from providing all Unicode 5.0 Plane-2 CJKV ideographs (Extension-B) for which it was designed, this font covers Tai Xuan Jing (or Yijing Tetragrams) in Plane 1 and all Plane-2 CJK Compatibility Ideographs (the Compatibility Ideographs Supplement), making it the only complete Unicode 5.0 Plane-2 font so far. It is my recommendation for Tai Xuan Jing and the Compat. Ideographs Supplement. As a bonus, it also covers the Unified Ideograph Extension C as it was supposed to look on Unicode 5.1 before the unification/disunification scandal moved it to Unicode 5.2.
Sylfaen | 5.00 | Included with Windows Vista; older versions supplied with Windows 2000, XP, & 2003 | My default for modern Georgian, although it doesn’t include the Asomtavruli covered by Arial Unicode MS and the Nuskhuri provided by MPH 2B Damase.
Tagbanwa | 1.000 | Tagbanwa font | The one and only TrueType font that can display Tagbanwa.
Tahoma | 5.00 | Included with Windows Vista | One of the most complete Pan-Euro fonts out there in the Windows community, and also one that features complete Hebrew, Pan-Arabic (Unicode 5.0 only, minus presentation ligatures), & Thai. It’s my default for Thai.
Tunga | 5.00 | Included with Windows Vista; older version supplied with Windows XP & 2003 | My IE default for Kannada and the most complete one, though it still misses four characters which are covered by Code2000.
Unicode Symbols | 1.01 | Unicode Fonts for Ancient Scripts | Covers Pan-Euro (including several Unicode 5.1 additions for Greek & Cyrillic), all defined Symbols (Combining Marks, Letterlikes, Number Forms, Arrows, Math, Technicals, Miscellaneous, Dingbats, Yijing, Tai Xuan Jing, & Counting Rods, plus Mahjong/Domino Tiles in Plane 1). I use it to display most Combining Symbol Marks, all Technical Symbols, all Misc Symbols (including Unicode 5.1), all Misc Math-A Symbols (including Unicode 5.1), & some holes left by Arial Unicode MS, Cambria Math, Code2001, & MPH 2B Damase in Punctuation and Math Alphanum Symbols.
Urdu Nastaliq Unicode | 1.0 | Urdu Nastaliq Unicode | Additional Arabic Presentation ligatures.
Vrinda | 5.00 | Included with Windows Vista; older version supplied with Windows XP SP2 & 2003 SP1 | My IE default for Bengali.

Additional Unicode fonts may also be found through these two very useful font references: Alan Wood’s Unicode Resources and WAZU JAPAN’s Gallery of Unicode Fonts.

Further Information on Unicode

For further help on Unicode, please visit the Unicode Consortium.

Contact Me

For feedback related to this website, Leroy Vargas can be contacted through his Lycos address.