
IVS: How Unicode Represents 47 Versions of the Same Kanji

Understanding Ideographic Variation Sequences and Standardized Variation Sequences, with live font rendering of all registered variants.

The Problem: One Code Point, Many Shapes

Han Unification merged characters that share the same origin into single code points. But what happens when you need to specify an exact glyph variant? Japanese names, historical documents, and calligraphic traditions demand precise glyph control beyond what a font's default rendering provides.

For example, the character 辻 (U+8FBB, “tsuji”, a common Japanese surname) has two accepted forms: one whose shinnyō radical (辶) is written with one dot (一点しんにょう) and one with two dots (二点しんにょう). Both are “correct”, but which one appears depends on the font, and the base code point alone gives no way to choose.

Unicode's solution is Variation Sequences: a base character followed by a special variation selector character that specifies the exact glyph form.

How IVS Works: The E0100 Range

Ideographic Variation Sequences (IVS) use variation selectors from the range U+E0100 through U+E01EF (240 selectors, called VS17 through VS256). An IVS is a two-character sequence:

Base character + Variation Selector = IVS

Example:
葛 (U+845B) + VS17 (U+E0100) = 葛󠄀 (specific variant)
葛 (U+845B) alone            = 葛  (default glyph)

The variation selector is invisible — it produces no glyph of its own. But a font that supports IVS will render a different glyph when it encounters the sequence.

Component	Code point	Visible?
Base character: 葛	U+845B	Yes
Variation selector: VS17	U+E0100	No (invisible)
Sequence: 葛󠄀	U+845B U+E0100	Yes (variant glyph)

In JavaScript, each variation selector in this range requires a surrogate pair (2 UTF-16 code units), so an IVS takes 3 or 4 code units in total (3 when the base character is in the BMP, 4 when the base itself needs a surrogate pair), despite being one grapheme cluster.
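The code-unit arithmetic can be checked directly. The sketch below assumes a hypothetical helper, makeIVS, which is not a built-in API; it only appends the selector for a given VS number using standard string methods. The second example pairs a supplementary-plane base character with VS17 purely to count code units, and is not meant to imply that the sequence is registered.

```javascript
// Hypothetical helper: build an IVS from a base character and a VS number (17 to 256).
// VS17 is U+E0100, so the selector's code point is 0xE0100 + (vsNumber - 17).
function makeIVS(base, vsNumber) {
  if (!Number.isInteger(vsNumber) || vsNumber < 17 || vsNumber > 256) {
    throw new RangeError("IVS selectors are VS17 through VS256");
  }
  return base + String.fromCodePoint(0xE0100 + (vsNumber - 17));
}

const ivs = makeIVS("\u845B", 17); // 葛 + VS17

console.log(ivs.length);      // 3 UTF-16 code units: 1 for 葛, 2 for the selector
console.log([...ivs].length); // 2 code points

// A base character outside the BMP also needs a surrogate pair: 2 + 2 = 4 units.
const ext = makeIVS("\u{20000}", 17);
console.log(ext.length);      // 4
```

Spreading the string (or using for...of) iterates by code point, which is why the second count is 2 rather than 3.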


SVS: The Emoji and Symbol Variation Selectors

Standardized Variation Sequences (SVS) use a different, smaller set of variation selectors: U+FE00 through U+FE0F (VS1 through VS16). These are used for:

Selector	Common use	Example
VS1 (U+FE00)	CJK compatibility variants	芦 + VS1 for specific form
VS15 (U+FE0E)	Text presentation	☺︎ (text style)
VS16 (U+FE0F)	Emoji presentation	☺️ (emoji style)

The most widely known SVS usage is the text/emoji toggle. Many characters have both a text presentation (monochrome, simple) and an emoji presentation (colorful). VS15 forces text style, VS16 forces emoji style:

// Same base character, different presentations:
"\u2764"           // ❤ (no selector: text by default per Unicode, though many platforms show emoji)
"\u2764\uFE0E"    // ❤︎ (text presentation, VS15)
"\u2764\uFE0F"    // ❤️ (emoji presentation, VS16)

// The selectors are invisible but change rendering:
"❤️".length  // 2 (base + VS16, both in BMP)

Unlike IVS selectors, which live in Plane 14 (the Supplementary Special-purpose Plane, SSP) and need surrogate pairs, SVS selectors are in the BMP (U+FE00 through U+FE0F) and each take just one UTF-16 code unit.
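The two selector ranges can be told apart with simple range checks. The following is a minimal sketch; describeSelector is a hypothetical name, and the "SVS" and "IVS" labels follow this article's grouping of the two ranges rather than any official API.

```javascript
// Classify a code point as VS1 to VS16 (BMP), VS17 to VS256 (Plane 14), or neither.
function describeSelector(codePoint) {
  if (codePoint >= 0xFE00 && codePoint <= 0xFE0F) {
    return { kind: "SVS", vs: codePoint - 0xFE00 + 1, codeUnits: 1 };
  }
  if (codePoint >= 0xE0100 && codePoint <= 0xE01EF) {
    return { kind: "IVS", vs: codePoint - 0xE0100 + 17, codeUnits: 2 };
  }
  return null; // not a variation selector in either range
}

console.log(describeSelector(0xFE0F));  // { kind: 'SVS', vs: 16, codeUnits: 1 }
console.log(describeSelector(0xE0100)); // { kind: 'IVS', vs: 17, codeUnits: 2 }
console.log(describeSelector(0x2764));  // null
```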

IVD Collections: Adobe-Japan1 and Moji_Joho

Which variation selectors map to which glyphs is not arbitrary — it is recorded in the Ideographic Variation Database (IVD), maintained by the Unicode Consortium. The IVD contains named collections:

Collection	Scope	Entries
Adobe-Japan1	Japanese typography (AJ1 CID)	~14,700
Moji_Joho	Japanese government character info	~11,000
Hanyo-Denshi	Japanese administrative systems	~11,000
KRName	Korean personal name variants	~2,200

Adobe-Japan1 is the most widely supported collection. It maps IVS sequences to specific CID (Character ID) numbers in the Adobe-Japan1-7 character collection, which professional Japanese fonts implement. A font that supports Adobe-Japan1 IVS can render thousands of glyph variants.

Moji_Joho (文字情報) is maintained by Japan's Information-technology Promotion Agency (IPA) and focuses on character variants used in official government documents and the family register system (戸籍).
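To work with these registrations programmatically, the sequence list can be parsed line by line. The sketch below assumes each data line has the form "base selector; collection; identifier" with "#" starting a comment; check this against the published IVD data file before relying on it. The function name parseIvdLine is hypothetical, and the sample identifiers are placeholders, not verified registrations.

```javascript
// Parse one line of IVD-style data, assuming the form
//   "<base hex> <selector hex>; <collection>; <identifier>"
// Returns null for blank lines and "#" comments.
function parseIvdLine(line) {
  const text = line.split("#")[0].trim();
  if (text === "") return null;
  const [sequence, collection, identifier] = text.split(";").map(s => s.trim());
  const [baseHex, selectorHex] = sequence.split(/\s+/);
  const base = parseInt(baseHex, 16);
  const selector = parseInt(selectorHex, 16);
  return {
    base,
    selector,
    collection,
    identifier,
    sequence: String.fromCodePoint(base, selector), // the IVS as a string
  };
}

// Illustrative input only: the identifiers below are placeholders.
const sample = [
  "# hypothetical excerpt",
  "845B E0100; Adobe-Japan1; CID+0000",
  "9089 E0101; Moji_Joho; MJ000000",
];
const entries = sample.map(parseIvdLine).filter(e => e !== null);

// Group the parsed entries by collection name.
const byCollection = new Map();
for (const entry of entries) {
  const list = byCollection.get(entry.collection) || [];
  list.push(entry);
  byCollection.set(entry.collection, list);
}
console.log([...byCollection.keys()]); // [ 'Adobe-Japan1', 'Moji_Joho' ]
```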

Font Support: When IVS Actually Works

IVS only works if the font supports it. A font must contain:

The glyph variants for each supported IVS sequence
A cmap table (specifically format 14, Unicode Variation Sequences) that maps base+selector pairs to glyphs

Major fonts with IVS support include:

Font	Collection	Platform
IPAmj Mincho	Moji_Joho	Cross-platform (free)
Noto Sans CJK	Adobe-Japan1 (partial)	Cross-platform (free)
Kozuka Mincho	Adobe-Japan1	Adobe products
Yu Mincho	Adobe-Japan1 (partial)	Windows / macOS
Hiragino Mincho	Adobe-Japan1 (partial)	macOS

If a font does not support a particular IVS, it simply renders the base character's default glyph and ignores the variation selector. This is a graceful fallback — the text remains legible, just not in the specific variant requested.
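Application code sometimes wants the same fallback behavior for search or comparison, treating a variation sequence as equal to its base character. A minimal sketch follows; stripVariationSelectors and sameBaseText are hypothetical names, and the approach assumes that dropping the selectors is acceptable for the use case. It is not acceptable where the variant itself is significant, such as family register data.

```javascript
// Matches VS1 to VS16 (U+FE00 to U+FE0F) and VS17 to VS256 (U+E0100 to U+E01EF).
// The "u" flag is required so that \u{...} escapes and astral ranges work.
const VARIATION_SELECTORS = /[\uFE00-\uFE0F\u{E0100}-\u{E01EF}]/gu;

function stripVariationSelectors(text) {
  return text.replace(VARIATION_SELECTORS, "");
}

function sameBaseText(a, b) {
  return stripVariationSelectors(a) === stripVariationSelectors(b);
}

const plain = "\u8FBB";            // 辻
const variant = "\u8FBB\u{E0100}"; // 辻 + VS17

console.log(plain === variant);            // false
console.log(sameBaseText(plain, variant)); // true
```

Note that this also strips VS15 and VS16, so text and emoji presentations of the same symbol compare as equal too.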


The Record Holder: 邉 and Its 47 Variants

The character 邉 (U+9089) holds the record for the most registered IVS sequences. It has approximately 47 variant forms in the Moji_Joho collection, reflecting the many ways this character has been written in Japanese family registers over the centuries.

The surname 渡邉 (Watanabe) is notorious in Japan for having dozens of variant spellings. Municipal offices maintaining family registers need to faithfully reproduce the exact variant used in each family's records, which is why the Moji_Joho collection registers so many forms.

Character	IVS variants (Moji_Joho)	Typical use
邉 U+9089	~47	渡邉 surname variants
邊 U+908A	~30	渡邊 surname variants
辺 U+8FBA	~10	渡辺 surname variants
葛 U+845B	~8	Place names (葛飾 etc.)

This is a case where IVS is essential: without it, government systems could not accurately record the legally distinct name variants that Japanese law requires them to preserve.

// The 邉 character with different IVS:
"邉"                    // Default glyph
"邉\u{E0100}"          // Variant 1 (VS17)
"邉\u{E0101}"          // Variant 2 (VS18)
// ... up to ~47 registered variants

// Each is one grapheme cluster:
const seg = new Intl.Segmenter();
[...seg.segment("邉\u{E0100}")].length  // 1
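When handling names like these, it can help to list which characters in a string carry a selector. The sketch below uses a hypothetical helper, findVariationSequences, which walks the string by code point and reports each base and selector pair. It does not check whether a sequence is actually registered in the IVD, and it assumes the character before a selector is its base.

```javascript
// Report every (base, selector) pair found in a string.
function findVariationSequences(text) {
  const found = [];
  const chars = [...text]; // iterate by code point, not by UTF-16 code unit
  for (let i = 1; i < chars.length; i++) {
    const cp = chars[i].codePointAt(0);
    const isIvs = cp >= 0xE0100 && cp <= 0xE01EF; // VS17 to VS256
    const isSvs = cp >= 0xFE00 && cp <= 0xFE0F;   // VS1 to VS16
    if (isIvs || isSvs) {
      found.push({
        base: chars[i - 1],
        selector: "U+" + cp.toString(16).toUpperCase(),
        vs: isIvs ? cp - 0xE0100 + 17 : cp - 0xFE00 + 1,
      });
    }
  }
  return found;
}

const name = "渡邉\u{E0101}"; // 渡 + 邉 with VS18
console.log(findVariationSequences(name));
// [ { base: '邉', selector: 'U+E0101', vs: 18 } ]
```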

