Ruby character

Ruby characters (ルビ) are small annotative glosses placed above or to the right of a logographic character, as used in written Chinese or Japanese, to show its pronunciation. Typically called just ruby or rubi, such annotations are usually used as a pronunciation guide for relatively obscure characters.

Examples
Here is an example of Japanese ruby characters (called furigana) for Tokyo ("東京"):

Note: The font size is increased to show details.

Most furigana (Japanese ruby characters) are written with the hiragana syllabary, but katakana and romaji are also occasionally used. Sometimes foreign words (usually English) are printed with furigana implying the meaning, and vice versa. Textbooks usually write on-readings with katakana and kun-readings with hiragana.

Here is an example of the Chinese ruby characters for Beijing ("北京"):

In Taiwan, the syllabary used for Chinese ruby characters is Zhuyin Fuhao (also known as BoPoMoFo); in mainland China, Hanyu Pinyin is used. Typically, Zhuyin is used with traditional vertical writing and is written to the right of the characters. In mainland China, horizontal script is used and ruby characters (pinyin) are written above the Chinese characters.

Books with phonetic guides are popular with children and foreigners learning Chinese (especially pinyin).

Uses of ruby
Ruby may be used for different reasons:


 * because the character is rare and the pronunciation unknown to many—personal name characters often fall into this category;
 * because the character has more than one pronunciation, and the context is insufficient to determine which to use;
 * because the intended readers of the text are still learning the language and are not expected to always know the pronunciation and/or meaning of a term;
 * because the author is using a nonstandard pronunciation for the characters—for example, comic books often employ ruby to emphasize dajare puns.

Ruby may also be used to show the meaning, rather than the pronunciation, of a possibly unfamiliar (usually foreign) or slang word. This is generally used with spoken dialogue and applies only to Japanese publications. The most common form of ruby is called furigana or yomigana and is found in Japanese instructional books, newspapers, comics and books for children.

In Japanese, certain characters, such as the sokuon (促音) (っ) that indicates a pause before the consonant it precedes, are normally written at about half the size of normal characters. When written as ruby, such characters are usually the same size as other ruby characters. Advancements in technology now allow certain characters to render accurately.

In Chinese, the practice of providing phonetic cues via ruby is rare, but it does occur systematically in grade-school textbooks and dictionaries. The Chinese have no special name for this practice, as it is not as widespread as in Japan. In Taiwan, it is known as Zhuyin, from the name of the phonetic system employed for this purpose there. It is virtually always used vertically, because publications are normally in a vertical format and Zhuyin is not as easy to read when presented horizontally. Where Zhuyin is not used, other Chinese phonetic systems such as Hanyu Pinyin are employed.

Ruby characters are not usually used for word-for-word translations between languages, even for identical traditional Chinese characters, for several reasons: all natural languages include idioms (combinations of words whose meaning differs from that of the individual words), the relationship between non-adjacent words is often hard to capture, and there is usually no exact and unique translation for a given word. There are also challenges if the original and translated languages run in different directions (e.g., English reads left to right, but Hebrew reads right to left). A common exception involves the Christian Bible, which was originally written in Koine Greek, Hebrew, and some Aramaic. Only a small percentage of people can read these original languages proficiently, so many publications of the Bible in its original languages incorporate ruby text with word-by-word translations into another language, such as English, as an aid. Such documents are often termed interlinear documents (the emphasis being on providing translated text "between the lines"). They often also include a separate full translation of the text rather than relying on ruby characters alone, though there are exceptions.

Ruby annotation can also be used in handwriting.

History


In British typography, ruby was originally the name for type with a height of 5.5 points, used for interlinear annotations in printed documents. In Japanese, rather than referring to the name of a font size, the word came to refer to typeset furigana. When transliterated back into English, the word was rendered in some texts as "rubi" (the typical romanization of the Japanese word ルビ). However, the spelling "ruby" has become more common since a W3C recommendation for ruby markup was published.

In the U.S., this type size was called "agate", at least before the 1950s.

Ruby in Unicode
Unicode and its companion standard, the Universal Character Set, support ruby via these interlinear annotation characters:
 * Code point U+FFF9 (hex): Interlinear annotation anchor, marks start of annotated text
 * Code point U+FFFA (hex): Interlinear annotation separator, marks start of annotating character(s)
 * Code point U+FFFB (hex): Interlinear annotation terminator, marks end of annotated text

Unicode Technical Report #20 clarifies that these characters are not intended to be exposed to users of markup languages and software applications. It suggests that ruby markup be used instead, where appropriate.
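As a rough illustration of that suggestion, the following Python sketch converts text that carries the three interlinear annotation characters into simple ruby markup. The function name `annotations_to_ruby` is hypothetical, and the choice of `<rb>`, `<rt>` and `<rp>` elements for the output is an assumption made for this example rather than something the technical report prescribes. The characters are looked up by their Unicode names with the standard `unicodedata` module; nested annotations and HTML escaping are not handled.

```python
import re
import unicodedata

# Look the three characters up by their Unicode names instead of hard-coding them.
ANCHOR = unicodedata.lookup("INTERLINEAR ANNOTATION ANCHOR")
SEPARATOR = unicodedata.lookup("INTERLINEAR ANNOTATION SEPARATOR")
TERMINATOR = unicodedata.lookup("INTERLINEAR ANNOTATION TERMINATOR")

# Matches: anchor, annotated text, separator, annotating text, terminator.
_ANNOTATION = re.compile(
    re.escape(ANCHOR) + "(.*?)" + re.escape(SEPARATOR) + "(.*?)" + re.escape(TERMINATOR),
    re.DOTALL,
)


def annotations_to_ruby(text: str) -> str:
    """Replace each annotated run with simple ruby markup (illustrative only)."""

    def to_markup(match: re.Match) -> str:
        base, annotation = match.group(1), match.group(2)
        return f"<ruby><rb>{base}</rb><rp>(</rp><rt>{annotation}</rt><rp>)</rp></ruby>"

    return _ANNOTATION.sub(to_markup, text)


sample = f"{ANCHOR}東京{SEPARATOR}とうきょう{TERMINATOR}に行く"
converted = annotations_to_ruby(sample)
print(converted)
```

Text without any annotation characters passes through unchanged, so the function can be applied to mixed input.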

Ruby in ANSI
ISO/IEC 6429 (also known as ECMA-48), which defines the ANSI escape codes, also provided a mechanism for ruby text for use by text terminals. The PARALLEL TEXTS (PTX) escape code accepted six parameter values, giving the following escape sequences for marking ruby text (CSI is the Control Sequence Introducer):
 * CSI 0 \ (or simply CSI \, since 0 is used as the default value for this control): end of parallel texts
 * CSI 1 \: beginning of a string of principal parallel text
 * CSI 2 \: beginning of a string of supplementary parallel text
 * CSI 3 \: beginning of a string of supplementary Japanese phonetic annotation
 * CSI 4 \: beginning of a string of supplementary Chinese phonetic annotation
 * CSI 5 \: end of a string of supplementary phonetic annotations
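The six sequences can be sketched in Python as follows. ECMA-48 represents PTX as CSI, a numeric parameter, and the final byte 05/12 (a backslash); the 7-bit form of CSI (ESC followed by `[`) is used here. The helper names `ptx` and `japanese_ruby`, and the particular ordering of sequences around the base text and its reading, are assumptions made for this example. The code only builds the string and does not assume that any given terminal renders it.

```python
CSI = "\x1b["  # 7-bit Control Sequence Introducer: ESC followed by "["

# PTX parameter values, in the order of the list above.
PTX_END = 0
PTX_PRINCIPAL = 1
PTX_SUPPLEMENTARY = 2
PTX_JAPANESE_PHONETIC = 3
PTX_CHINESE_PHONETIC = 4
PTX_PHONETIC_END = 5


def ptx(parameter: int = PTX_END) -> str:
    """Build one PTX control sequence: CSI, the parameter, then a backslash."""
    if not 0 <= parameter <= 5:
        raise ValueError("PTX accepts parameter values 0 to 5")
    return f"{CSI}{parameter}\\"


def japanese_ruby(base: str, reading: str) -> str:
    """Wrap base text and its reading in PTX sequences (assumed ordering)."""
    return (
        ptx(PTX_PRINCIPAL) + base
        + ptx(PTX_JAPANESE_PHONETIC) + reading
        + ptx(PTX_PHONETIC_END)
    )


encoded = japanese_ruby("東京", "とうきょう")
print(repr(encoded))
```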

Ruby markup
In 2001, the W3C published the Ruby Annotation specification for supplementing XHTML with ruby markup. Ruby markup is not a standard part of HTML 4.01 or any of the XHTML 1.0 specifications (XHTML-1.0-Strict, XHTML-1.0-Transitional, and XHTML-1.0-Frameset), but was incorporated into the XHTML 1.1 specification.

Support for ruby markup in web browsers is limited, as XHTML 1.1 is not yet widely implemented. Ruby markup is partially supported by Microsoft Internet Explorer (5.0+) for Windows and Macintosh, but is not supported by Mozilla, Firefox (though see below), Safari/Konqueror or Opera. The WebKit nightly builds have recently added support for Ruby HTML markup.

For these browsers, Ruby support is most easily added by using CSS rules which can be found on the web.

Ruby markup support can also be added to some browsers that support custom extensions. For example, there is an extension which allows Netscape 7, Mozilla, and Firefox to properly render ruby markup under certain circumstances. This extension is freely available for users of these browsers.

Ruby markup is structured such that a fallback rendering, consisting of the ruby characters in parentheses immediately after the main text, will appear if the browser does not have support for ruby.
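The fallback behaviour can be approximated in a few lines of Python. This is a minimal sketch: the helper names `simple_ruby`, `fallback_rendering` and `ruby_aware_text` are hypothetical, and a browser without ruby support is modelled simply as one that ignores tags it does not know while keeping their text content.

```python
import re


def simple_ruby(base: str, ruby_text: str) -> str:
    """Build simple ruby markup with <rp> parentheses as the fallback."""
    return f"<ruby><rb>{base}</rb><rp>(</rp><rt>{ruby_text}</rt><rp>)</rp></ruby>"


def fallback_rendering(markup: str) -> str:
    """Model a browser without ruby support: drop the tags, keep all text."""
    return re.sub(r"<[^>]+>", "", markup)


def ruby_aware_text(markup: str) -> str:
    """Model a ruby-aware browser's text: <rp> content is hidden, tags dropped."""
    without_rp = re.sub(r"<rp>.*?</rp>", "", markup)
    return re.sub(r"<[^>]+>", "", without_rp)


markup = simple_ruby("東", "とう") + simple_ruby("京", "きょう")
print(fallback_rendering(markup))  # 東(とう)京(きょう)
print(ruby_aware_text(markup))     # 東とう京きょう (positioning is not modelled)
```

The parentheses live only inside the `<rp>` elements, which is why they appear in the fallback and disappear when ruby is supported.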

The W3C is also working on a specific ruby module for the upcoming CSS level 3.

Ruby markup examples
Below are a few examples of ruby markup. The markup is shown first, and the rendered markup is shown next, followed by the unmarked version. Web browsers will either render it with the correct size and positioning as shown in the table-based examples above, or will use the fallback rendering with the ruby characters in parentheses:


 * Markup

<ruby><rb>東</rb><rp>(</rp><rt>とう</rt><rp>)</rp></ruby> <ruby><rb>京</rb><rp>(</rp><rt>きょう</rt><rp>)</rp></ruby>
 * Rendered

東(とう)京(きょう)
 * Unmarked

<ruby style="font-size:1.2em;"><rb>北</rb><rp>(</rp><rt>ㄅㄟˇ</rt><rp>)</rp></ruby> <ruby style="font-size:1.2em;"><rb>京</rb><rp>(</rp><rt>ㄐㄧㄥ</rt><rp>)</rp></ruby>

Note that Chinese ruby text would normally be displayed in vertical columns to the right of each character. This approach is not typically supported in browsers at present.

This is a table-based example of vertical columns:

Complex ruby markup
Complex ruby markup makes it possible to associate more than one ruby text with a base text, or parts of ruby text with parts of base text.

It is not supported by most browsers, but there is an extension for Firefox that supports it.
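As an illustration, the Python sketch below generates complex ruby markup that attaches two ruby texts to one base: a per-character reading and a single gloss spanning the whole base. The container elements `rbc` and `rtc` and the `rbspan` attribute are taken from the W3C Ruby Annotation specification; the function name `complex_ruby` and the English gloss are assumptions made for this example.

```python
from html import escape


def complex_ruby(bases, readings, spanning_text):
    """Build complex ruby: one <rt> per base, plus one <rt> spanning all bases."""
    if len(bases) != len(readings):
        raise ValueError("each base needs exactly one reading")
    base_container = "".join(f"<rb>{escape(b)}</rb>" for b in bases)
    first_texts = "".join(f"<rt>{escape(r)}</rt>" for r in readings)
    second_text = f'<rt rbspan="{len(bases)}">{escape(spanning_text)}</rt>'
    return (
        f"<ruby><rbc>{base_container}</rbc>"
        f"<rtc>{first_texts}</rtc>"
        f"<rtc>{second_text}</rtc></ruby>"
    )


markup = complex_ruby(["東", "京"], ["とう", "きょう"], "Tokyo")
print(markup)
```

The first `rtc` container pairs its ruby texts with the bases one to one, while the second uses `rbspan` to cover both bases with a single annotation.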