A Guide to Using Myanmar Unicode

Web Development

Background

Many existing Myanmar websites use images to represent Myanmar text because of the problems with legacy fonts that place Myanmar glyphs on the code points of English (Latin) characters. A few sites use a specific font, but if the user does not have a font of exactly that name installed, the text appears as an apparently random sequence of English (i.e. Latin) characters.

One of the key benefits of Unicode is that the script can be identified from the numbers (code points) used to represent each character. In theory, the browser should therefore be able to use an alternative font to display the text correctly, even if it can't find a font of precisely the right name.

Unfortunately, some browsers are unable to display Myanmar Unicode correctly. As a result, some web developers are using pseudo-Unicode fonts, which mix some Unicode code points with non-standard ones. This makes it harder to adopt real Unicode: text written for a pseudo-Unicode font will display completely wrong in a Unicode enabled browser if a real Unicode font is installed, and an installed pseudo-Unicode font may prevent Unicode compliant pages from displaying correctly.

This page outlines how web pages can be designed to use Myanmar Unicode, whilst still allowing for non-compliant browsers.

Detecting Myanmar Unicode Support

There is no point in writing a Myanmar Unicode web site if the user can't read it. It is therefore useful to be able to detect whether a user has a Myanmar Unicode enabled web browser. Fortunately, this is fairly easy to do.

A simple detection algorithm can be written by comparing the width of ကက (U+1000 U+1000) with the width of က္က (U+1000 U+1039 U+1000). A Myanmar Unicode enabled browser renders U+1000 U+1039 U+1000 with the two consonants stacked on top of each other, so it should be about half the width of U+1000 U+1000. A non-compliant browser does not interpret U+1039 correctly and renders the two consonants side by side, about twice the width of the correctly stacked form. If no Myanmar font is installed, the sequence appears as three blank squares, about three times that width. There are many possible variations on this test.
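The width comparison described above could be sketched roughly as follows. This is a minimal illustration, not the detection code used on this site: the function names (`classifyMyanmarSupport`, `measureWidth`, `detectMyanmarSupport`) and the ratio thresholds 0.75 and 1.25 are assumptions chosen for the example, and the boundary between the "unsupported" and "no-font" cases is only approximate.

```javascript
// Test strings: two plain consonants, and the same pair joined by U+1039.
const DOUBLE = "\u1000\u1000";        // ကက
const STACKED = "\u1000\u1039\u1000"; // က္က

// Classify support from the two measured widths (in pixels).
// The thresholds are illustrative assumptions, not measured values.
function classifyMyanmarSupport(stackedWidth, doubleWidth) {
  if (!(stackedWidth > 0) || !(doubleWidth > 0)) return "unknown";
  const ratio = stackedWidth / doubleWidth;
  if (ratio < 0.75) return "supported";   // stacked form: about half the width
  if (ratio < 1.25) return "unsupported"; // consonants drawn side by side
  return "no-font";                       // three fallback squares: about 1.5x
}

// Measure the rendered width of a string using a hidden, unwrapped span.
function measureWidth(doc, text) {
  const span = doc.createElement("span");
  span.style.visibility = "hidden";
  span.style.position = "absolute";
  span.style.whiteSpace = "nowrap";
  span.textContent = text;
  doc.body.appendChild(span);
  const width = span.offsetWidth;
  doc.body.removeChild(span);
  return width;
}

// Run the whole test against a document (pass `document` in a browser).
function detectMyanmarSupport(doc) {
  return classifyMyanmarSupport(
    measureWidth(doc, STACKED),
    measureWidth(doc, DOUBLE)
  );
}
```

In a browser this would be called as `detectMyanmarSupport(document)` once the page body exists. Any result other than "supported" would be a reason to switch to one of the fallback renderings.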

If the user does not have Myanmar Unicode support, you can switch to images, embedded SVG (Scalable Vector Graphics), VML (Microsoft's Vector Markup Language) or Canvas objects. The myUnicode detection algorithm used on this site automatically switches to Canvas objects in Firefox and Opera, since these can be dynamically scaled and coloured. On IE it uses VML. It uses a predefined list of Myanmar syllables and displays one image per syllable. This relies on having a complete syllable list, which is not easy to obtain, especially for rare stacked characters and foreign words. This has several advantages over the old method of creating an image for every line or paragraph.

The Canvas solution is much neater than images, though it still requires knowledge of which sequences need complex rendering. Unfortunately IE does not yet have SVG or Canvas support, so VML is used instead. The Canvas algorithm used here relies on data generated using the grsvg program. The embedded Canvas or VML objects are inserted dynamically using JavaScript in place of the original Unicode text. The characters are drawn as path elements, using precached positions for each Unicode sequence.
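One way to find which precached sequence applies at each point in the text is a greedy longest-match against the list of known syllables. The sketch below illustrates that idea only. It is not the algorithm used on this site, the function name `segmentBySyllableList` is hypothetical, and the real format of the generated data files is not assumed here: the known sequences are simply passed in as a Set of strings.

```javascript
// Split text into pieces, preferring the longest sequence found in the
// known syllable list at each position. Characters that match nothing
// are returned one at a time and flagged as unknown.
function segmentBySyllableList(text, knownSyllables) {
  // Longest known sequence bounds how far ahead we need to look.
  let maxLen = 0;
  for (const s of knownSyllables) maxLen = Math.max(maxLen, s.length);

  const pieces = [];
  let i = 0;
  while (i < text.length) {
    let match = null;
    // Try the longest candidate first, then shorter ones.
    for (let len = Math.min(maxLen, text.length - i); len > 0; len--) {
      const candidate = text.slice(i, i + len);
      if (knownSyllables.has(candidate)) {
        match = candidate;
        break;
      }
    }
    if (match !== null) {
      pieces.push({ text: match, known: true });
      i += match.length;
    } else {
      pieces.push({ text: text[i], known: false });
      i += 1;
    }
  }
  return pieces;
}
```

Each piece flagged as known could then be drawn from its precached data, while unknown pieces show where the syllable list is incomplete. Myanmar characters all lie in the Basic Multilingual Plane, so indexing the string by UTF-16 code unit is safe for them.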

These JavaScript files are generated from a TTF font as follows:

grsvg --svg-font Padauk.ttf
xsltproc svgFont2json.xsl Padauk.svg > Padauk.js
grsvg Padauk.ttf -i syllables.txt -j PadaukRendered.js

Virtual Keyboard

Many websites have forms which require the user to type input. But how can the user type Myanmar Unicode if they don't have a suitable input method installed? One method is to use a JavaScript keyboard. This can be used even without Myanmar Unicode support, by converting the text to images.
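The core of such a keyboard is small: each on-screen key inserts its character at the caret of the form field. The following is a minimal sketch under that assumption, not the keyboard code from this site. The names `insertAtCaret`, `toCodePoints` and `typeKey` are hypothetical, and `field` is assumed to be a standard text input or textarea.

```javascript
// Insert `ch` into `value`, replacing the selection [start, end).
// Returns the new value and the caret position after the insertion.
function insertAtCaret(value, start, end, ch) {
  return {
    value: value.slice(0, start) + ch + value.slice(end),
    caret: start + ch.length,
  };
}

// Format a string as its Unicode code points, for showing key codes.
function toCodePoints(text) {
  return Array.from(text, (c) =>
    "U+" + c.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");
}

// Browser wiring: call this from the click handler of an on-screen key.
function typeKey(field, ch) {
  const result = insertAtCaret(
    field.value,
    field.selectionStart,
    field.selectionEnd,
    ch
  );
  field.value = result.value;
  field.selectionStart = field.selectionEnd = result.caret;
  field.focus(); // keep the caret visible after clicking a key
}
```

For example, `toCodePoints("\u1000\u1039\u1000")` gives `U+1000 U+1039 U+1000`. Keeping the string handling in `insertAtCaret` separate from the DOM wiring in `typeKey` means the same logic can feed either a normal text field or an image-based rendering.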



Examples

You can download this code as examples to use on your own website. The latest version is in Mercurial:

hg clone http://thanlwinsoft.co.uk/cgi-bin/hgwebdir.cgi/myWebDevelopment/
