Posted on 3rd December 2009
The Unicode Character Set


Unicode was originally designed as a 16-bit character set, and it is the character set used by most modern programming languages. Its code space has since grown beyond 16 bits: it now runs from U+0000 to U+10FFFF, which leaves room for just over a million code points rather than several million. Alongside the ordinary ASCII characters, it covers the characters needed for international text, including those used in Asian languages.

Unicode is used not only in recent programming languages such as Java; it also covers scientific and mathematical symbols and even the scripts of historic languages that are no longer in use.

The Unicode character set was created by a consortium of companies including Apple, Microsoft, HP, Digital and IBM, whose aim was to produce a single standard. Version 1.1, released in 1993, was aligned with the ISO/IEC 10646 standard. The Windows NT operating system also uses this character set.

In the original 16-bit encoding, every character occupies the same amount of space: two bytes. The character set shares its first 256 values with the ISO-Latin-1 (ISO 8859-1) character set, which formed the basis of the character sets in earlier operating systems such as Windows 3.1 and Windows 95.
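The overlap with ISO-Latin-1 can be checked directly in Java. The following is a minimal sketch, not a definitive test: the class and method names are made up for illustration, and it assumes a Java version that provides `StandardCharsets` (Java 7 or later).

```java
import java.nio.charset.StandardCharsets;

final class Latin1Overlap {
    // Returns true if every ISO-8859-1 byte value (0..255) decodes to the
    // Unicode character that has the same numeric value.
    static boolean latin1MatchesUnicode() {
        for (int value = 0; value < 256; value++) {
            byte[] single = { (byte) value };
            String decoded = new String(single, StandardCharsets.ISO_8859_1);
            if (decoded.length() != 1 || decoded.charAt(0) != value) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(latin1MatchesUnicode()); // prints "true"
    }
}
```

Because the two sets agree on these values, text stored as ISO-Latin-1 can be widened to Unicode without a lookup table.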

In addition to the characters of the ASCII character set, the version of Unicode described here defines 34,168 distinct coded characters, and later versions define many more. Each character is encoded only once, and Unicode assigns it a unique name and a code value. Unicode also defines combining characters, such as accents, which follow the base character that they modify.
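To make the ideas of a unique name, a code value, and a combining accent concrete, here is a small Java sketch. The class and method names are hypothetical, and it assumes Java 7 or later, where `Character.getName` is available; `java.text.Normalizer` is used only as one convenient way to merge a base character with its accent.

```java
import java.text.Normalizer;

final class NamesAndAccents {
    // Formats a code point as its code value followed by its Unicode name.
    static String describe(int codePoint) {
        return String.format("U+%04X %s", codePoint, Character.getName(codePoint));
    }

    // Merges base characters with the combining accents that follow them,
    // wherever a precomposed character exists (normalization form NFC).
    static String compose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0041)); // prints "U+0041 LATIN CAPITAL LETTER A"
        System.out.println(describe(0x0301)); // prints "U+0301 COMBINING ACUTE ACCENT"

        // The letter e followed by a combining acute accent: two characters.
        String decomposed = "e\u0301";
        String composed = compose(decomposed);
        System.out.println(decomposed.length()); // prints "2"
        System.out.println(composed.length());   // prints "1"
    }
}
```

Both strings display as the same accented letter; only the number of stored characters differs.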

Java programs, like those in many other languages, use characters from the Unicode character set, and different Java releases are tied to different versions of Unicode. For example, Java 1.0 uses Unicode version 1.1, Java 1.1 supports Unicode version 2.0, and later releases have moved to newer versions. Note also that not every web browser displays every Unicode character. For example, the mathematical characters used with MathML are not fully supported by Internet Explorer 6.0 or above.
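As a rough illustration of how Java exposes Unicode, the sketch below writes a character as a Unicode escape and then looks at a mathematical character that does not fit in a single 16-bit `char`. The class and method names are invented for this example, and the code-point methods assume Java 5 or later, so they are not available in the early releases mentioned above.

```java
final class JavaUnicodeBasics {
    // Number of 16-bit char values Java needs to store one code point.
    static int unitsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        // Greek small letter beta, written in source code as a Unicode escape.
        char beta = '\u03B2';
        System.out.println((int) beta);           // prints "946"
        System.out.println(Character.SIZE);       // prints "16" (bits per char)

        // Mathematical bold capital A lies above the 16-bit range,
        // so Java stores it as a pair of char values.
        String boldA = new String(Character.toChars(0x1D400));
        System.out.println(boldA.length());                           // prints "2"
        System.out.println(boldA.codePointCount(0, boldA.length()));  // prints "1"
    }
}
```

The distinction matters whenever a program counts or indexes characters: `length()` counts 16-bit units, while `codePointCount` counts Unicode characters.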

Unicode itself orders characters by code value; reference tables, however, often arrange them in alphabetical order of their entity reference names. You can use a character directly or through an entity reference. For example, you can copy and paste a character such as “β” directly, or you can refer to it by its entity reference, “&amp;beta;”, which is written as an ampersand, the name beta, and a semicolon.
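The relationship between a character and its references can be sketched in Java as follows. This is an illustration only: the class and its methods are hypothetical, and the named lookup knows a single entity (beta) rather than the full HTML entity table.

```java
final class EntityReferences {
    // Decimal numeric character reference, built from the code value.
    static String decimalReference(int codePoint) {
        return "&#" + codePoint + ";";
    }

    // Hexadecimal numeric character reference for the same code value.
    static String hexReference(int codePoint) {
        return "&#x" + Integer.toHexString(codePoint).toUpperCase() + ";";
    }

    // Named reference where this toy table has one; otherwise falls back
    // to the decimal form. Only beta is known here (illustrative).
    static String namedReference(int codePoint) {
        if (codePoint == 0x03B2) {
            return "&beta;";
        }
        return decimalReference(codePoint);
    }

    public static void main(String[] args) {
        int beta = "\u03B2".codePointAt(0);
        System.out.println(decimalReference(beta)); // prints "&#946;"
        System.out.println(hexReference(beta));     // prints "&#x3B2;"
        System.out.println(namedReference(beta));   // prints "&beta;"
    }
}
```

All three references denote the same character; the numeric forms work for any code value, while named forms exist only for characters that have been given an entity name.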
