Name: | Bamum Letter Phase-A Unknown |
---|---|
Combining Class: | Not Reordered (0) |
Character is Mirrored: | No |
HTML Entity: | &#92228; &#x16844; |
UTF-8 Encoding: | 0xF0 0x96 0xA1 0x84 |
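
As a quick check, the UTF-8 bytes in the table can be decoded back to a code point and re-encoded. This is a minimal sketch: it assumes the character is U+16844, which is what the four listed bytes decode to, and the variable names are illustrative only. The name and property lookups depend on the Unicode database bundled with the Python version in use.

```python
import unicodedata

# Assumption: the character described in the table is the one that
# the listed UTF-8 bytes 0xF0 0x96 0xA1 0x84 decode to.
ch = bytes([0xF0, 0x96, 0xA1, 0x84]).decode("utf-8")

code_point = ord(ch)
utf8_bytes = ch.encode("utf-8")

print(f"U+{code_point:04X}")                        # code point in hexadecimal
print(" ".join(f"0x{b:02X}" for b in utf8_bytes))   # the UTF-8 bytes again
print(unicodedata.name(ch, "<name not available>"))  # character name, if known
print(unicodedata.combining(ch))                    # canonical combining class
print(unicodedata.mirrored(ch))                     # 1 if mirrored, otherwise 0
```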
What is Unicode encoding?
Unicode is a computing standard for the consistent encoding of symbols. It was first published in 1991. At its core it is a table that maps each symbol to a numbered position. An encoding takes a symbol's position from the table and tells the font which glyph to paint. A computer, however, understands only binary code.
What is encoding in computing?
An encoding is a table that maps each glyph to a position. It takes a symbol from the table and tells the font which glyph to paint. Because a computer understands only binary code, an encoding represents each character as a sequence of 1s and 0s.
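To make the "1s and 0s" concrete, the sketch below prints the bits that UTF-8 uses for two characters. The helper name `to_bits` is invented for this illustration, and UTF-8 is only one of several Unicode encodings.

```python
def to_bits(text: str, encoding: str = "utf-8") -> str:
    """Return the encoded bytes of `text` as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode(encoding))

print(to_bits("A"))        # 01000001
print(to_bits("\u00e9"))   # 11000011 10101001  (the letter é, two bytes in UTF-8)
```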
Latin script
The Unicode Standard (version 14.0) classifies 1,475 characters as belonging to the Latin script.
Phonetic scripts
96 characters; all belong to the Latin script; three in the MES-2 subset. For the rest, see IPA Extensions (Unicode block).
Overview
As of Unicode version 14.0, there are 144,697 characters with code points, covering 159 modern and historical scripts, as well as multiple symbol sets. As it is not technically possible to list all of these characters in a single Wikipedia page, this list is limited to a subset of the most important characters for English-language readers, with links to other pages which list the supplementary chara…
Character reference overview
HTML and XML provide ways to reference Unicode characters when the characters themselves either cannot or should not be used. A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and a character entity reference refers to a character by a predefined name.
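A numeric character reference uses the format &#nnnn; or &#xhhhh;, where nnnn is the code point in decimal form and hhhh is the code point in hexadecimal form.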
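The decimal and hexadecimal forms of a numeric character reference can be generated from a code point as in the sketch below. The function name `numeric_refs` is hypothetical; `html.unescape` is the Python standard-library call that resolves both numeric and named references.

```python
import html
from typing import Tuple

def numeric_refs(ch: str) -> Tuple[str, str]:
    """Return the decimal and hexadecimal numeric character references for `ch`."""
    cp = ord(ch)
    return f"&#{cp};", f"&#x{cp:X};"

dec_ref, hex_ref = numeric_refs("\U00016844")
print(dec_ref, hex_ref)          # &#92228; &#x16844;

# Both forms resolve to the same character.
print(html.unescape(dec_ref) == html.unescape(hex_ref))   # True

# A character entity reference uses a predefined name instead of a number.
print(html.unescape("&amp;"))    # &
```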
Control codes
65 characters, including DEL. All belong to the common script.
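The figure of 65 can be reproduced by counting the code points whose general category is "Cc" (control). This is a rough sketch using Python's bundled Unicode database; it does not check the script property, which the standard library does not expose.

```python
import unicodedata

# Collect every code point whose general category is "Cc" (control).
controls = [cp for cp in range(0x110000)
            if unicodedata.category(chr(cp)) == "Cc"]

print(len(controls))       # 65
print(0x7F in controls)    # True: DEL (U+007F) is one of them
```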
Footnotes:
Control-C has typically been used as a "break" or "interrupt" key. Control-D has been used to signal "end of file" for text typed in at the terminal on Unix / Linux systems. Windows, DOS, and older minicomputers used Control-Z for this purpose. Control-G is an artifact of the days when teletyp…
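The control keys mentioned above correspond to specific control codes. Under the usual ASCII convention, holding Control keeps only the low five bits of the letter's code; the helper `ctrl` below is a hypothetical name used to illustrate that convention, not an API of any terminal library.

```python
def ctrl(letter: str) -> int:
    """Code of the control character conventionally produced by Control+letter."""
    return ord(letter.upper()) & 0x1F

for key in "CDZG":
    print(f"Control-{key} -> U+{ctrl(key):04X}")
# Control-C -> U+0003
# Control-D -> U+0004
# Control-Z -> U+001A
# Control-G -> U+0007
```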
Combining Marks
• Combining Diacritical Marks (Unicode block)
• Combining Diacritical Marks Extended (Unicode block)
• Combining Half Marks (Unicode block)
• Combining Diacritical Marks Supplement (Unicode block)
Armenian
• Armenian (Unicode block)
Semitic languages
• Arabic script in Unicode, including the Persian alphabet, Jawi alphabet and others
• Unicode and HTML for the Hebrew alphabet
• Mandaic (Unicode block)
• Samaritan (Unicode block)
Thaana
• Thaana (Unicode block)
Brahmic (Indic) scripts
The range from U+0900 to U+0DFF includes Devanagari, Bengali script, Gurmukhi, Gujarati script, Odia alphabet, Tamil script, Telugu script, Kannada script, Malayalam script, and Sinhala script.
• Devanagari in Unicode
• Bengali (Unicode block)
• Gurmukhi (Unicode block)