Arabic script in Unicode

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by UsmanullahPK (talk | contribs) at 23:46, 14 February 2013 (Code blocks). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

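As of Unicode 6.1, the Arabic script is contained in the following blocks:

  • Arabic (U+0600–U+06FF)
  • Arabic Supplement (U+0750–U+077F)
  • Arabic Extended-A (U+08A0–U+08FF)
  • Arabic Presentation Forms-A (U+FB50–U+FDFF)
  • Arabic Presentation Forms-B (U+FE70–U+FEFF)
  • Rumi Numeral Symbols (U+10E60–U+10E7F)
  • Arabic Mathematical Alphabetic Symbols (U+1EE00–U+1EEFF)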

The basic Arabic range encodes the standard letters, the most common diacritics and the Arabic-Indic digits; it does not encode contextual forms. The range U+0621–U+0652 is based directly on ISO 8859-6. The Arabic Supplement range encodes letter variants mostly used for writing African (non-Arabic) languages. The Arabic Extended-A range encodes additional Qur'anic annotations and letter variants used for various non-Arabic languages. The Arabic Presentation Forms-A range encodes contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The Arabic Presentation Forms-B range encodes spacing forms of Arabic diacritics and further contextual letter forms. The presentation forms are present only for compatibility with older standards and are not needed for encoding text.[3] The Arabic Mathematical Alphabetic Symbols block encodes characters used in Arabic mathematical expressions.
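
As an illustration of how these blocks partition the code space, the following minimal Python sketch reports which Arabic-script block a character falls in. The ranges are read off the code charts later in this article; the names `ARABIC_BLOCKS` and `arabic_block` are invented for this example and are not part of any library.

```python
# Block ranges taken from the code charts in this article.
# ARABIC_BLOCKS and arabic_block are illustrative names, not a library API.
ARABIC_BLOCKS = {
    "Arabic": (0x0600, 0x06FF),
    "Arabic Supplement": (0x0750, 0x077F),
    "Arabic Extended-A": (0x08A0, 0x08FF),
    "Arabic Presentation Forms-A": (0xFB50, 0xFDFF),
    "Arabic Presentation Forms-B": (0xFE70, 0xFEFF),
    "Rumi Numeral Symbols": (0x10E60, 0x10E7F),
    "Arabic Mathematical Alphabetic Symbols": (0x1EE00, 0x1EEFF),
}


def arabic_block(ch):
    """Return the name of the Arabic-script block containing ch, or None."""
    cp = ord(ch)
    for name, (first, last) in ARABIC_BLOCKS.items():
        if first <= cp <= last:
            return name
    return None


if __name__ == "__main__":
    for sample in ("\u0628", "\uFE91", "\uFDF2", "A"):
        print(f"U+{ord(sample):04X}", arabic_block(sample))
```

A range lookup of this kind only says where a code point lives; it does not say whether the code point is actually assigned.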

Contextual forms

The contextual forms of the basic alphabet used in Modern Standard Arabic:

General    Isolated  End      Middle   Beginning  Unicode name
U+0623 أ   U+FE83    U+FE84   -        -          ARABIC LETTER ALEF WITH HAMZA ABOVE
U+0628 ب   U+FE8F    U+FE90   U+FE92   U+FE91     ARABIC LETTER BEH
U+062A ت   U+FE95    U+FE96   U+FE98   U+FE97     ARABIC LETTER TEH
U+062B ث   U+FE99    U+FE9A   U+FE9C   U+FE9B     ARABIC LETTER THEH
U+062C ج   U+FE9D    U+FE9E   U+FEA0   U+FE9F     ARABIC LETTER JEEM
U+062D ح   U+FEA1    U+FEA2   U+FEA4   U+FEA3     ARABIC LETTER HAH
U+062E خ   U+FEA5    U+FEA6   U+FEA8   U+FEA7     ARABIC LETTER KHAH
U+062F د   U+FEA9    U+FEAA   -        -          ARABIC LETTER DAL
U+0630 ذ   U+FEAB    U+FEAC   -        -          ARABIC LETTER THAL
U+0631 ر   U+FEAD    U+FEAE   -        -          ARABIC LETTER REH
U+0632 ز   U+FEAF    U+FEB0   -        -          ARABIC LETTER ZAIN
U+0633 س   U+FEB1    U+FEB2   U+FEB4   U+FEB3     ARABIC LETTER SEEN
U+0634 ش   U+FEB5    U+FEB6   U+FEB8   U+FEB7     ARABIC LETTER SHEEN
U+0635 ص   U+FEB9    U+FEBA   U+FEBC   U+FEBB     ARABIC LETTER SAD
U+0636 ض   U+FEBD    U+FEBE   U+FEC0   U+FEBF     ARABIC LETTER DAD
U+0637 ط   U+FEC1    U+FEC2   U+FEC4   U+FEC3     ARABIC LETTER TAH
U+0638 ظ   U+FEC5    U+FEC6   U+FEC8   U+FEC7     ARABIC LETTER ZAH
U+0639 ع   U+FEC9    U+FECA   U+FECC   U+FECB     ARABIC LETTER AIN
U+063A غ   U+FECD    U+FECE   U+FED0   U+FECF     ARABIC LETTER GHAIN
U+0641 ف   U+FED1    U+FED2   U+FED4   U+FED3     ARABIC LETTER FEH
U+0642 ق   U+FED5    U+FED6   U+FED8   U+FED7     ARABIC LETTER QAF
U+0643 ك   U+FED9    U+FEDA   U+FEDC   U+FEDB     ARABIC LETTER KAF
U+0644 ل   U+FEDD    U+FEDE   U+FEE0   U+FEDF     ARABIC LETTER LAM
U+0645 م   U+FEE1    U+FEE2   U+FEE4   U+FEE3     ARABIC LETTER MEEM
U+0646 ن   U+FEE5    U+FEE6   U+FEE8   U+FEE7     ARABIC LETTER NOON
U+0647 ه   U+FEE9    U+FEEA   U+FEEC   U+FEEB     ARABIC LETTER HEH
U+0648 و   U+FEED    U+FEEE   -        -          ARABIC LETTER WAW
U+064A ي   U+FEF1    U+FEF2   U+FEF4   U+FEF3     ARABIC LETTER YEH
U+0622 آ   U+FE81    U+FE82   -        -          ARABIC LETTER ALEF WITH MADDA ABOVE
U+0629 ة   U+FE93    U+FE94   -        -          ARABIC LETTER TEH MARBUTA
U+0649 ى   U+FEEF    U+FEF0   -        -          ARABIC LETTER ALEF MAKSURA
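
The relationship between a presentation form and its general letter is recorded in the Unicode Character Database as a compatibility decomposition tagged `<isolated>`, `<final>`, `<medial>` or `<initial>`. The sketch below reads that tag with Python's standard `unicodedata` module; the helper name `contextual_form_info` is invented for this example. Letters that list only an isolated and an end form are those that do not join to the following letter.

```python
import unicodedata

# Tags used by the Unicode Character Database for Arabic contextual forms.
FORM_TAGS = ("<isolated>", "<final>", "<medial>", "<initial>")


def contextual_form_info(ch):
    """Return (form, base_letters) for a presentation form, else None.

    contextual_form_info is a hypothetical helper written for this example.
    """
    decomp = unicodedata.decomposition(ch)  # e.g. "<initial> 0628"
    if not decomp.startswith("<"):
        return None  # no compatibility decomposition (e.g. a general letter)
    tag, *codes = decomp.split()
    if tag not in FORM_TAGS:
        return None
    base = "".join(chr(int(code, 16)) for code in codes)
    return tag.strip("<>"), base


if __name__ == "__main__":
    for cp in (0xFE8F, 0xFE90, 0xFE92, 0xFE91):
        form, base = contextual_form_info(chr(cp))
        print(f"U+{cp:04X} is the {form} form of U+{ord(base):04X}")
```

In the table's terms, "End" corresponds to the `<final>` tag, "Middle" to `<medial>` and "Beginning" to `<initial>`.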

Punctuation and ornaments

Only the Arabic question mark ⟨؟⟩ and the Arabic comma ⟨،⟩ are used in regular Arabic script typing. The Arabic comma is sometimes replaced by the comma used in Latin-based scripts, U+002C.

  • U+060C ، ARABIC COMMA
  • U+060D ؍ ARABIC DATE SEPARATOR
  • U+060E ؎ ARABIC POETIC VERSE SIGN
  • U+060F ؏ ARABIC SIGN MISRA
  • U+061F ؟ ARABIC QUESTION MARK
  • U+066D ٭ ARABIC FIVE POINTED STAR
  • U+06DD ۝ ARABIC END OF AYAH
  • U+06DE ۞ ARABIC START OF RUB EL HIZB
  • U+06E9 ۩ ARABIC PLACE OF SAJDAH
  • U+FD3E ﴾ ORNATE LEFT PARENTHESIS
  • U+FD3F ﴿ ORNATE RIGHT PARENTHESIS
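
Because a Latin comma may appear where an Arabic comma is intended, text processing sometimes maps one onto the other. The following is a minimal sketch of such a mapping for the two marks used in regular typing; the names `TO_ARABIC_PUNCT` and `arabize_punctuation` are invented here, and whether such a replacement is appropriate depends on the text being processed.

```python
import unicodedata

# Hypothetical mapping for this example: Latin comma and question mark
# to their Arabic counterparts (U+060C and U+061F).
TO_ARABIC_PUNCT = str.maketrans({",": "\u060C", "?": "\u061F"})


def arabize_punctuation(text):
    """Replace ',' and '?' with ARABIC COMMA and ARABIC QUESTION MARK."""
    return text.translate(TO_ARABIC_PUNCT)


if __name__ == "__main__":
    converted = arabize_punctuation("\u0646\u0639\u0645, \u0644\u0627?")
    for ch in converted:
        if unicodedata.category(ch).startswith("P"):
            print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```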

Word ligatures

Arabic Presentation Forms-A has a few characters defined as "word ligatures" for terms frequently used in formulaic expressions in Arabic. They are rarely used outside professional liturgical typesetting. The rial is likewise normally written out in full rather than with the ligature.

  • U+FDF0 ARABIC LIGATURE SALLA USED AS KORANIC STOP SIGN ISOLATED FORM (صلے)
  • U+FDF1 ARABIC LIGATURE QALA USED AS KORANIC STOP SIGN ISOLATED FORM (قلے)
  • U+FDF2 ARABIC LIGATURE ALLAH ISOLATED FORM (الله)
  • U+FDF3 ARABIC LIGATURE AKBAR ISOLATED FORM (اكبر)
  • U+FDF4 ARABIC LIGATURE MOHAMMAD ISOLATED FORM (محمد)
  • U+FDF5 ARABIC LIGATURE SALAM ISOLATED FORM (صلعم "peace be upon him")
  • U+FDF6 ARABIC LIGATURE RASOUL ISOLATED FORM (رسول)
  • U+FDF7 ARABIC LIGATURE ALAYHE ISOLATED FORM (عليه)
  • U+FDF8 ARABIC LIGATURE WASALLAM ISOLATED FORM (وسلم)
  • U+FDF9 ARABIC LIGATURE SALLA ISOLATED FORM
  • U+FDFA ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM (صلى الله عليه وسلم "peace be upon him")
  • U+FDFB ARABIC LIGATURE JALLAJALALOUHOU (جل جلاله)
  • U+FDFC RIAL SIGN (ريال)
  • U+FDFD ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM (the Basmala)
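
Several of these word ligatures carry a compatibility decomposition, so normalizing to NFKC spells them out as ordinary letters. A minimal sketch using Python's standard `unicodedata` module follows; the function name `expand_word_ligature` is invented for this example, and a character without a decomposition is returned unchanged.

```python
import unicodedata


def expand_word_ligature(ch):
    """Spell out a word ligature via NFKC (illustrative helper name)."""
    return unicodedata.normalize("NFKC", ch)


if __name__ == "__main__":
    for cp in (0xFDF2, 0xFDFC):
        expanded = expand_word_ligature(chr(cp))
        codes = " ".join(f"U+{ord(c):04X}" for c in expanded)
        print(f"U+{cp:04X} {unicodedata.name(chr(cp))} -> {codes}")
```

Note that the decomposition of RIAL SIGN uses U+06CC (ARABIC LETTER FARSI YEH) rather than U+064A.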

Code blocks

Note: The National Arabic phonetic alphabets (NAPA) are taken from the following code blocks.

Arabic

Arabic[1][2]
Official Unicode Consortium code chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+060x  ؀   ؁   ؂   ؃   ؄   ؅  ؆ ؇ ؈ ؉ ؊ ؋ ، ؍ ؎ ؏
U+061x ؐ ؑ ؒ ؓ ؔ ؕ ؖ ؗ ؘ ؙ ؚ ؛  ALM  ؝ ؞ ؟
U+062x ؠ ء آ أ ؤ إ ئ ا ب ة ت ث ج ح خ د
U+063x ذ ر ز س ش ص ض ط ظ ع غ ػ ؼ ؽ ؾ ؿ
U+064x ـ ف ق ك ل م ن ه و ى ي ً ٌ ٍ َ ُ
U+065x ِ ّ ْ ٓ ٔ ٕ ٖ ٗ ٘ ٙ ٚ ٛ ٜ ٝ ٞ ٟ
U+066x ٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩ ٪ ٫ ٬ ٭ ٮ ٯ
U+067x ٰ ٱ ٲ ٳ ٴ ٵ ٶ ٷ ٸ ٹ ٺ ٻ ټ ٽ پ ٿ
U+068x ڀ ځ ڂ ڃ ڄ څ چ ڇ ڈ ډ ڊ ڋ ڌ ڍ ڎ ڏ
U+069x ڐ ڑ ڒ ړ ڔ ڕ ږ ڗ ژ ڙ ښ ڛ ڜ ڝ ڞ ڟ
U+06Ax ڠ ڡ ڢ ڣ ڤ ڥ ڦ ڧ ڨ ک ڪ ګ ڬ ڭ ڮ گ
U+06Bx ڰ ڱ ڲ ڳ ڴ ڵ ڶ ڷ ڸ ڹ ں ڻ ڼ ڽ ھ ڿ
U+06Cx ۀ ہ ۂ ۃ ۄ ۅ ۆ ۇ ۈ ۉ ۊ ۋ ی ۍ ێ ۏ
U+06Dx ې ۑ ے ۓ ۔ ە ۖ ۗ ۘ ۙ ۚ ۛ ۜ  ۝  ۞ ۟
U+06Ex ۠ ۡ ۢ ۣ ۤ ۥ ۦ ۧ ۨ ۩ ۪ ۫ ۬ ۭ ۮ ۯ
U+06Fx ۰ ۱ ۲ ۳ ۴ ۵ ۶ ۷ ۸ ۹ ۺ ۻ ۼ ۽ ۾ ۿ
Notes
1.^ As of Unicode version 16.0
2.^ Unicode code point U+0673 is deprecated as of Unicode version 6.0

Arabic Supplement

Arabic Supplement[1]
Official Unicode Consortium code chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+075x ݐ ݑ ݒ ݓ ݔ ݕ ݖ ݗ ݘ ݙ ݚ ݛ ݜ ݝ ݞ ݟ
U+076x ݠ ݡ ݢ ݣ ݤ ݥ ݦ ݧ ݨ ݩ ݪ ݫ ݬ ݭ ݮ ݯ
U+077x ݰ ݱ ݲ ݳ ݴ ݵ ݶ ݷ ݸ ݹ ݺ ݻ ݼ ݽ ݾ ݿ
Notes
1.^ As of Unicode version 16.0

Arabic Extended-A

Arabic Extended-A[1]
Official Unicode Consortium code chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+08Ax
U+08Bx
U+08Cx
U+08Dx
U+08Ex  ࣢ 
U+08Fx
Notes
1.^ As of Unicode version 16.0

Arabic Presentation Forms-A

Most characters in this block are ligatures that can be composed from characters in the preceding charts; some of them are ligatures of common liturgical phrases. The exceptions are the bracket-like graphemes ﴾ ﴿.

Arabic Presentation Forms-A[1][2][3]
Official Unicode Consortium code chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+FB5x
U+FB6x
U+FB7x ﭿ
U+FB8x
U+FB9x
U+FBAx
U+FBBx ﮿
U+FBCx
U+FBDx
U+FBEx
U+FBFx ﯿ
U+FC0x
U+FC1x
U+FC2x
U+FC3x ﰿ
U+FC4x
U+FC5x
U+FC6x
U+FC7x ﱿ
U+FC8x
U+FC9x
U+FCAx
U+FCBx ﲿ
U+FCCx
U+FCDx
U+FCEx
U+FCFx ﳿ
U+FD0x
U+FD1x
U+FD2x
U+FD3x ﴿
U+FD4x
U+FD5x
U+FD6x
U+FD7x ﵿ
U+FD8x
U+FD9x
U+FDAx
U+FDBx ﶿ
U+FDCx
U+FDDx
U+FDEx
U+FDFx ﷿
Notes
1.^ As of Unicode version 16.0
2.^ Grey areas indicate non-assigned code points
3.^ Black areas indicate noncharacters (code points that are guaranteed never to be assigned as encoded characters in the Unicode Standard)
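
The noncharacters mentioned in note 3 occupy U+FDD0–U+FDEF inside this block. Unicode also reserves the last two code points of every plane (those ending in FFFE and FFFF) as noncharacters. The sketch below tests for both cases; the function name `is_noncharacter` is invented for this example.

```python
def is_noncharacter(cp):
    """Return True if code point cp is a Unicode noncharacter.

    is_noncharacter is an illustrative name, not a standard-library function.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    in_arabic_gap = 0xFDD0 <= cp <= 0xFDEF      # inside Presentation Forms-A
    plane_end = (cp & 0xFFFE) == 0xFFFE         # xxFFFE and xxFFFF
    return in_arabic_gap or plane_end


if __name__ == "__main__":
    for cp in (0xFDCF, 0xFDD0, 0xFDEF, 0xFDF0, 0xFFFF):
        print(f"U+{cp:04X}", is_noncharacter(cp))
```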

Arabic Presentation Forms-B

All characters in this block can be composed from characters in the basic Arabic chart.

Arabic Presentation Forms-B[1][2]
Official Unicode Consortium code chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+FE7x ﹿ
U+FE8x
U+FE9x
U+FEAx
U+FEBx ﺿ
U+FECx
U+FEDx
U+FEEx
U+FEFx ZW NBSP
Notes
1.^ As of Unicode version 16.0
2.^ Grey areas indicate non-assigned code points

Rumi Numeral Symbols

Rumi Numeral Symbols[1][2]
Official Unicode Consortium code chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+10E6x 𐹠 𐹡 𐹢 𐹣 𐹤 𐹥 𐹦 𐹧 𐹨 𐹩 𐹪 𐹫 𐹬 𐹭 𐹮 𐹯
U+10E7x 𐹰 𐹱 𐹲 𐹳 𐹴 𐹵 𐹶 𐹷 𐹸 𐹹 𐹺 𐹻 𐹼 𐹽 𐹾
Notes
1.^ As of Unicode version 16.0
2.^ Grey area indicates non-assigned code point

Arabic Mathematical Alphabetic Symbols

Arabic Mathematical Alphabetic Symbols[1][2]
Official Unicode Consortium code chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+1EE0x 𞸀 𞸁 𞸂 𞸃 𞸅 𞸆 𞸇 𞸈 𞸉 𞸊 𞸋 𞸌 𞸍 𞸎 𞸏
U+1EE1x 𞸐 𞸑 𞸒 𞸓 𞸔 𞸕 𞸖 𞸗 𞸘 𞸙 𞸚 𞸛 𞸜 𞸝 𞸞 𞸟
U+1EE2x 𞸡 𞸢 𞸤 𞸧 𞸩 𞸪 𞸫 𞸬 𞸭 𞸮 𞸯
U+1EE3x 𞸰 𞸱 𞸲 𞸴 𞸵 𞸶 𞸷 𞸹 𞸻
U+1EE4x 𞹂 𞹇 𞹉 𞹋 𞹍 𞹎 𞹏
U+1EE5x 𞹑 𞹒 𞹔 𞹗 𞹙 𞹛 𞹝 𞹟
U+1EE6x 𞹡 𞹢 𞹤 𞹧 𞹨 𞹩 𞹪 𞹬 𞹭 𞹮 𞹯
U+1EE7x 𞹰 𞹱 𞹲 𞹴 𞹵 𞹶 𞹷 𞹹 𞹺 𞹻 𞹼 𞹾
U+1EE8x 𞺀 𞺁 𞺂 𞺃 𞺄 𞺅 𞺆 𞺇 𞺈 𞺉 𞺋 𞺌 𞺍 𞺎 𞺏
U+1EE9x 𞺐 𞺑 𞺒 𞺓 𞺔 𞺕 𞺖 𞺗 𞺘 𞺙 𞺚 𞺛
U+1EEAx 𞺡 𞺢 𞺣 𞺥 𞺦 𞺧 𞺨 𞺩 𞺫 𞺬 𞺭 𞺮 𞺯
U+1EEBx 𞺰 𞺱 𞺲 𞺳 𞺴 𞺵 𞺶 𞺷 𞺸 𞺹 𞺺 𞺻
U+1EECx
U+1EEDx
U+1EEEx
U+1EEFx 𞻰 𞻱
Notes
1.^ As of Unicode version 16.0
2.^ Grey areas indicate non-assigned code points
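
The non-assigned code points in a chart such as this one can be found programmatically. The sketch below treats a code point as assigned when Python's `unicodedata` module knows a name for it; this is a simplification (control characters, for example, have no name) but is adequate for this block. The result depends on the Unicode version bundled with the Python interpreter, and `assigned_in_range` is a name invented for this example.

```python
import unicodedata


def assigned_in_range(first, last):
    """List code points in [first, last] that have a character name.

    Illustrative helper; the result reflects the Unicode version that
    ships with the running Python interpreter.
    """
    return [
        cp
        for cp in range(first, last + 1)
        if unicodedata.name(chr(cp), None) is not None
    ]


if __name__ == "__main__":
    row = assigned_in_range(0x1EE00, 0x1EE0F)
    print(" ".join(f"{cp:X}" for cp in row))
```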

References

  1. ^ Unicode v6.1 (UAX#41): Scripts
  2. ^ "Arabic Mathematical Alphabetic Symbols" (PDF). 2012-02-01.
  3. ^ The Unicode Consortium. The Unicode Standard, Version 6.0.0, (Mountain View, CA: The Unicode Consortium, 2011. ISBN 978-1-936213-01-6), Chapter 8
  • Oibane. "Unicode problems". Arabic on Linux. Archived from the original on 2008-12-07.