Arabic Script Beyond Urdu: Persian, Pashto, and Sindhi

Linguistics • 7 min read

Urdu is far from the only language to adopt the Arabic script for its own purposes. Persian, Pashto, Sindhi, Kurdish, and several other languages across a wide belt from the eastern Mediterranean to South Asia are written in Arabic-derived alphabets, each extended with additional letters for sounds that Arabic itself never needed to represent. Understanding these adaptations explains much of why Urdu looks the way it does, and it corrects the common assumption that "Arabic script" means a single, uniform system.

Persian: The Closest Relative

Persian (Farsi) adopted the Arabic script centuries before Urdu existed as a distinct language, and Urdu in turn borrowed enormous amounts of vocabulary, poetic convention, and even its preferred Nastaliq calligraphy style directly from the Persian literary tradition. Persian added four extra letters beyond the standard Arabic alphabet: پ (pe), چ (che), ژ (zhe), and گ (gaf). All four are letters Urdu also uses, having inherited them through this Persian literary influence rather than developing them independently. If you compare a page of Persian poetry to a page of Urdu poetry in Nastaliq, the resemblance is immediately obvious; the major visible difference for a casual observer is usually vocabulary and a handful of additional Urdu-specific letters that Persian doesn't have at all.

Where Urdu Diverges From Persian

Urdu's extra letters beyond what it shares with Persian, most notably ٹ (tte), ڈ (ddal), ڑ (rre), and ں (noon ghunna), exist to represent the retroflex consonants and vowel nasalization that are common in the Indo-Aryan languages of South Asia but absent from both Persian and Arabic. Urdu also uses ھ (do chashmi he) to mark aspirated consonants, another characteristically South Asian feature. These sounds reflect the Indo-Aryan grammar and phonology that Urdu shares with Hindi, even though its script and a large portion of its formal vocabulary come from Persian and Arabic. This dual heritage, an Indo-Aryan grammar and sound system paired with Perso-Arabic script and literary vocabulary, is part of what makes Urdu's linguistic identity distinct from both its script-donor languages and its grammatical relatives.

Pashto: A Heavier Modification

Pashto, spoken across Afghanistan and northwestern Pakistan, modifies the Arabic script more extensively than Urdu does. Beyond adopting the Persian additions, Pashto introduces further letters of its own that do not appear in Urdu's alphabet at all. Its retroflex consonants, for example, are written ټ, ډ, ړ, and ڼ, with a small ring attached to the base letter, where Urdu writes its retroflexes as ٹ, ڈ, and ڑ. Pashto also adds letters such as څ and ځ for affricates that neither Persian nor Urdu has. Because Pashto's sound system differs more from Arabic and Persian than Urdu's does, its written form required correspondingly more invention, resulting in an alphabet that, while clearly related to Urdu's, is neither a subset nor a superset of it.

Sindhi: The Most Extensively Adapted

Sindhi, spoken primarily in the Sindh province of Pakistan, has perhaps the most heavily modified Arabic-derived script among major South Asian languages. Sindhi has a set of implosive consonants, sounds produced with a brief inward movement of air, and these needed new letterforms not borrowed from Persian, Pashto, or any other established Arabic-script tradition: ٻ, ڄ, ڏ, and ڳ. As a result, Sindhi's standard alphabet has 52 letters, noticeably more than Urdu's, reflecting a richer consonant inventory that the base Arabic script was never designed to capture.

A Practical Takeaway for Designers and Developers

If you're building something that needs to support multiple Arabic-script languages, not just Urdu, it's worth explicitly testing each language's extra characters rather than assuming "Arabic script support" in a font or input system automatically covers all of them. A font that handles Urdu beautifully may still be missing Sindhi's implosive consonant letters, or may render Pashto's specific additions incorrectly. Each of these languages, despite sharing a common script ancestor, has its own complete and specific character requirements.
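As a rough illustration of that kind of test, here is a minimal Python sketch. It assumes you already have the set of Unicode code points a font covers (normally read from the font's character map). The names `EXTRA_LETTERS`, `missing_letters`, and `hypothetical_font` are made up for this example, and the letter lists are deliberately incomplete: they hold only a few of the additions discussed in this article, not each language's full alphabet.

```python
# Extra letters each language adds beyond the core Arabic alphabet.
# Illustrative and incomplete: a real check should use the language's
# full alphabet, not just these additions.
EXTRA_LETTERS = {
    "persian": [0x067E, 0x0686, 0x0698, 0x06AF],  # pe, che, zhe, gaf
    "urdu": [0x0679, 0x0688, 0x0691, 0x06BA],     # tte, ddal, rre, noon ghunna
    "pashto": [0x067C, 0x0689, 0x0693, 0x06BC,    # ring-marked retroflex letters
               0x069A, 0x0696, 0x0685, 0x0681],   # further Pashto-specific letters
    "sindhi": [0x067B, 0x0684, 0x068F, 0x06B3],   # the four implosive letters
}


def missing_letters(supported, language):
    """Return the language's extra letters whose code points are not in `supported`."""
    return [chr(cp) for cp in EXTRA_LETTERS[language] if cp not in supported]


# Hypothetical font: core Arabic letters plus the Persian and Urdu additions.
hypothetical_font = set(range(0x0621, 0x064B))
hypothetical_font.update(EXTRA_LETTERS["persian"], EXTRA_LETTERS["urdu"])

for language in EXTRA_LETTERS:
    gaps = missing_letters(hypothetical_font, language)
    if gaps:
        print(language, "missing:", " ".join(f"U+{ord(ch):04X}" for ch in gaps))
    else:
        print(language, "OK")
```

A check like this only confirms that a code point is present. It says nothing about whether the letter joins and shapes correctly in running text, so visual testing with real words in each language is still needed.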

Kurdish and Other Lesser-Known Adaptations

Beyond Persian, Pashto, and Sindhi, several other languages have adapted Arabic script with their own modifications, each reflecting the specific sound inventory of the language in question. Kurdish, depending on the dialect and region, has been written in both Arabic-derived and Latin-derived scripts. The Arabic-based Sorani Kurdish alphabet adds distinct vowel letters and handles vowels quite differently from Arabic: where Arabic relies heavily on optional diacritical marks for short vowels, Sorani writes most vowels as full letters within the word itself. Malay (in the Jawi script), Uyghur, and several Central Asian languages have also been written in Arabic-derived scripts. Many of these shifted partially or fully to Latin or Cyrillic writing systems during the twentieth century for historical and political reasons, though Uyghur is a notable exception, since an Arabic-based alphabet remains its main script in Xinjiang. This broader pattern, languages adapting a shared script ancestor with locally specific modifications, closely mirrors what happened with the Latin alphabet, where languages such as French, German, and Polish, and outside Europe Vietnamese, all added their own diacritics and extra letters to the same basic set of letters.

Why This History Matters for Modern Font Development

Type designers building a genuinely comprehensive Arabic-script font face a real engineering decision: how many of these regional letter sets to support, and how to prioritize limited development time. A font claiming broad "Arabic script support" might cover the core Arabic alphabet plus Persian's four additions, since Persian has a large speaker base and significant publishing demand, while still lacking Sindhi's implosive consonants or certain Kurdish vowel letters, simply because supporting every regional variant is a substantial undertaking. This is why, when evaluating a font for a specific project, checking the actual glyph coverage for your target language's complete alphabet matters more than trusting a general "supports Arabic script" label. That label can mean quite different things depending on which regional traditions the font's creator had in mind.
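One way to make a coverage claim concrete is to report, per letter set, what fraction of the letters a font actually contains. The sketch below is only an illustration under stated assumptions: `coverage_report`, `LETTER_SETS`, and `hypothetical_font` are names invented here, the code points listed are a small sample of each tradition's additions, and the "font" is a hand-built set of code points, not data from any real typeface.

```python
def coverage_report(supported, letter_sets):
    """Map each letter-set name to the fraction of its code points found in `supported`."""
    report = {}
    for name, codepoints in letter_sets.items():
        covered = sum(1 for cp in codepoints if cp in supported)
        report[name] = covered / len(codepoints)
    return report


# Small illustrative samples, not complete alphabets.
LETTER_SETS = {
    "persian additions": [0x067E, 0x0686, 0x0698, 0x06AF],
    "sindhi implosives": [0x067B, 0x0684, 0x068F, 0x06B3],
    "sorani vowel letters": [0x06C6, 0x06CE, 0x06D5],
}

# Hypothetical font: core Arabic letters, Persian's four additions,
# and just one of the three Sorani vowel letters sampled above.
hypothetical_font = set(range(0x0621, 0x064B)) | {0x067E, 0x0686, 0x0698, 0x06AF, 0x06D5}

report = coverage_report(hypothetical_font, LETTER_SETS)

# List the weakest coverage first, since those gaps matter most.
for name, ratio in sorted(report.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.0%}")
```

Reporting a ratio per letter set, not a single yes-or-no answer, makes it easier to see when a font is one or two glyphs short of supporting a language and when it does not target that language at all.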

Explore Urdu's Own Letters

If you want to see exactly which letters Urdu added on top of the base Arabic alphabet, and which it shares with Persian, our Urdu Alphabet Explorer lists every letter individually with notes on which sounds are specific to Urdu and South Asian pronunciation.
