Number of Speakers: 60,503,579 (Gordon 2005)

Key Dialects: Standard Urdu, Dakhini Urdu, Rekhta Urdu

Geographical Center: Pakistan

Urdu is spoken by over 10 million people in Pakistan and 48 million people in India. Outside these countries, Urdu is spoken in Afghanistan, Bahrain, Bangladesh, Botswana, Fiji, Germany, Guyana, Malawi, Nepal, Norway, Oman, Qatar, Saudi Arabia, Thailand, United Arab Emirates, United Kingdom, and Zambia. Sizable Urdu-speaking communities are found in Mauritius, where 64,000 people speak the language, and in South Africa, where 170,000 people speak it. Although Urdu is considered a distinct language from Hindi, the two are mutually intelligible, so much so that their grammars are virtually indistinguishable. The key properties distinguishing Urdu and Hindi are their vocabularies and orthographies. The Urdu vocabulary borrows heavily from Arabic and Persian, whereas Hindi borrows considerably from Sanskrit. Orthographically, Urdu is written in a modified Perso-Arabic script, while Hindi is written in the Devanagari script.

Urdu is a Central Zone language of the Indo-Aryan branch of the Indo-European language family.

All dialects of Urdu are mutually intelligible. For the most part, they are also intelligible to speakers of Hindi. Dakhini Urdu has fewer Persian and Arabic loans than Standard Urdu. Rekhta Urdu is used strictly in poetry.

A major difference between Hindi and Urdu is that Hindi is written in the Devanagari script, while Urdu is written in a modified form of the Perso-Arabic script. The Urdu orthography consists of 35 graphemes and a number of additional symbols. To the 28-letter Arabic alphabet, seven letters have been added: the four Persian letters p, c, g, and z with a hacek, and three new letters, i.e. th, dh, and a variant of r. As in the Arabic writing system, the Urdu orthography is consonantal rather than a syllabary: the 35 graphemes are used to write the consonants, and a number of diacritics attached to the consonants are employed to represent the vowels. The form of most Urdu letters varies depending on whether the letter appears in isolation or in word-initial, medial, or final position. Following Arabic and Persian writing practices, Urdu words are written from right to left, but numerals are written from left to right.
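The letter inventory described above can be checked with a short Python sketch. The mapping from the romanized labels used in the text (p, c, g, and so on) to Unicode code points is an assumption made for illustration; the source does not name code points, and the function names are hypothetical.

```python
import unicodedata

# Figure stated in the text: the basic Arabic alphabet has 28 letters.
ARABIC_BASE_LETTERS = 28

# Assumed mapping from the text's romanized labels to Unicode code points.
# The labels come from the text; the code points are an illustrative guess
# at which letters are meant.
ADDED_LETTERS = {
    "p": "\u067e",
    "c": "\u0686",
    "g": "\u06af",
    "z with hacek": "\u0698",
    "th": "\u0679",
    "dh": "\u0688",
    "variant of r": "\u0691",
}


def describe_added_letters():
    """Return the official Unicode name of each assumed added letter."""
    return {label: unicodedata.name(ch) for label, ch in ADDED_LETTERS.items()}


def grapheme_count():
    """Base Arabic letters plus the letters added for Urdu."""
    return ARABIC_BASE_LETTERS + len(ADDED_LETTERS)


if __name__ == "__main__":
    for label, name in describe_added_letters().items():
        print(f"{label:>14}: {name}")
    print("total graphemes:", grapheme_count())
```

All seven assumed letters carry the Unicode bidirectional class used for Arabic-script letters, which is what makes rendering engines lay them out from right to left.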

Urdu punctuation differs slightly from punctuation in many Western languages. For example, in place of a period, a dash is used for a full stop. Likewise, an inverted comma is used in place of a comma, an inverted question mark is used in questions, and semicolons are turned 180 degrees. In essence, Urdu punctuation appears as though it were written upside-down.
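As a rough illustration of these correspondences, the sketch below swaps four Western marks for Arabic-script punctuation characters in Unicode. The choice of code points and the helper name `to_urdu_punctuation` are assumptions made for this example, not something the source specifies.

```python
import unicodedata

# Assumed correspondence between Western marks and Arabic-script
# punctuation characters in Unicode (illustrative, not from the source).
URDU_PUNCTUATION = {
    ".": "\u06d4",  # full stop, drawn as a short dash
    ",": "\u060c",  # inverted comma
    "?": "\u061f",  # reversed question mark
    ";": "\u061b",  # turned semicolon
}

_TABLE = str.maketrans(URDU_PUNCTUATION)


def to_urdu_punctuation(text: str) -> str:
    """Replace Western punctuation with the assumed Urdu counterparts."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    for western, urdu in URDU_PUNCTUATION.items():
        print(western, "->", unicodedata.name(urdu))
```

A real converter would also need to handle text direction and digits; this sketch only substitutes the four marks.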

Urdu and Hindi are considered different languages in a sociocultural sense. However, at a linguistic level, the two are virtually identical. Apart from the major differences in orthography and loan vocabulary mentioned previously, there are only minor differences in usage and in the pronunciation of foreign words. As a result, the grammars of the two languages are virtually indistinguishable, such that, apart from written forms of the language, it is not immediately obvious whether Hindi or Urdu is being used.

Urdu has an extensive phoneme inventory consisting of about 10 vowels and 33 consonants; the exact counts depend on the analysis and the dialect under investigation. As is common in Indic languages, a number of retroflex articulations are attested. Aspiration is phonemic (contrastive) in the language, but vowel nasalization is not. In other words, aspiration can minimally differentiate words from structurally similar words, whereas the presence or absence of vowel nasality will not distinguish pairs of words in this way. Nasalization, however, serves a morphological role in the language. That is to say, vowel nasalization in Urdu serves much the same grammatical role that prefixes and suffixes do in Hindi and other languages (i.e. it contributes meaningfully to the elements it attaches to). Main stress typically falls on the heaviest syllable of the word, that is, the syllable with the most segmental material. For instance, vowel-consonant (VC) syllables are heavier than syllables consisting solely of a single vowel (V), and vowel-consonant-consonant (VCC) syllables are heavier than VC syllables. If a word has multiple equally heavy syllables, main stress falls on the rightmost of them. The syllable structure of the language is (C)(C)V(C)(C). As such, both word-initial and word-final consonant clusters are permitted. Word-initial clusters are restricted to sequences of consonant + semivowel (glide), while word-final clusters are restricted to sequences of nasals and consonants sharing the same place of articulation.
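The syllable template and the stress rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not an analysis taken from the source: syllables are given as C/V skeleton strings, weight is counted from the vowel onward (onsets are ignored and vowel length is not modelled), and the function names are hypothetical. The restrictions on which consonants may form clusters are not checked.

```python
import re

# The template (C)(C)V(C)(C) from the text, over C/V skeleton strings.
SYLLABLE_TEMPLATE = re.compile(r"C{0,2}VC{0,2}")


def is_valid_syllable(skeleton: str) -> bool:
    """True if the skeleton fits (C)(C)V(C)(C)."""
    return SYLLABLE_TEMPLATE.fullmatch(skeleton) is not None


def weight(skeleton: str) -> int:
    """Assumed weight measure: number of segments from the vowel onward.

    This gives V < VC < VCC, matching the examples in the text.
    """
    return len(skeleton) - skeleton.index("V")


def main_stress(syllables: list[str]) -> int:
    """Return the index of the stressed syllable.

    Stress goes to the heaviest syllable; among equally heavy
    syllables, the rightmost one wins.
    """
    for syllable in syllables:
        if not is_valid_syllable(syllable):
            raise ValueError(f"not a (C)(C)V(C)(C) syllable: {syllable!r}")
    weights = [weight(s) for s in syllables]
    heaviest = max(weights)
    return max(i for i, w in enumerate(weights) if w == heaviest)


if __name__ == "__main__":
    word = ["CV", "CVC", "CV"]
    print("stressed syllable index:", main_stress(word))
```

Counting from the vowel onward is one common way to make "most segmental material" precise; counting every segment in the syllable would be an equally simple alternative and would give different results for words whose syllables differ only in their onsets.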

Urdu is a head-final SOV language. Postpositions are attested and affixation is largely suffixal, although some prefixation occurs. Adverbs typically follow the subject and precede the object(s) of the verb. Articles (determiners), adjectives, and relative clauses precede the nouns they modify. Indirect objects precede direct objects, and negative/interrogative elements precede the verb. Urdu has a split ergative case system. In the perfective aspect, an ergative case-marking pattern emerges (subjects of transitive verbs are marked differently from subjects of intransitives and direct objects of transitive verbs). In all other aspects, a nominative-accusative pattern is found (subjects of transitive and intransitive verbs are both marked with a case not found on direct objects). (For concrete examples of split-ergativity, see the linguistic sketch of the Hindi profile.) Case-marking is achieved by means of postpositions. Nouns inflect for gender (when animate) and number. Verbs inflect for mood, tense/aspect, number, and person. All inflection proceeds by way of affixation. Urdu verbs agree with their subjects and in some cases their objects, although agreement can be blocked in a number of constructions.
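The split described above can be summarized as a small decision rule. The Python sketch below is only a schematic restatement of that paragraph: the case labels ("ergative", "absolutive", "nominative", "accusative") are conventional linguistic terms chosen for illustration, the function name is hypothetical, and no actual Urdu postpositions are modelled.

```python
def case_pattern(aspect: str, transitive: bool) -> dict[str, str]:
    """Return abstract case labels for the core arguments of a clause.

    Perfective clauses follow an ergative pattern: the transitive
    subject is marked differently from the intransitive subject and
    the direct object, which pattern together. All other clauses
    follow a nominative-accusative pattern: both kinds of subject
    share a case that the direct object does not have.
    """
    if aspect == "perfective":
        if transitive:
            return {"subject": "ergative", "object": "absolutive"}
        return {"subject": "absolutive"}
    if transitive:
        return {"subject": "nominative", "object": "accusative"}
    return {"subject": "nominative"}


if __name__ == "__main__":
    for aspect in ("perfective", "imperfective"):
        for transitive in (True, False):
            print(aspect, "transitive" if transitive else "intransitive",
                  case_pattern(aspect, transitive))
```

In the perfective rows, the intransitive subject and the transitive object share a label; in the other rows, the two subjects do.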

The Urdu lexicon draws heavily on Arabic and Persian vocabulary, unlike the case of Hindi, which draws primarily on Sanskrit.

Urdu is the official language of Pakistan. As such, it is used in government, education, mass communication, trade/commerce, and in everyday communication. For those Pakistanis for whom it is not the mother tongue, Urdu is typically their second or third language. In India, Urdu is the state language of Jammu and Kashmir, where it is used in government and is the medium of instruction in primary schools. Within India, Muslims typically speak Urdu, whereas Hindus typically speak Hindi.

Urdu is regarded as an offshoot of Dakani, a seventeenth-century literary language of north India that is also an ancestor of Hindi. The emergence of Urdu as a separate language was due to a number of factors, including increasing Persianization in the early eighteenth century, the writing of Dakani literature in the Perso-Arabic script, and attempts to remove indigenous Hindi elements from the language. Urdu replaced Persian as an official language of British India during the nineteenth century and remains the predominant language of Islamic India today.

Bokhari, Soahail. 1985. Phonology of Urdu Language. Sadar, Karachi: Royal Book Company.

Bright, William, and Saeed A. Khan. 1976. The Urdu Writing System. Ithaca: Spoken Language Series.

Coulmas, Florian. 1996. Writing Systems. Oxford: Blackwell Publishers.

Gambhir, Surendra K. 1983. Spoken Hindi-Urdu: With Emphasis on Intonation in Natural Conversation. Wisconsin: University of Wisconsin Press.

Gordon, Raymond G., Jr. (Editor). 2005. Ethnologue: Languages of the World, Fifteenth Edition. Dallas: SIL International.

Masica, Colin P. 1991. The Indo-Aryan Languages. Cambridge, UK: Cambridge University Press.

Platts, John T. 1967. A Grammar of the Hindustani or Urdu Language. (Reprint) New Delhi: Munshiram Manoharlal Publishers.

Schomer, Karine. 1983. Basic Vocabulary for Hindi and Urdu. Lanham, MD: University Press of America.


 This work is licensed under a Creative Commons License.

  • You may use and modify the material for any non-commercial purpose.
  • You must credit the UCLA Language Materials Project as the source.
  • If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.