The profiles for Arabic are treated differently from those of most other languages on this web site because of the intertwined relationship among the different varieties of Arabic, including Modern Standard Arabic, the various regional and local dialects, and Classical Arabic, all of which play a role in countries where Arabic is spoken. This overview will discuss traits shared by all these varieties of Arabic, such as their history and writing system, as well as questions which concern more than one variety, such as how Modern Standard Arabic and a spoken dialect are used in different situations in the same country, and the degree to which the different dialects are mutually intelligible.

In addition to this overview, individual profiles are available for several varieties of Arabic, including Modern Standard Arabic, Egyptian Arabic, Levantine Arabic, and Moroccan Arabic.

Rather than repeating and modifying the information given in this overview, these profiles give information specific to the particular variety of Arabic addressed. The Modern Standard Arabic profile describes the variety of Arabic used in most written media and in very formal speaking contexts such as news broadcasts, sermons, and lectures. As for the local varieties of Arabic, it is impractical to provide a separate profile for each of the many countries where Arabic is spoken. Instead, representative and aggregate profiles are provided which cover the scope of the Arabic-speaking region. The Egyptian Arabic and Moroccan Arabic profiles are thus country-specific profiles intended to be representative of the dialects of Arabic spoken on the African continent. In contrast, the Levantine Arabic profile is aggregate, describing a group of closely related dialects spoken in several countries.

Arabic is the official language of some fifteen countries, and over 200,000,000 people are estimated to speak some dialect of Arabic. The geographical center of the language can be said to encompass the northernmost part of Africa from Mauritania to Egypt, the Levant, the Arabian Peninsula, and Iraq. In addition to the Arab countries, in which Arabic speakers are concentrated, large numbers of Arabic speakers live in Iran and France (600,000 speakers each), while a substantial number of speakers live in the Comoros, Tanzania, and other parts of Africa.


Arabic is a Semitic language of the Arabo-Canaanite subgroup (Ruhlen 1987). Arabic and Canaanite (which includes Hebrew, Phoenician, and several extinct languages) are distantly related to Aramaic. Other even more distant relatives are the Semitic languages of Ethiopia and Eritrea (such as Amharic and Tigrinya) and Akkadian, an extinct language once spoken in Mesopotamia. Semitic is a branch of the Afro-Asiatic family of languages, the bulk of which are spoken in Africa. Afro-Asiatic has several major branches: Semitic, Berber, Chadic (including languages such as Hausa), Cushitic (including languages such as Somali), and Ancient Egyptian, whose modern descendant, Coptic, is preserved as a liturgical language. It should be noted that the minority languages collectively known as South Arabian, spoken by about 50,000 people altogether in Oman and Yemen, are more closely related to the Semitic languages of Ethiopia and are not dialects of Arabic.

Arabic itself is commonly subclassified as Classical Arabic, Eastern Arabic, Western Arabic, and Maltese. (Maltese is treated in a separate profile on this site.) Modern Standard Arabic (MSA) is a modernized form of Classical Arabic.

Western Arabic encompasses the Arabic spoken colloquially in the region of northern Africa—often referred to as the Maghrib—from Morocco to western Libya and in adjacent African countries to the immediate south. Our profile on Moroccan Arabic is representative of these dialects.

Eastern Arabic, sometimes referred to as Mesopotamian Arabic, includes the Arabic dialects spoken in a large region encompassing North Africa (Egypt and Sudan), the Middle East (Syria, Iraq, and the Arabian Peninsula), and Arabic-speaking communities in Asia (Bateson 1967). The variety of Eastern Arabic profiled on our site is Egyptian.


The system used to write Arabic is called Arabic script. (The term "Naskhi script" is also sometimes used.) It is a cursive, consonantal script, written horizontally from right to left, with 28 symbols for consonants. A letter may have up to four different forms: independent (non-connecting), connecting only to the left (initial), connecting only to the right (final), and connecting to both sides (medial or internal). This is illustrated here for the letters kaaf, baa', and 'ayn:

[Image: examples of positional forms of the Arabic letters]
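For readers working with digital text, the four positional forms can also be inspected in Unicode. The sketch below is only an illustration: it assumes Python's standard `unicodedata` module and relies on the fact that the Arabic Presentation Forms-B block tags each character with `<isolated>`, `<final>`, `<initial>`, or `<medial>` ("isolated" corresponds to the independent form described above). The function name `positional_forms` is hypothetical.

```python
import unicodedata

# Tags that Unicode uses in the compatibility decompositions of positional forms.
POSITIONS = ("isolated", "final", "initial", "medial")

def positional_forms(letter):
    """Map each position name to the presentation-form character of `letter`."""
    forms = {}
    # Arabic Presentation Forms-B occupies U+FE70..U+FEFF.
    for code in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(code)).split()
        # Keep only entries shaped like "<tag> XXXX" (a single base letter);
        # this skips unassigned code points and ligatures.
        if len(parts) != 2:
            continue
        tag = parts[0].strip("<>")
        base = chr(int(parts[1], 16))
        if tag in POSITIONS and base == letter:
            forms[tag] = chr(code)
    return forms

# kaaf, baa', and 'ayn, the three letters named in the text.
for name, letter in [("kaaf", "\u0643"), ("baa'", "\u0628"), ("'ayn", "\u0639")]:
    forms = positional_forms(letter)
    print(name, " ".join(f"{pos}={forms[pos]}" for pos in POSITIONS if pos in forms))
```

Letters that never connect to the left, such as 'alif (U+0627), come back with only the isolated and final entries.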

There are also several diacritics to indicate vowels, gemination (doubling of a consonant), and other features. A text written with all the appropriate diacritics is said to be "vocalized", and one without them is "unvocalized". (Sometimes the terms "vowelled/unvowelled" and "pointed/unpointed" are also used.) Here is an example of a sentence written in vocalized and unvocalized form:

[Image: example sentence written in vocalized and unvocalized form]

Except in the Quran and in children's books, Arabic is normally written in unvocalized form, with diacritics written only in the few cases where a serious ambiguity arises. Long vowels are normally indicated by the presence of a letter, while short vowels are represented solely by diacritics. Therefore, in unvocalized form, readers must work out which short vowels occur in a word from their own knowledge of the language. The same is true for certain other features, such as gemination and the indefinite suffix -n.
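The relation between vocalized and unvocalized text can be sketched in a few lines of Python. This is a simplified illustration, not a full normalizer: it assumes that the relevant diacritics are the eight marks at U+064B to U+0652 (the three indefinite -n marks, the three short vowels, the gemination mark, and the no-vowel mark), and the function name `unvocalize` is hypothetical.

```python
# Marks removed by this sketch (U+064B..U+0652): the three tanwin marks for
# the indefinite suffix -n, fatha (a), damma (u), kasra (i), shadda
# (gemination), and sukun (no vowel).
DIACRITICS = {chr(code) for code in range(0x064B, 0x0653)}

def unvocalize(text):
    """Return `text` with the short-vowel and related diacritics removed."""
    return "".join(ch for ch in text if ch not in DIACRITICS)

kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"   # kataba, fully vocalized
kitaab = "\u0643\u0650\u062A\u064E\u0627\u0628"   # kitaab, fully vocalized

print(unvocalize(kataba))  # only the three consonant letters remain
print(unvocalize(kitaab))  # the letter marking the long vowel aa is kept
```

The second call shows the asymmetry described above: the long vowel survives because it is written with a letter, while the short vowels disappear.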

Arabic script (Naskhī) derives from Nabatean script, which in turn derives from Aramaic. Aramaic script goes back to Phoenician, the script from which the Greek script also developed. The earliest Nabatean inscriptions date from between the 2nd century BC and the 2nd century AD. Naskhī first appeared in the 11th century AD, and has been used ever since. The earliest texts of the Quran were written in Kufic script, which also belongs to the Nabatean group of scripts (for more information see Bakalla 1994, Belova 1998b, Campbell 2000, Fischer 2002, Holes 1994, Suleiman 1994).

Several other, unrelated languages use Arabic script, including Persian, Pushto, and Urdu. Additionally, several Turkic and African languages used Arabic script in the past, before the adoption of Latin or Cyrillic scripts. These include Turkish, Swahili, and Hausa.

Modern Standard Arabic (like Classical Arabic) has 27 simple consonants, the 3 short vowels /a,i,u/, and the 3 long vowels /aa,ii,uu/. It also has a number of velar and post-velar consonants, including two pharyngeal fricatives (one voiced and one voiceless) and a voiceless uvular stop. The consonants /t,d,s,dh/ have two variants, one normal and one "emphatic" (glottalized or pharyngealized). Emphatic consonants are usually transliterated with a dot underneath, but in these pages they will be written with an upper case letter. No more than two consonants can ever be adjacent in Standard Arabic, and no syllable can start with two consonants. The modern dialects can differ from the sound pattern of Standard Arabic in various ways, such as a larger inventory of vowel sounds, a loss of length contrast in vowels, a different number of consonants with emphatic variants, and a relaxation of Standard Arabic's restrictions on consonant clusters. Stress in Arabic is predictable, but the precise rules for determining stress vary according to variety.
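The two restrictions on consonant clusters can be expressed as a small check on transliterated words. The sketch below rests on several simplifying assumptions that are not part of the description above: every consonant is written with a single character, the only vowel letters are a, i, and u (long vowels being doubled), and the ban on syllable-initial clusters is tested only at the start of the word. The function name and the two rejected strings are made up for illustration.

```python
import re

VOWELS = "aiu"
CONSONANT = f"[^{VOWELS}]"  # assumption: any other character is one consonant

def violates_cluster_rules(word):
    """Return True if `word` breaks either Standard Arabic cluster restriction."""
    # A word (and hence its first syllable) may not begin with two consonants.
    starts_with_cluster = re.match(f"{CONSONANT}{{2}}", word) is not None
    # No more than two consonants may be adjacent anywhere in the word.
    has_triple = re.search(f"{CONSONANT}{{3}}", word) is not None
    return starts_with_cluster or has_triple

for word in ["kitaab", "mudarrisuuna", "ktaab", "katbtu"]:
    status = "violates" if violates_cluster_rules(word) else "satisfies"
    print(word, status, "the cluster restrictions")
```

A form such as the invented "ktaab" is exactly the kind of initial cluster that a dialect with relaxed restrictions could allow but Standard Arabic cannot.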

As in other Semitic languages, the bulk of the vocabulary consists of words formed by the application of templates (vowel patterns and affixes) to triliteral (3-consonant) roots. For example, from the triliteral root k-t-b are formed a variety of Standard Arabic words related to the concept of writing: kitaab "book", maktaba(t) "library", maktab "desk, office", kaatib "writer", kataba "he wrote". Quadriliteral (4-consonant) roots also exist, but are less numerous.
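The root-and-template mechanism lends itself to a short sketch. The templates below are simply read off the k-t-b examples in the preceding paragraph, with "C" standing for a root consonant; the notation for the root ("k-t-b") and the function name `apply_template` are assumptions made for illustration, and the optional final (t) of maktaba(t) is left out.

```python
def apply_template(root, template):
    """Fill the C slots of `template`, left to right, with the root consonants."""
    consonants = iter(root.split("-"))
    return "".join(next(consonants) if ch == "C" else ch for ch in template)

# Templates inferred from the example words, with the glosses given in the text.
TEMPLATES = {
    "CiCaaC": "book",
    "maCCaCa": "library",
    "maCCaC": "desk, office",
    "CaaCiC": "writer",
    "CaCaCa": "he wrote",
}

for template, gloss in TEMPLATES.items():
    print(apply_template("k-t-b", template), "-", gloss)
```

Because the root and the template are kept separate, the same function would serve for a quadriliteral root, provided the template has four C slots.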

Nouns are either masculine or feminine, and either singular, dual, or plural. In Standard Arabic, these gender and number systems combine in the agreement system, giving distinct agreeing verb and adjective forms for, say, a feminine dual noun and a feminine plural noun. The modern dialects vary in the degree to which this system is simplified. For example, in Egyptian Arabic, there are distinct dual forms for nouns, but there are no corresponding dual forms of verbs or adjectives, and plural forms are used instead. Standard Arabic has a system of three grammatical cases: nominative, accusative, and genitive. In contrast, none of the modern dialects has a system of morphological case, and typically only the most competent speakers master the rules governing the use of case in Standard Arabic.

Arabic plurals are divided into "sound plurals" (regular plurals) and "broken plurals" (irregular plurals). Sound plurals use a special suffix, whereas the broken plurals are formed according to several different patterns or templates, such as kalb "dog" > kilaab "dogs", kitaab "book" > kutub "books", baab "door" > 'abwaab "doors".

All varieties of Arabic have a definite article, which is 'al- in Standard Arabic. In a definite noun phrase including an adjective, the article appears on both the noun and the adjective. Using the Standard Arabic noun mudarrisuuna "teachers (masc. pl.)" and the adjective lubnaaniyyuuna "Lebanese (masc. pl.)", compare the indefinite noun phrase mudarrisuuna lubnaaniyyuuna "Lebanese teachers" with its definite equivalent 'al-mudarrisuuna l-lubnaaniyyuuna "the Lebanese teachers". This phenomenon is often termed "definiteness spread".
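Definiteness spread can be pictured as a rule that copies the article onto every word of the phrase. The following sketch only reproduces the transliterated example above; it assumes that the article is spelled 'al- on the first word and l- on the words that follow, and it deliberately ignores any change in the article's consonant before particular sounds. The function names are hypothetical.

```python
def indefinite_phrase(noun, *adjectives):
    """Noun followed by its adjectives, with no article anywhere."""
    return " ".join([noun, *adjectives])

def definite_phrase(noun, *adjectives):
    """Noun followed by its adjectives, with the article on every word."""
    words = ["'al-" + noun] + ["l-" + adjective for adjective in adjectives]
    return " ".join(words)

print(indefinite_phrase("mudarrisuuna", "lubnaaniyyuuna"))
print(definite_phrase("mudarrisuuna", "lubnaaniyyuuna"))
```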

Possession and similar concepts are often expressed by a construct state, which is a sequence of unmodified nouns, such as kitaabu s-sulTaani "the sultan's book". Only the last noun in a construct can have the definite article, and it is the last noun in the construct which determines whether the whole phrase is definite or not. Thus, the indefinite construct kitaabu binti sulTaanin "a sultan's daughter's book" is made definite by adding the definite article to the word meaning "sultan": kitaabu binti s-sulTaani "the sultan's daughter's book". While the construct is the primary means of expressing possession in Classical and Standard Arabic, in many of the modern dialects the construct has given way to prepositional phrases in many contexts.

A verb stem is derived from a consonantal root by using a verbal template known as a "form" or "measure". Each of these templates is associated with a range of meanings. In Western scholarship on Arabic, these templates are usually referred to by a Roman numeral. For example, the perfective of Form VI has the template taCaaCaC, and verbs of this form often have the meaning of a reciprocal action, such as takaatab- "to write to each other". Similarly, verbs of Form II have the perfective template CaCCaC and are often causative or intensive in meaning. Modern Standard Arabic has ten commonly used forms. The modern dialects generally preserve the bulk of these forms, but may lack some of them and also have forms which do not directly correspond to any of the MSA forms.
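Going in the opposite direction, a stem can be matched against the form templates to recover its root. The sketch below covers only the two templates quoted above, Form II (CaCCaC) and Form VI (taCaaCaC). As an assumption of the sketch, the templates are rewritten with numbered slots so that the doubled middle consonant of Form II can be required to be the same consonant twice; single-character consonants are also assumed, and "kattab" is used purely as an illustrative Form II stem of k-t-b.

```python
import re

# Numbered slots stand for the first, second, and third root consonant.
FORMS = {
    "II": "1a22a3",    # CaCCaC, with the middle consonant doubled
    "VI": "ta1aa2a3",  # taCaaCaC
}

def template_to_regex(template):
    """Compile a numbered template into a regular expression."""
    seen = set()
    parts = []
    for ch in template:
        if ch.isdigit():
            if ch in seen:
                # A repeated slot must match the same consonant again.
                parts.append(f"(?P=c{ch})")
            else:
                seen.add(ch)
                parts.append(f"(?P<c{ch}>[^aiu])")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def analyze(stem):
    """Return (form, root) pairs for every template that `stem` fits."""
    results = []
    for form, template in FORMS.items():
        match = template_to_regex(template).fullmatch(stem)
        if match:
            root = "-".join(match.group(f"c{slot}") for slot in "123")
            results.append((form, root))
    return results

print(analyze("takaatab"))
print(analyze("kattab"))
```

A stem that fits neither template, such as "katab", yields an empty list.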

Standard Arabic and the modern dialects use different strategies to form the passive of a verb. Standard Arabic verbs form their passives by changing the vowel pattern inside the verb stem, as in dafana "he buried" > dufina "he was buried". The modern dialects use a form which employs a passive prefix. For example, in Egyptian Arabic, dafan "he buried" > iddafan "he was buried".
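The Standard Arabic strategy, changing the vowels inside the stem, can be sketched for the one verb shape shown above. This is a deliberately narrow illustration: it assumes a transliterated verb of the shape CaCaCa with single-character consonants, it does not cover other verb shapes, and it does not model the prefixing strategy of the dialects. The function name is hypothetical.

```python
import re

def msa_passive(active):
    """Turn a CaCaCa active perfective into the CuCiCa passive."""
    match = re.fullmatch(r"([^aiu])a([^aiu])a([^aiu])a", active)
    if match is None:
        raise ValueError(f"not a CaCaCa verb: {active!r}")
    first, second, third = match.groups()
    # The consonants stay in place; only the vowel pattern changes.
    return f"{first}u{second}i{third}a"

print(msa_passive("dafana"))  # he buried  > he was buried
print(msa_passive("kataba"))  # he wrote   > it was written
```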

The verb has a perfective conjugation to denote completed events, and an imperfective conjugation to denote uncompleted actions. Particles can be added to these forms to create a wider range of tenses. For example, the Standard Arabic imperfective verb form yaktubu "he writes" can be preceded by the particle sawfa to express the future tense, as in sawfa yaktubu "he will write". The varieties of Arabic differ in their use of particles. The imperfective conjugation of Standard Arabic has a system of moods (indicative, subjunctive, and jussive) not found in the modern dialects. Verbs also have active and passive participles and an imperative form. However, there is no infinitive.

The varieties of Arabic differ in terms of borrowing. Being closely associated with Classical Arabic, Standard Arabic resists heavy borrowing from other languages, and novel words are often coined using Arabic roots. Borrowing in the modern dialects varies depending on historical and ongoing contact with other languages. The dialects of the Maghrib borrow heavily from French, while new borrowings in most other countries are usually from English. The borrowings can also reflect an earlier socio-linguistic context, such as the many borrowings in Egyptian Arabic from Coptic, a language which died out in the 17th century.

There are many words of Arabic origin in English, such as algebra (< Arabic al-jabr), alcohol (< al-kuHl), and coffee (< qahwa(t)). Due to the importance of Arabic in the Islamic world, many languages have borrowed much of their literary and specialized vocabulary from Arabic, in much the same way that English borrowed from French and Latin. For example, the Arabic word jumhuuriyya(t) has been borrowed as the word for "republic" into Swahili as jamhuri, into Urdu as jumhuuriah, into Turkish as cumhuriyet, and into Persian as jomhuri.

Arabic was originally the language of the nomadic tribes of the northern and central regions of the Arabian Peninsula. It was only during the Muslim conquest and expansion of the seventh and eighth centuries that Arabic spread into the areas where it is now spoken. In the process, it largely supplanted the indigenous languages of the conquered regions, including Aramaic in the Levant, Coptic in Egypt, Berber in North Africa, and Greek in the former Byzantine Empire.

In written form, some early inscriptions exist. Arabic of the pre-Classical period is found in inscriptions of central and northwestern Arabia, with Classical Arabic itself appearing in inscriptions dating from at least the fourth century. Pre-Islamic poetry, the Quran from the first half of the seventh century, and the language of contemporary Bedouin provided the basis for the codification of the language during the eighth and ninth centuries. Modern Standard Arabic, the official language of all Arab countries, is modeled on Classical Arabic, which exerts a continuing influence on the form and style of its modern variant.

The linguistic development of the vernacular forms of Arabic is controversial, but one widely supported theory argues that the colloquial dialects grew out of a widespread koine, or heavily dialectally mixed lingua franca, which was used during the Muslim conquest; subsequent regional differences are explained by specific geographical and regional indigenous influences and normal change over time (Bateson 1967:94-97).

