Fractionalization Data

Research project

The Fractionalization dataset was compiled by Alberto Alesina and associates, and measures the degree of ethnic, linguistic and religious heterogeneity in various countries. The dataset was used in Alesina et al. (2003) to test the effects of fractionalisation on the quality of institutions and economic growth.






See below


215 countries



Data types and sources

The indices are based on population data collected from Encyclopaedia Britannica (2001), the CIA’s World Factbook (2000), Levinson’s Ethnic Groups Worldwide (1998), and Minority Rights Group International’s World Directory of Minorities (1997), in addition to Mozaffar & Scarritt (1999) for selected African countries. In most cases the primary source is a national census.

Data download

Fractionalization Data


The project provides measures of ethnic, linguistic, and religious fractionalisation that are intended to be more comprehensive than the fractionalisation measures previously used in the economics literature, and it compares the new measures with the earlier ones. The goal is a broader classification of groups that takes into account not only language but also racial characteristics (ethnicity) and religion. On this basis, Alesina et al. (2003) examine the effects of ethnic fragmentation on two general areas: economic growth and the quality of institutions and policy. The indices are computed as one minus the Herfindahl index of group shares. The dataset also contains the underlying data used to construct the indices.
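As an illustration of the index definition above, the following is a minimal sketch in Python of "one minus the Herfindahl index of group shares", that is, one minus the sum of squared group shares. The function name `fractionalization` and the example figures are hypothetical and are not taken from the dataset. The sketch also assumes that the listed groups cover the whole population, so that raw counts or percentages can be rescaled to shares summing to one; the dataset's own handling of residual groups may differ.

```python
from typing import Iterable


def fractionalization(group_sizes: Iterable[float]) -> float:
    """Return one minus the Herfindahl index of group shares.

    `group_sizes` may be population counts, percentages, or shares.
    Assumption (not from the dataset documentation): the groups listed
    cover the whole population, so sizes are rescaled to sum to one.
    """
    sizes = list(group_sizes)
    if not sizes or any(size < 0 for size in sizes):
        raise ValueError("group sizes must be a non-empty list of non-negative numbers")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("group sizes must sum to a positive number")

    # Herfindahl index: the sum of squared group shares.
    herfindahl = sum((size / total) ** 2 for size in sizes)
    return 1.0 - herfindahl


# Hypothetical examples, not actual country data.
single_group = fractionalization([100])             # one group only
two_equal_groups = fractionalization([50, 50])      # two groups of equal size
four_equal_groups = fractionalization([25, 25, 25, 25])
```

A country with a single group scores 0, and the index approaches 1 as the population is split into more and smaller groups. The value can be read as the probability that two randomly drawn individuals belong to different groups.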

Geographical coverage

The dataset covers 215 countries and territories.

Time coverage and updates

The dataset contains data for only one year for each country. The language and religion indices are based on data from 2001. Most of the data used to compute the ethnic fractionalisation index are from the 1990s, but for some countries older data are used (as far back as 1979).

Another freely available dataset containing data on ethnic, religious and linguistic groups is the Ethnic Composition Data, compiled by Tanja Ellingsen. The dataset, used in Ellingsen (2000), relies on similar sources but covers a longer time period. See sources section for link to website.


The dataset is described in Alesina et al. (2003).

Access conditions and cost

Available free of charge.

Access procedures

Predefined table.

Data formats


Comparability and data quality

Defining ethnic, linguistic and religious groups is difficult and often relies on subjective judgement. In many cases it may also be difficult to find reliable data on how many people belong to the various cultural groups. The underlying data used to construct the fractionalisation indices are therefore likely to be subject to problems of comparability and measurement error. See Alesina et al. (2003), Fearon (2003) and Posner (2004) for discussions of problems associated with various measures of cultural heterogeneity.

Electronic resource

Ethnic Composition Data


Alesina, Alberto, Arnaud Devleeschauwer, William Easterly, Sergio Kurlat, and Romain Wacziarg. 2003. “Fractionalization”. Journal of Economic Growth 8 (June): 155-194.

Ellingsen, Tanja. 2000. “Colorful community or ethnic witches’ brew? Multiethnicity and domestic conflict during and after the cold war”. Journal of Conflict Resolution 44 (April): 228-249.

Fearon, James D. 2003. “Ethnic structure and cultural diversity by country”. Journal of Economic Growth 8 (June): 195-222.

Mozaffar, S., and J. Scarritt. 1999. “The specification of ethnic cleavages and ethnopolitical groups for the analysis of democratic competition in contemporary Africa”. Nationalism and Ethnic Politics 5 (1): 82-117.

Posner, Daniel N. 2004. “Measuring ethnic fractionalization in Africa”. American Journal of Political Science 48 (October): 849-863.