Sunday, March 18, 2018

Bilateral trade data for 1850-1900

Compiled by Pascali (2017) from primary sources. See its section II.C for detail and alternative datasets.

Tuesday, March 13, 2018

Monday, January 29, 2018

Cross-country "years of education" datasets

Penn World Tables 9.0

See this document, which reviews the academic debate on the quality of the Barro-Lee dataset.

Cohen and Soto (2007)
provide an alternative dataset to Barro and Lee's (see below).

Barro-Lee dataset

A well-known dataset on average years of schooling (i.e. the stock of human capital) by 5-year age group for 146 countries from 1950 to 2010. See Barro and Lee (2013) for details. The data are available for download; for data sources, see the Appendix Notes.

For details on the data construction, read Robert J. Barro and Jong-Wha Lee, "International Data on Educational Attainment: Updates and Implications" (CID Working Paper No. 42, April 2000). This 2000 paper is an updated version of Barro and Lee (1993). Both papers compare various measures of human capital.

Average years of schooling are available for six population groups: males over 25, females over 25, all over 25, males over 15, females over 15, and all over 15.

Population over the age of 15 "corresponds better to the labor force for many developing countries." (Barro and Lee 2000, p.2)

Percentages of the total/male/female population that attained or completed each level of schooling are also available. Note that variables LU, LP, LS, and LH sum to 100, and that Lx - LxC, where x is P, S, or H, is the percentage who dropped out before completing primary, secondary, or higher schooling, respectively. In other words, the percentage for "... school attained" includes the percentage for "... school complete".
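As a rough illustration of the arithmetic above, here is a minimal Python sketch. The variable names LU, LP, LS, LH and LPC, LSC, LHC follow the naming described in this post; the numbers in `example` are made up for illustration and are not taken from the dataset, and the helper functions are hypothetical names of my own.

```python
# Hypothetical record for one country-year; the values are invented
# for illustration only and do not come from the Barro-Lee data.
example = {
    "LU": 20.0,                 # no schooling
    "LP": 40.0, "LPC": 25.0,    # primary attained / completed
    "LS": 30.0, "LSC": 18.0,    # secondary attained / completed
    "LH": 10.0, "LHC": 6.0,     # higher attained / completed
}


def attainment_sums_to_100(row, tol=0.5):
    """Check that LU + LP + LS + LH is 100, up to a rounding tolerance."""
    total = row["LU"] + row["LP"] + row["LS"] + row["LH"]
    return abs(total - 100.0) <= tol


def dropout_shares(row):
    """Return Lx - LxC for x in P, S, H: attained but not completed."""
    return {x: row[f"L{x}"] - row[f"L{x}C"] for x in ("P", "S", "H")}


print(attainment_sums_to_100(example))
print(dropout_shares(example))
```

The tolerance `tol` is an assumption on my part, to allow for rounding in published figures.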

Downloadable at this page of the Center for International Development at Harvard University (CID).

The data file in the panel dataset format is best avoided because it excludes countries not in Penn World Table 5.0 (e.g. former socialist countries).

Note that variable SHCODE (numerical country code in Penn World Table 5.0) is different from the one in Penn World Table 5.6.

A very minor point, but the data entries for USSR/Russia in 1990 seem unreliable. Population seems to refer to USSR while educational attainment figures seem to refer to Russia.

Papers using this dataset include Acemoglu et al. (2005) and Glaeser et al. (2007).

For other datasets on average schooling years, see Kyriacou (1991), which is used by Benhabib and Spiegel (1994, JME), and Nehru et al. (1995), which is used by Pritchett (2000).

See Krueger and Lindahl (2001, JEL) for a critical review of data on average years of schooling.

Cross-country school enrollment ratio data

Lee and Lee (2016) compile historical school enrollment ratios by gender since 1820 for 111 countries.
  • Downloadable here
  • This is an updated version of the dataset compiled by Barro, Robert J. and Jong-Wha Lee (2015) Education Matters: Global Schooling Gains from the 19th to the 21st Century (Oxford University Press) 

Aaron Benavot and Phyllis Riddle (1988) compiled cross-country data on the primary school enrollment ratio in the late 19th century and the early 20th century.

Real GDP per capita

World Development Indicators (WDI) - in current/constant local currency units and in current/constant US dollars since 1960


Penn World Table (PWT) - in purchasing power parity since 1950

See here for my rough summary of data construction.

See Nuxoll (1994) for the validity of using economic growth rates from Penn World Table.

See also Feenstra et al. (2004)

For version 5.6, there is an augmented version constructed by Fearon and Laitin (2003). It is used by Miguel et al. (2004) and is therefore contained in their dataset.

Comparison of WDI vs PWT

Discussing PWT version 6, Johnson et al. (2013) argue that while PWT is well suited to cross-country comparisons, economic growth is better measured by WDI. See also Ciccone and Jarocinski (2010).

See Pinkovskiy and Sala-i-Martin's working paper "Newer Need Not Be Better: Evaluating the Penn World Tables and the World Development Indicators Using Nighttime Lights" for whether PWT versions 7 and 8 do any better.


Angus Maddison (2003) The World Economy: Historical Statistics (Paris: OECD)

Annual data entries, wherever possible, from 1820 until 2001.

Data for 1500, 1600, and 1700 are also available; they are used in Acemoglu, Johnson, and Robinson's (2005) "The Rise of Europe" paper.

Downloadable from the book's website (you need the username and password printed at the end of the Table of Contents in the book).

Used by Acemoglu and Johnson (2006) for their analysis of the effect of life expectancy on economic growth between 1940 and 1980.

Used also by Persson and Tabellini (2006).

For the latest data, see Bolt, Jutta, and Jan Luiten van Zanden, “The Maddison Project: Collaborative Research on Historical National Accounts,” Economic History Review, 67 (2014), 627–651.


Barro-Ursua Macroeconomic Data

An attempt to correct Maddison's data. Used in Barro and Ursua's "Rare Macroeconomic Disasters" and Barro's "Convergence and Modernization Revisited".

Downloadable from Robert Barro's website.

Historical global freight cost

Mohammed, S. I. S., and J. G. Williamson, “Freight Rates and Productivity Gains in British Tramp Shipping 1869–1950,” Explorations in Economic History, 41 (2004), 172–203.

Used by Henderson et al. (2018, Figure IV).

Historical cross-country literacy rates

Roser, Max, and Esteban Ortiz-Ospina, “Literacy,” Our World In Data, 2016.

Van Zanden et al. (2014), pp. 89-90, note that the UNESCO report on the Progress of Literacy in Various Countries Since 1900 provides 173 observations for 30 countries for the period before 1950.