Monday, December 17, 2018

Botswana 1946 census

Bechuanaland Population and Housing Census of 1946

Acemoglu and Robinson (2012), p. 412, cite it as the last census of Botswana asking questions about ethnicity. "In the Ngwato reserve, for example, only 20 percent of the population identified themselves as pure Ngwato; though there were other Tswana tribes present, there were also many non-Tswana groups whose first language was not Setswana."

Monday, November 12, 2018

Doing Business surveys

Annual cross-country data on regulations, collected by the World Bank since 2004. As Djankov (2016) explains, it originated in academic papers written by Andrei Shleifer and his coauthors.

The data is available for free at the World Bank's website.

Besley (2015) discusses pros and cons of this dataset, including his own finding that the correlation between the Doing Business data and firm survey data is not always as expected (Table 2).

Monday, November 5, 2018

Real GDP per capita

World Development Indicators (WDI) - in current/constant local currency units and in current/constant US dollars since 1960


Penn World Table (PWT) - in purchasing power parity since 1950

See here for my rough summary of data construction.

See Nuxoll (1994) for the validity of using economic growth rates from Penn World Table.

See also Feenstra et al. (2004)

For version 5.6, there is an augmented version constructed by Fearon and Laitin (2003). It is used by Miguel et al. (2004) and is therefore contained in their dataset.

Comparison of WDI vs PWT

Discussing PWT version 6, Johnson et al. (2013) argue that while PWT is good at cross-country comparison, economic growth is better measured by WDI. See also Ciccone and Jarocinski (2010).

See Pinkovskiy and Sala-i-Martin's working paper "Newer Need Not Be Better: Evaluating the Penn World Tables and the World Development Indicators Using Nighttime Lights" for whether PWT versions 7 and 8 do any better.
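
Since the comparison above is about growth rates, here is a minimal sketch of how annual growth can be computed from a real GDP per capita series from either source. The function name and the figures are made up for illustration; they are not taken from WDI or PWT, and the log-difference formula is just one common convention.

```python
import math

def annual_log_growth(series):
    """Return year-on-year log growth rates for a list of (year, value) pairs.

    Assumes the list is sorted by year and that all values are positive.
    """
    rates = []
    for (y0, v0), (y1, v1) in zip(series, series[1:]):
        # Divide by the gap in years so that a missing year does not inflate the rate.
        rates.append((y1, (math.log(v1) - math.log(v0)) / (y1 - y0)))
    return rates

# Hypothetical real GDP per capita figures (not actual WDI or PWT values).
gdp_pc = [(2000, 1000.0), (2001, 1050.0), (2002, 1102.5)]
growth = annual_log_growth(gdp_pc)
for year, g in growth:
    print(year, round(g, 4))
```

Running the same function on the WDI and PWT series for one country is a quick way to see how far the two sources disagree on growth.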


Angus Maddison (2003) The World Economy: Historical Statistics (Paris: OECD)

Annual data entries, wherever possible, from 1820 until 2001.

Data for 1500, 1600, and 1700 is also available, and is used in Acemoglu, Johnson, and Robinson's (2005) "The Rise of Europe" paper.

Downloadable from the book's website (you need the username and password printed at the end of the Table of Contents in the book).

Used by Acemoglu and Johnson (2006) for their analysis of the effect of life expectancy on economic growth between 1940 and 1980.

Used also by Persson and Tabellini (2006).

For the latest updated data, see Maddison Project Database (Bolt, Jutta, and Jan Luiten van Zanden, “The Maddison Project: Collaborative Research on Historical National Accounts,” Economic History Review, 67 (2014), 627–651.)


Barro-Ursua Macroeconomic Data

An attempt to correct Maddison's data. Used by Barro and Ursua "Rare Macroeconomic Disasters" and Barro "Convergence and Modernization Revisited".

Downloadable from Robert Barro's website.

Jones-Klenow well-being measure across countries

Constructed by Jones and Klenow (2016). Quote from their abstract:
"We propose a summary statistic for the economic well-being of people in a country. Our measure incorporates consumption, leisure, mortality, and inequality, first for a narrow set of countries using detailed micro data, and then more broadly using multi-country datasets."
Data can be downloaded from the AER website.

Sunday, November 4, 2018

Global Preference Survey

"an experimentally validated survey data set of time preference, risk preference, positive and negative reciprocity, altruism, and trust from 80,000 people in 76 countries" (Falk et al. (2018), abstract)

Introduced by Falk et al. (2018).

Friday, June 1, 2018

Regional ethnic diversity

Alesina and Zhuravskaya (2011) construct ethnic diversity measures at the sub-national region level for 97 countries. The data is available here (click "Download Data Set").

Gershman and Rivera (2018) construct alternative ethnic diversity measures at the sub-national region level for 36 African countries. The data is available as part of the replication files (see Appendix H on the journal webpage).
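
To give a sense of what a region-level diversity measure looks like, here is a minimal sketch of the textbook fractionalization index (one minus the sum of squared group shares). This is an assumption on my part for illustration: the measures in the two papers above may be defined differently, and the region names and group counts are made up.

```python
def fractionalization(counts):
    """One minus the Herfindahl index of group population shares."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("group counts must sum to a positive number")
    shares = [c / total for c in counts]
    return 1.0 - sum(s * s for s in shares)

# Hypothetical regions with made-up population counts by ethnic group.
regions = {
    "region_a": [900, 50, 50],    # one dominant group
    "region_b": [400, 300, 300],  # more evenly split
}
diversity = {name: fractionalization(c) for name, c in regions.items()}
for name, value in diversity.items():
    print(name, round(value, 3))
```

The index can be read as the probability that two randomly drawn residents of a region belong to different groups.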

Thursday, May 24, 2018


See also the Terrain Ruggedness Index.

The best elevation data as of 2016 seems to be WorldDEM, although I haven't seen any application in economics research. It is also not free of charge. Below is a list of other elevation datasets (available free of charge) that have been used by economists in the past.


Developed by the U.S. Geological Survey's Center for Earth Resources Observation and Science (EROS) in 1996, GTOPO30 provides elevations on a 30 arc-second (roughly 1km) grid. See the USGS/EROS website for details.
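
As a rough check on the "roughly 1km" figure, the sketch below converts an angle in arc-seconds into an east-west ground distance. It assumes a spherical Earth with a mean radius of 6371 km, which is an approximation, and the function name is my own.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)

def arcsec_to_km(arcsec, latitude_deg=0.0):
    """Approximate east-west ground length of an angle given in arc-seconds."""
    angle_rad = math.radians(arcsec / 3600.0)
    # East-west distances shrink with the cosine of latitude.
    return EARTH_RADIUS_KM * angle_rad * math.cos(math.radians(latitude_deg))

cell_equator_km = arcsec_to_km(30)    # width of a 30 arc-second cell at the equator
cell_60n_km = arcsec_to_km(30, 60.0)  # width of the same cell at 60 degrees north
print(round(cell_equator_km, 3), round(cell_60n_km, 3))
```

The cell width is a little under 1 km at the equator and about half that at 60 degrees latitude, so cells on this grid do not cover equal areas.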

GTOPO30 was used by Deininger and Minten (2002), Nunn and Puga (2007) to measure the degree of ruggedness of the earth surface of each country, and Duflo and Pande (2007) to calculate river gradient in India.
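
For readers unfamiliar with ruggedness measures, here is a minimal sketch of one common definition of the Terrain Ruggedness Index: the square root of the summed squared elevation differences between a cell and its eight neighbours. I am assuming this definition for illustration; the papers above may compute ruggedness differently, and the elevations below are made up.

```python
import math

def terrain_ruggedness(grid, row, col):
    """Ruggedness at an interior cell of a 2-D elevation grid.

    Square root of the sum of squared elevation differences between
    the cell and its eight neighbours.
    """
    center = grid[row][col]
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            diff = grid[row + dr][col + dc] - center
            total += diff * diff
    return math.sqrt(total)

# Made-up elevations in metres on two 3x3 patches.
flat = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
hilly = [[100, 120, 100], [90, 110, 130], [100, 100, 100]]
tri_flat = terrain_ruggedness(flat, 1, 1)
tri_hilly = terrain_ruggedness(hilly, 1, 1)
print(tri_flat, round(tri_hilly, 2))
```

A country-level figure would then average such cell-level values over all cells in the country, possibly weighting by cell area.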

GTOPO30 is now superseded by Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010).


SRTM3 is an updated version of GTOPO30 (I suppose) at a higher spatial resolution of 3 arc-seconds (roughly 100m). SRTM30 is a version that aggregates SRTM3 to the 30 arc-second resolution. SRTM30 is supposed to be better than GTOPO30. See Farr et al. (2007) for details.
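
To illustrate what aggregating from 3 to 30 arc-seconds involves, here is a minimal sketch that averages non-overlapping 10 x 10 blocks of a fine grid. Plain averaging is my assumption for illustration; check the SRTM30 documentation for the method actually used. The grid values are made up.

```python
def block_average(grid, factor):
    """Average non-overlapping factor x factor blocks of a 2-D list."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        coarse.append(out_row)
    return coarse

# A made-up 20 x 20 fine grid whose elevation rises by 1 metre per column.
fine = [[float(c) for c in range(20)] for _ in range(20)]
coarse = block_average(fine, 10)  # 10 cells of 3 arc-seconds = 30 arc-seconds
print(coarse)
```

Each coarse cell summarizes 100 fine cells, so local peaks and valleys are smoothed out at the 30 arc-second resolution.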

For SRTM3 (version 2.1), the data is available here and the documentation is available here. For SRTM30 (version 2.1), both the data and the documentation are available here. For a graphical interface to download the data, visit here.

SRTM30 has been widely used by economists: Taryn Dinkelman's working paper (now published in American Economic Review) "The Effects of Rural Electrification on Employment: New Evidence from South Africa" (to create an instrument for electricity grid placements); Melissa Dell's working paper (now published in Econometrica) "The Persistent Effects of Peru's Mining Mita" (to create control variables); Acemoglu and Dell's paper forthcoming in AEJ Macro "Productivity Differences Within and Between Countries" (to calculate the distance to paved roads that takes into account elevation); Olken (2009) (to obtain the strength of TV signals in each sub-district of Indonesia); and Yanagizawa (2009) "Propaganda and Conflict: Theory and Evidence from the Rwandan Genocide".

How to use GTOPO30 / SRTM30 in ArcGIS

Here is a tip on "How to import GTOPO30 or SRTM30 data into ArcMap (for ArcGIS 9.x)". For SRTM30, Step 7 should be skipped because SRTM30 version 2 uses the value 0 (instead of -9999) for the ocean (see section 1.2 of the documentation).
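
Outside ArcGIS, the same issue shows up as a choice of which value (if any) to treat as missing. The sketch below assumes that the skipped step is a conversion of -9999 to NoData, which is my reading rather than something stated in the tip. The function name and the tiny grids are hypothetical.

```python
def mask_ocean(grid, ocean_value=None):
    """Return a copy of grid with cells equal to ocean_value set to None.

    If ocean_value is None, the grid is copied unchanged.
    """
    if ocean_value is None:
        return [row[:] for row in grid]
    return [[None if v == ocean_value else v for v in row] for row in grid]

# Made-up tiles: ocean coded as -9999 in one and as 0 in the other.
gtopo_like = [[-9999, 12], [35, -9999]]
srtm_like = [[0, 12], [35, 0]]

gtopo_masked = mask_ocean(gtopo_like, ocean_value=-9999)
# No masking here: treating 0 as missing would also drop any land cell
# whose elevation happens to be exactly 0.
srtm_unmasked = mask_ocean(srtm_like)
print(gtopo_masked, srtm_unmasked)
```

With ocean coded as 0, ocean cells cannot be told apart from land at sea level using the elevation values alone, so a separate land mask is needed if the distinction matters.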

ASTER Global Digital Elevation Model Version 2

An alternative elevation dataset to SRTM. Rexer and Hirt (2014) validate SRTM and ASTER against elevation data in Australia, concluding that SRTM is superior in general, with ASTER better for mountainous areas.

Downloadable here.

Used in Mariaflavia Harari's working paper entitled "Cities in Bad Shape: Urban Geometry in India".

Elevation data is also available from TerrainBase, constructed by the National Oceanic and Atmospheric Administration (NOAA) and the U.S. National Geophysical Data Center (downloadable at the Atlas of the Biosphere). This one is used by Michalopoulos (2008). It is not clear whether this is the same as, better than, or worse than GTOPO30 and SRTM30. However, if the study area is the whole globe, this data is easier to use because it comes in one file. (GTOPO30 and SRTM30 are provided in several files, each of which covers a part of the globe.)