Friday, February 5, 2016

Gallup World Poll

"The Gallup World Poll (GWP) is a large cross-country household survey, interviewing more than 100,000 households in over 150 countries, annually or biennially in most countries since 2006." (Clausen et al. 2011, p. 213)


Survey Methodology:

Data access appears to be provided through Gallup Analytics, although it is not clear whether this includes the respondent-level data.

Used by:
  • Deaton (2008), who investigates the relationship between per capita income and subjective well-being across the world, by using the 2006 poll.
  • Clausen et al. (2011), who find a negative correlation between experiencing corruption and having confidence in public institutions.
  • Steptoe, Deaton, and Stone (2015), who investigate the relationship between age and subjective well-being across the world.

Tuesday, January 26, 2016

Datasets on coup d'etat

There are three major coup datasets. I haven't yet investigated which one is better in which respects.

CSP/INSCR Coup Data (Scroll down to find the link to the codebook and data)

It codes not only successful coups but also failed coups and alleged coup plots, in countries with a population of more than 50,000, for 1946-2014.

Powell, Jonathan & Clayton Thyne. 2011. "Global Instances of Coups from 1950-Present." Journal of Peace Research 48(2):249-259

The data is updated in "near-real time". Used in Erik Meyersson's working paper (as of 2015) entitled "Political Man on Horseback: Military Coups and Development".

Coup D'état Project (CDP) by The Cline Center for Democracy

Compiled by the "Big Data" method (see this page for detail).


McGowan (2003) "African military coups d’etat, 1956-2001: frequency, trends and distribution," Journal of Modern African Studies, 41:3, pp. 339–370

For African countries, Patrick J. McGowan has compiled detailed information on every successful and unsuccessful coup and every coup plot from 1955 through 2006 (with sources of information explicitly mentioned).

Sunday, January 24, 2016

DMSP-OLS Nighttime Lights Time Series Version 4

Downloadable here.

The spatial resolution is 30 arc-seconds (about 1 km at the equator).
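As a rough check on the "about 1 km" figure, the conversion from angular to ground distance can be sketched as below. The function name and the spherical-Earth radius of 6371 km are assumptions made for this illustration, not part of the DMSP-OLS documentation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def cell_size_km(arc_seconds, latitude_deg=0.0):
    """Return the (north-south, east-west) size in km of a cell of the given angular width.

    Uses a spherical Earth, so the result is only approximate.
    """
    angle_rad = math.radians(arc_seconds / 3600.0)
    north_south = EARTH_RADIUS_KM * angle_rad
    # East-west extent shrinks with the cosine of latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = cell_size_km(30)            # at the equator
ns60, ew60 = cell_size_km(30, 60.0)  # at 60 degrees latitude
print(round(ns, 3), round(ew, 3), round(ew60, 3))
```

Under these assumptions a 30 arc-second pixel is a little under 1 km on each side at the equator, and about half as wide in the east-west direction at 60 degrees latitude.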

Data construction

To understand how this dataset is constructed from the original satellite images and the potential data issues, see Elvidge et al. (2001) and Elvidge et al. (2010). Noor et al. (2008) is also useful to understand this data.


Min et al. (2013) validate this measure against a survey-based measure of electricity access in rural Senegal and Mali in 2011. Their conclusions (quoted from Min and Gaba 2014, p. 9512) are:

  • Electrified villages are consistently brighter than unelectrified villages across a variety of nighttime satellite images
  • Electrified villages appear brighter in satellite imagery because of the presence of streetlights, and brightness increases with the number of streetlights.
  • The correlation between light output recorded by the satellite with household electricity use and access is low.

Min and Gaba (2014) conduct the same validation exercise for villages in Vietnam in 2013. They reach the same conclusions except for the last point: in Vietnam, household-level access to electricity is also correlated with nighttime light satellite images.

Use in economics research

The data is becoming popular among economists. Recent examples include Henderson et al. (2012), Michalopoulos and Papaioannou (2013, 2014), and Alesina et al. (2012). Hodler and Raschky (2014) exploit the annual panel nature of the data to find that the birthplace of a new national leader becomes brighter after the leader assumes power. Baskaran et al. (2015) relate nighttime light to electoral cycles in India.

The raw data ranges from 0 to 63. To be used in regression analysis, there are several ways to aggregate the raw data.
  • Henderson et al. (2012), Michalopoulos and Papaioannou (2013, 2014) and Hodler and Raschky (2014) use the nighttime light data as a measure of living standards. They use the logarithm of the average within each spatial unit of analysis.
    • Logarithmic transformation is used because the distribution of nighttime light intensity is right-skewed with around 10% of observations being zero.
    • Michalopoulos and Papaioannou (2013, 2014) and Hodler and Raschky (2014) add 0.01 to the average before taking the log, so that the roughly 10% of observations without light are not dropped.
  • Alesina et al. (2012) and Baskaran et al. (2015) use the average or sum of light values from all pixels within each spatial unit of analysis, divided by population.
  • Baskaran et al. (2015) also measure the proportion of villages with a positive value of nighttime light at the village centroid.
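The three aggregation approaches listed above can be sketched in a few lines. This is only an illustration: the function names, the toy pixel values, and the population figure are made up, and the cited papers' actual code may differ (for example in how pixels are assigned to spatial units).

```python
import math


def log_mean_light(pixels, offset=0.01):
    """Log of the average pixel value in a spatial unit, plus a small offset.

    The offset (0.01 here, as described above) keeps units with zero light
    in the sample, since log(0) is undefined.
    """
    mean = sum(pixels) / len(pixels)
    return math.log(mean + offset)


def light_per_capita(pixels, population):
    """Sum of pixel values in a spatial unit divided by its population."""
    return sum(pixels) / population


def share_lit(centroid_values):
    """Proportion of villages whose centroid pixel has a positive light value."""
    lit = sum(1 for value in centroid_values if value > 0)
    return lit / len(centroid_values)


# Hypothetical pixel values (raw data range from 0 to 63).
dark_unit = [0, 0, 0]
mixed_unit = [0, 10, 63]

print(log_mean_light(dark_unit))           # equals log(0.01)
print(light_per_capita(mixed_unit, 1000))  # hypothetical population of 1000
print(share_lit([0, 5, 0, 12]))            # two of four villages are lit
```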

To use this dataset as panel data, one issue is the comparability of different satellites in measuring light intensity. Henderson et al. (2012) simply take the average if two satellites provide data for the same year, and control for year fixed effects in regression analysis to account for any differences across years. Alternatively, the following book chapter attempts to calibrate values from different satellites to account for inter-satellite differences and inter-annual sensor decay:
Elvidge, Christopher D., Feng-Chi Hsu, Kimberly E. Baugh and Tilottama Ghosh (2014). "National Trends in Satellite Observed Lighting: 1992-2012." Global Urban Monitoring and Assessment Through Earth Observation. Ed. Qihao Weng. CRC Press.
The calibrated version aggregated to the 0.5x0.5 degree cell level is available as part of the PRIO-GRID data.
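The simple averaging approach described above (taking the mean when two satellites cover the same year) could look roughly like this. The record layout and the light values are hypothetical; the year fixed effects belong to the regression stage and are not shown.

```python
from collections import defaultdict


def average_by_year(readings):
    """Average light values across satellites observed in the same year.

    `readings` is an iterable of (satellite, year, value) tuples for one
    spatial unit; this layout is an assumption for the sketch.
    """
    totals = defaultdict(lambda: [0.0, 0])  # year -> [sum, count]
    for _satellite, year, value in readings:
        totals[year][0] += value
        totals[year][1] += 1
    return {year: s / n for year, (s, n) in sorted(totals.items())}


# Made-up values: two satellites overlap in 1994, one covers 1995.
readings = [
    ("F10", 1994, 4.0),
    ("F12", 1994, 6.0),
    ("F12", 1995, 5.0),
]
print(average_by_year(readings))
```

Note that averaging does nothing to correct for sensor differences; it only collapses overlapping satellite-years into one observation per year.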

Saturday, January 23, 2016

Armed Conflict Location and Event Data (ACLED)

"The ACLED dataset codes exact locations, dates, and additional characteristics of individual battle events in states affected with civil war. There is a specific focus on tracking rebel activity and distinguishing between territorial transfers of military control from governments to rebel groups and vice versa, and the location of rebel group bases, headquarters, strongholds and presence. The dataset also records one-sided violence on civilians by both government or rebel actors and conflicts between rebel groups." (Cited from the above website)

Countries included in the dataset are all countries in Africa and India, Pakistan, Sri Lanka, Nepal, Bangladesh, Bhutan, Cambodia, Laos, Vietnam, Thailand and Myanmar. For other countries, the ACLED website provides the list of datasets (potentially) available.

The sample period starts from 1997.

Comparison to other cross-country conflict datasets can be found here.

Comparison to other conflict datasets for Nigeria and South Africa is available here.

Use in economics research

Armed Conflict Database

The data is freely available here along with the codebook etc.

For each conflict coded in the dataset, the list of scholarly references is now available here.

Miguel et al. (2004) popularized the use of this dataset among economists for measuring civil wars. See this paper for why this dataset may be better than the COW data.

Most recently used by Besley and Persson (2011).

A spin-off dataset is the Conflict Site Dataset, which provides the centroid and the radius of the zone of each conflict recorded in the Armed Conflict Database. A grid-cell (0.5 x 0.5 arc-degree) version is available as part of the PRIO-GRID data. Used by Campante, Do, and Guimaraes (2014) to show that conflicts are more likely to occur in areas closer to the capital city. While the data quality is likely to be worse than that of ACLED, it is useful for research on countries not covered by ACLED.
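Because each conflict zone is described by a centroid and a radius, checking whether a location falls inside a zone reduces to a great-circle distance comparison. A minimal sketch, assuming the radius is expressed in kilometres and using hypothetical function names and coordinates:

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, earth_radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def in_conflict_zone(lat, lon, centre_lat, centre_lon, zone_radius_km):
    """True if the point lies within the circular zone around the centroid."""
    return haversine_km(lat, lon, centre_lat, centre_lon) <= zone_radius_km


# Hypothetical zone centred at (0, 0) with a 150 km radius.
print(in_conflict_zone(0.0, 1.0, 0.0, 0.0, 150.0))  # about 111 km away
print(in_conflict_zone(0.0, 2.0, 0.0, 0.0, 150.0))  # about 222 km away
```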


MIRCA2000 is a spatial dataset on the crop types grown around the year 2000, with the first and last months of the growing season, at 5 arc-minute cells across the globe.
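To link point locations to a 5 arc-minute grid like the one mentioned above, one needs a mapping from latitude and longitude to a cell index. The sketch below assumes rows are counted from 90°N and columns from 180°W; this indexing convention and the function name are assumptions, so check the dataset's own documentation before relying on it.

```python
def grid_cell(lat, lon, cell_arcmin=5.0):
    """Return (row, col, n_rows, n_cols) for a point on a regular global grid.

    Assumed convention: row 0 starts at 90N, column 0 starts at 180W.
    """
    cell_deg = cell_arcmin / 60.0
    n_rows = round(180.0 / cell_deg)
    n_cols = round(360.0 / cell_deg)
    # Clamp so that points exactly on the southern or eastern edge
    # fall into the last row or column.
    row = min(int((90.0 - lat) / cell_deg), n_rows - 1)
    col = min(int((lon + 180.0) / cell_deg), n_cols - 1)
    return row, col, n_rows, n_cols


print(grid_cell(90.0, -180.0))  # north-west corner of the grid
print(grid_cell(0.01, 0.01))    # a point just north-east of (0, 0)
```

At 5 arc-minutes the global grid has 2160 rows and 4320 columns; passing `cell_arcmin=30.0` gives the 0.5 x 0.5 degree layout instead.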

Cross-country panel data on revolutions

The best source appears to be Arthur Banks's Cross-National Time Series Data Archive, which includes not only successful revolutions but also attempted ones.

An alternative source, which includes only successful revolutions for the period 1972-1998, is Goldstone, Jack. 1998. The Encyclopedia of Political Revolutions. Washington, DC: Congressional Quarterly. Used by Albertus and Menaldo (2014).