Monday, August 15, 2016

DMSP-OLS Nighttime Lights Time Series Version 4

Downloadable here. The spatial resolution is 30x30 arc-second (about 1x1 km) across the globe between 75 degrees north and 65 degrees south. Available annually since 1992 (and up to 2013, as of August 2016). The nighttime light intensity in each cell is represented by the "digital number", an integer from 0 to 63.
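The "about 1x1 km" figure holds only near the equator, since a cell spanning a fixed number of arc-seconds gets narrower toward the poles. Below is a rough sketch of the cell dimensions at a given latitude, assuming a spherical Earth with about 111.32 km per degree. That constant and the function name are my assumptions, not part of the dataset documentation, and the grid dimensions are simply those implied by the stated coverage, so the distributed files may differ slightly.

```python
import math

ARC_SECONDS = 30
KM_PER_DEGREE = 111.32  # assumption: approximate length of one degree on a spherical Earth


def cell_size_km(latitude_deg):
    """Return (width_km, height_km) of a 30x30 arc-second cell at a given latitude."""
    side_deg = ARC_SECONDS / 3600
    height_km = side_deg * KM_PER_DEGREE
    # East-west width shrinks with the cosine of latitude.
    width_km = height_km * math.cos(math.radians(latitude_deg))
    return width_km, height_km


# Grid dimensions implied by the stated coverage (75N to 65S, all longitudes).
n_rows = (75 + 65) * 3600 // ARC_SECONDS
n_cols = 360 * 3600 // ARC_SECONDS
```

This is why land area, rather than the pixel count, is the natural weight when averaging over pixels at different latitudes.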

For a quick summary of the dataset, see Section I of Henderson et al. (2012). For a detailed discussion of the data, see Doll (2008).

The data is becoming popular among economists.

Henderson et al. (2012) and Pinkovskiy and Sala-i-Martin (2016) use nighttime light to improve the data on national accounts GDP.

Michalopoulos and Papaioannou (2013, 2014) and Alesina et al. (2016) use nighttime light as a measure of living standards across African ethnic groups.

Hodler and Raschky (2014) exploit the annual panel nature of the data to find that the birth place of a new national leader becomes brighter after he assumes power.

Baskaran et al. (2015) relate nighttime light to electoral cycles in India.

Storeygard (2016) uses light as a measure of city-level income across cities in Africa.

Bleakley and Lin (2012) use nighttime light as a measure of the spatial distribution of contemporary economic activity, to see whether portage sites still predict where economic activity is concentrated today, long after their original advantage became obsolete.

Data construction

To understand how this dataset is constructed from the original satellite images, and the potential data issues, see Elvidge et al. (2001) and Elvidge et al. (2010). Noor et al. (2008) is also useful for understanding the data. See also Alexei Abrahams's guest post for Development Impact Blog.

Data issues

Digital number: The digital number is "not exactly proportional to the physical amount of light received (called true radiance)," as Henderson et al. (2012, p. 999) put it.

Top-coding: The maximum value of light intensity is 63. This issue shouldn't matter much for poor and middle-income countries. Henderson et al. (2012) remove Singapore and Bahrain from their cross-country analysis because of this concern (see footnote 16).
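One quick way to gauge whether top-coding matters for a given sample is to compute the share of pixels sitting at the maximum value. A minimal sketch, assuming the digital numbers for the area of interest have already been read into a plain list (the function name is hypothetical):

```python
TOP_CODE = 63  # maximum digital number in the dataset


def top_coded_share(digital_numbers):
    """Return the share of pixels whose digital number is at the top code."""
    values = list(digital_numbers)
    if not values:
        raise ValueError("no pixels supplied")
    return sum(1 for dn in values if dn >= TOP_CODE) / len(values)
```

A share close to zero suggests the issue can be ignored for that unit. A large share, as one would expect in a dense city-state, suggests it cannot.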

Bottom-censoring: Henderson et al. (2012) note that there are "remarkably few pixels with digital numbers of 1 or 2" (p. 1000). Storeygard (2016) describes how the data processing algorithm causes bottom-censoring (see Appendix section A.8).

Compatibility across years and satellites: Satellite sensors age over time and are replaced periodically. Thus, the same digital number does not necessarily mean the same level of light intensity across years and satellites. Henderson et al. (2012) deal with this concern by controlling for year fixed effects in a regression of log GDP on log light per area.
  • Alternatively, the following book chapter attempts to calibrate values from different satellites to account for inter-satellite differences and inter-annual sensor decay:
    • Elvidge, Christopher D., Feng-Chi Hsu, Kimberly E. Baugh and Tilottama Ghosh (2014). "National Trends in Satellite Observed Lighting: 1992-2012." Global Urban Monitoring and Assessment Through Earth Observation. Ed. Qihao Weng. CRC Press. (The working paper version is available here.)
    • The calibrated version aggregated to the 0.5x0.5 degree cell level is available as part of the PRIO-GRID data.
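To see why year fixed effects help with the compatibility problem, note that including year dummies in a regression is equivalent to subtracting the year-specific mean from each variable first. Any shift in sensor sensitivity that affects all observations in a year by the same amount (in logs) drops out. Below is a minimal sketch of that idea, not Henderson et al.'s actual specification. The field names `year`, `log_gdp`, and `log_light` are hypothetical.

```python
from collections import defaultdict


def year_demean(rows, key):
    """Subtract the year-specific mean of rows[i][key] from each observation."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["year"]] += row[key]
        counts[row["year"]] += 1
    return [row[key] - totals[row["year"]] / counts[row["year"]] for row in rows]


def slope_with_year_fe(rows):
    """OLS slope of log GDP on log light after removing year means."""
    y = year_demean(rows, "log_gdp")
    x = year_demean(rows, "log_light")
    sxx = sum(v * v for v in x)
    if sxx == 0:
        raise ValueError("no within-year variation in log light")
    return sum(a * b for a, b in zip(x, y)) / sxx


# Toy data: the 1993 sensor reads every unit one log point brighter.
rows = [
    {"year": 1992, "log_light": 1.0, "log_gdp": 5.0},
    {"year": 1992, "log_light": 2.0, "log_gdp": 5.5},
    {"year": 1993, "log_light": 2.0, "log_gdp": 5.0},
    {"year": 1993, "log_light": 3.0, "log_gdp": 5.5},
]
slope = slope_with_year_fe(rows)
```

In the toy data the estimated slope is unaffected by the uniform brightening in 1993, because the shift is absorbed by the year means.
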
Gas flare: The digital number picks up gas flares caused by oil production. Henderson et al. (2012) drop Equatorial Guinea from their cross-country analysis for this reason (footnote 16). In one of their robustness checks, Henderson et al. (2012) also drop pixels within gas flare polygons, as does Storeygard (2016).

Blooming: Light tends to be magnified over certain terrain types such as water and snow cover.

Blurring: A single point source of light would be recorded in several neighbouring cells due to the way the satellite sensor captures the light emission. See Alexei Abrahams's guest post for Development Impact Blog for more detail.

High latitude locations: Due to long daytime length, nighttime light cannot be observed in summer for high latitude locations (the raw satellite images are taken between 8:30 and 10:00 pm local time). For this reason, Henderson et al. (2012) exclude observations north of the Arctic Circle.


Validation as a measure of income/wealth

Logarithm of light intensity per area (and its long-run change over the 15-year period) is known to be linearly correlated with
Logarithm of light intensity per capita is known to be linearly correlated with
Pinkovskiy and Sala-i-Martin (2016) (p. 609) calibrate the exponent on the digital number to match the average income of the states in Mexico (obtained from Luxembourg Income Study). They note (fn. 20), "We allow the calibrated exponent to differ across years, but in no year is it smaller than 5/2, and in some years it is as large as 9. Therefore, it is likely that the specification that is prevalent in the literature (setting the exponent equal to unity) is incorrect."
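To illustrate why the exponent matters, the sketch below sums digital numbers raised to a chosen power. This is only an illustration of the role of the exponent under my own assumptions, not Pinkovskiy and Sala-i-Martin's calibration procedure, and the function name is hypothetical.

```python
def aggregate_light(digital_numbers, exponent=1.0):
    """Sum of digital numbers, each raised to the given exponent."""
    return sum(dn ** exponent for dn in digital_numbers)


dispersed = [10, 10, 10, 10]    # four dim pixels
concentrated = [40, 0, 0, 0]    # one bright pixel, same total digital number

same_at_unity = aggregate_light(dispersed) == aggregate_light(concentrated)
brighter_when_convex = (
    aggregate_light(concentrated, 2.5) > aggregate_light(dispersed, 2.5)
)
```

With the exponent set to unity the two areas look identical. With an exponent above one, the area whose light is concentrated in a bright pixel receives a higher aggregate value.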

Validation as a measure of public goods provision

Michalopoulos and Papaioannou (2014) show that the logarithm of light intensity per area is correlated with access to electrification, presence of a sewage system, access to piped water, and education (averaged across households in each enumeration area) from Afrobarometer Surveys in 17 African countries.

Min et al. (2013) validate this measure against survey-based measures of electricity access in rural Senegal and Mali in 2011. Their conclusions (quoted from Min and Gaba 2014, p. 9512) are:
  • Electrified villages are consistently brighter than unelectrified villages across a variety of nighttime satellite images.
  • Electrified villages appear brighter in satellite imagery because of the presence of streetlights, and brightness increases with the number of streetlights.
  • The correlation between light output recorded by the satellite and household electricity use and access is low.
Min and Gaba (2014) conduct the same validation exercise for villages in Vietnam in 2013. They reach the same conclusions except for the last point: in Vietnam, household-level access to electricity is also correlated with nighttime light satellite images.

See also Chen and Nordhaus (2011).


Aggregation methods

The raw data takes integer values from 0 to 63 in each 30x30 arc-second cell. There are several ways to aggregate it for use in regression analysis.
  • Henderson et al. (2012) (see footnote 7) obtain the weighted average across pixels within a country, where the weight is the land area of each 30x30 arc-second pixel, obtained from CIESIN/IFPRI/CIAT (2004).
  • Michalopoulos and Papaioannou (2013, 2014) and Hodler and Raschky (2014) use the logarithm of light intensity per area within each spatial unit of analysis.
    • Logarithmic transformation is used because the distribution of nighttime light intensity is right-skewed with around 10% of observations being zero.
    • 0.01 is added to the average before taking the log, so that the roughly 10% of observations without light are retained.
  • Alesina et al. (2016) and Baskaran et al. (2015) use the average or sum of light values from all pixels within each spatial unit of analysis divided by population.
  • Baskaran et al. (2015) also measure the proportion of villages with a positive value of nighttime light at the village centroid.
  • Storeygard (2016) measures the city-level light intensity as follows: first convert the original data "into one binary grid encoding whether a pixel was lit in at least one satellite-year. These ever-lit areas were then converted to polygons; contiguous ever-lit pixels were aggregated, and their DNs were summed within each satellite-year." (p. 1268)
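The aggregation methods above can be sketched on toy data as follows. These are minimal illustrations under my own assumptions, not the authors' code. All function names are hypothetical, the satellite-year labels are only examples, and "contiguous" is taken to mean pixels sharing an edge, which the quoted passage does not specify.

```python
import math


def area_weighted_mean(digital_numbers, land_areas):
    """Average DN across pixels, weighting each pixel by its land area."""
    total_area = sum(land_areas)
    if total_area <= 0:
        raise ValueError("total land area must be positive")
    return sum(dn * a for dn, a in zip(digital_numbers, land_areas)) / total_area


def log_light(mean_light, offset=0.01):
    """Log of light per area; the offset keeps unlit units in the sample."""
    return math.log(mean_light + offset)


def lit_share(centroid_values):
    """Proportion of units (e.g. villages) with a positive DN at the centroid."""
    values = list(centroid_values)
    return sum(1 for v in values if v > 0) / len(values)


def ever_lit_cluster_sums(grids):
    """Sum DNs per satellite-year within clusters of contiguous ever-lit pixels.

    grids maps a satellite-year label to a 2D list of DNs (same shape for all).
    Returns one {label: DN sum} dict per cluster, in row-major scan order.
    """
    layers = list(grids.values())
    n_rows, n_cols = len(layers[0]), len(layers[0][0])
    # Binary grid: was the pixel lit in at least one satellite-year?
    ever_lit = [[any(g[r][c] > 0 for g in layers) for c in range(n_cols)]
                for r in range(n_rows)]
    seen = set()
    clusters = []
    for r in range(n_rows):
        for c in range(n_cols):
            if not ever_lit[r][c] or (r, c) in seen:
                continue
            # Flood fill over edge-sharing neighbours (an assumption).
            stack, cells = [(r, c)], []
            seen.add((r, c))
            while stack:
                i, j = stack.pop()
                cells.append((i, j))
                for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if (0 <= ni < n_rows and 0 <= nj < n_cols
                            and ever_lit[ni][nj] and (ni, nj) not in seen):
                        seen.add((ni, nj))
                        stack.append((ni, nj))
            clusters.append({label: sum(g[i][j] for i, j in cells)
                             for label, g in grids.items()})
    return clusters


toy_grids = {
    "F10_1992": [[5, 0, 0], [0, 0, 7]],
    "F10_1993": [[0, 3, 0], [0, 0, 9]],
}
cluster_sums = ever_lit_cluster_sums(toy_grids)
```

In the toy grids, the two top-left pixels form one ever-lit cluster even though each is lit in only one year, which is the point of building the clusters from the ever-lit grid rather than year by year.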

2 comments:

Anonymous said...

See also Weidmann and Schutte 2016: http://jpr.sagepub.com/content/early/2016/05/05/0022343316630359.full.pdf

Yongwei Nian said...

Is there any dataset about nighttime light that has a finer resolution (such as 250m*250m) than DMSP/OLS? I want to use night light at the firm-level. Many thanks!