Poverty from Space: Using High Resolution Satellite Imagery for Estimating Economic Well-being

Ryan Engstrom1, Jonathan S Hersh2, David Newhouse3

1George Washington University, 2Chapman University, 3World Bank, Poverty Global Practice

Can features extracted from high spatial resolution satellite imagery accurately estimate poverty and economic well-being? We investigate this question by extracting both object and texture features from satellite images of Sri Lanka, which are used to estimate poverty rates and average log consumption for 1,291 administrative units (Grama Niladhari (GN) Divisions). Features extracted include the number and density of buildings, the prevalence of building shadows (a proxy for building height), the number of cars, the density and length of roads, type of agriculture, roof material, and a suite of texture and spectral features calculated using a non-overlapping box approach. A simple linear regression model, using only these inputs as explanatory variables, explains nearly sixty percent of both poverty headcount rates and average log consumption. Estimates remain accurate throughout the GN average consumption distribution. Two sample applications, extrapolating predictions into adjacent areas and estimating local area poverty using an artificially reduced census, confirm the out-of-sample predictive capabilities.

poverty estimation | machine learning | remote sensing | economic development | satellite imagery

1. INTRODUCTION

Despite the best efforts of national statistics offices and the international development community, local area estimates of poverty and economic welfare remain rare. Between 2002 and 2011, as many as 57 countries conducted zero or only one survey capable of producing poverty statistics, and data are scarcest in the poorest countries [1]. But even in countries where data are collected regularly, household surveys are typically too small to produce reliable estimates below the district level. Generating welfare estimates for smaller areas requires both a household welfare survey and contemporaneous census data, and the latter is typically available once per decade at best. Furthermore, safety concerns may prohibit survey data collection in many conflict areas altogether.

This paper investigates the ability of object and texture features derived from HSRI (High Spatial Resolution Imagery) to estimate and predict poverty rates and consumption at local levels. The area of our study covers 3,500 square kilometers in Sri Lanka, containing 1,291 administrative units (Grama Niladhari (GN) divisions), which are on average 2.15 sq. km each. For each GN, we extract object, spectral, and texture features to use as explanatory variables in poverty prediction models. Object features extracted include the number of cars, the number and size of buildings, the type of agriculture (plantation vs. paddy), the type of roof, the share of shadow pixels (a proxy for building height), road extent, and road material. These features are identified using a combination of deep learning based Convolutional Neural Networks (CNN) and eCognition, object based image processing software. Texture features that characterize the spatial variability in an area or neighborhood within an image were also calculated. These satellite derived features were then matched to household estimates of per capita consumption imputed into the 2011 census for the 1,291 areas.

We investigate four main questions: 1) To what extent can variation in GN economic well-being (poverty rates defined at the 10th and 40th percentiles of national income, and average village consumption) be explained by high spatial-resolution features? 2) Which features are most strongly associated with these measures of well-being? 3) Can fitted models predict into geographically adjacent areas out of sample? and 4) Are predictions robust to the use of smaller training samples? We find that: i) satellite features are predictive of economic well-being and explain about sixty percent of the variation in both GN average consumption and estimated poverty headcount rates; ii) measures of built-up area and roof type strongly correlate with welfare; iii) predicting into adjacent areas produces less accurate poverty measures, but the rank agreement between true and predicted rates is moderately high; and iv) using a one percent sample of the census based "ground truth", designed to mimic the sampling strategy of the Household Income and Expenditure Survey, has little impact on the accuracy of the prediction.

Daytime imagery has recently emerged as a practical source of information on economic well-being [2]. Advances in deep learning such as Convolutional Neural Networks (CNN) have the capability to algorithmically classify objects such as cars, building area, roads, crops, and roof type [3]. These objects may be more strongly correlated with local income and wealth than Night Time Lights (NTL) [4]. Furthermore, textural and spectral algorithms provide spatial context [5-6] that may be relevant for poverty estimation. In textural and spectral algorithms, the spatial and spectral variations in imagery are calculated over a neighborhood or non-overlapping block of pixels to characterize the local spatial pattern of the objects observed in the imagery. Previous researchers [7] have employed a transfer learning approach to estimate poverty, in which a set of 4,096 unstructured features are extracted from the penultimate layer of a Convolutional Neural Network that uses Google Earth daytime imagery to predict the luminosity of NTL. The resulting model predicts well and explains an average of 46 percent of the variation in village per capita consumption, out of sample, across the four countries in which it was trained.

Significance

Estimates of local area poverty remain rare in the developing world. Daytime satellite imagery holds promise for filling data gaps of economic well-being. Using a training site in Sri Lanka, we extract objects (cars, roof type, roads) and textures from satellite images and use these to build models of poverty and income. We find that these models explain 60 percent or more of the variation in poverty or income. The poverty estimates generated by our method are accurate for the poorest villages.


Fig. 1. Example developed area (buildings) classification. (A) Raw satellite imagery. (B) Classified image showing the developed area building classifier. Areas in green are true positive building classifications; areas in red are false positives.

Fig. 2. Example car classification. Cars classified by the CNN are shown circled in blue.

While this innovative use of daytime imagery substantially improves on the use of NTL alone, it is not necessarily optimal for predicting poverty rates. When the top two quintiles are excluded from their sample, restricting the sample to those below twice the international poverty line, the R-squared falls precipitously, to about 0.12. This illustrates the challenges this method faces in distinguishing welfare among the poorest of the poor, who in the African context most likely live in places of relative darkness.

2. DATA

Our analysis is restricted to a sample area of approximately 3,500 square kilometers in Sri Lanka (see figure S5). National coverage was not feasible due to the high cost and partial availability of high-resolution imagery; however, these data are rapidly becoming more available and less expensive as companies such as Planet and DigitalGlobe expand their archives. We sampled DS Divisions conditional on HSRI being available, drawing areas from the urban, rural, and estate sectors.[1] According to the 2012 census, population by sector in Sri Lanka is rural (77.4%), urban (18.2%), and estate (4.4%) [8]. Population by sector in our sample is rural (45.9%), urban (46.2%), and estate (7.8%).

2.1.1 Details on Satellite Imagery

The satellite imagery consists of 55 unique "scenes" purchased from DigitalGlobe, covering areas specified in our sample area. Each scene is an individual image captured by a particular sensor at a particular time. Images were acquired by three different sensors: WorldView-2, GeoEye-1, and QuickBird-2. These sensors have a spatial resolution of 0.46 m, 0.41 m, and 0.61 m, respectively, in the panchromatic band and 1.84 m, 1.65 m, and 2.4 m, respectively, in the multi-spectral bands. Pre-processing of imagery included pan-sharpening, ortho-rectification, and image mosaicking. Most imagery was captured in either 2011 or 2012, although some imagery from 2010 was also used.

2.1.2 Details on Local Area Poverty Training Data

Ideally, village poverty and consumption statistics would be generated directly from the 2012/13 Household Income and Expenditure Survey (HIES), a detailed survey that measures the consumption patterns of 25,000 households on approximately 400 consumption items. The survey contains an average of 8.4 households per GN Division in the 47 sampled DS Divisions, making the HIES insufficient to generate consistent poverty estimates at the GN Division level without supplementary data. We therefore draw on the most common method to impute welfare estimates [9] into the 2011 Census of Population and Housing, which is identical to the method used to generate official poverty estimates at the DS Division level [10][2]. For each household in the census, per capita consumption was estimated based on models developed from the HIES, using household indicators that are common to both the Census and the HIES.[3] We derive GN headcount poverty rates using the standard Foster-Greer-Thorbecke method [11] for two poverty lines: poverty line 1 at the 10th percentile of the national per capita consumption distribution, and poverty line 2 at the 40th percentile. These are equivalent to $3.00 and $5.13 per day, respectively, in 2011 PPP terms, which compares to an extreme poverty line in 2011 prices of $1.90 per day.
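To make the welfare measures concrete, the sketch below shows how GN-level headcount rates at the two relative poverty lines could be computed from imputed per capita consumption. It is a minimal illustration, not the authors' estimation code; the DataFrame and column names (census, gn_division, pc_consumption) are hypothetical.

```python
import numpy as np
import pandas as pd

def headcount_rate(consumption: pd.Series, poverty_line: float) -> float:
    """Foster-Greer-Thorbecke measure with alpha = 0: share of people below the line."""
    return (consumption < poverty_line).mean()

# Hypothetical census extract with imputed per capita consumption (2011 PPP $ per day).
census = pd.DataFrame({
    "gn_division": ["GN-001", "GN-001", "GN-002", "GN-002", "GN-002"],
    "pc_consumption": [2.8, 6.1, 4.2, 3.1, 9.5],
})

# Poverty lines at the 10th and 40th percentiles of the national distribution
# (roughly $3.00 and $5.13 per day in the paper).
line_1 = census["pc_consumption"].quantile(0.10)
line_2 = census["pc_consumption"].quantile(0.40)

gn_stats = census.groupby("gn_division")["pc_consumption"].agg(
    poverty_rate_1=lambda c: headcount_rate(c, line_1),
    poverty_rate_2=lambda c: headcount_rate(c, line_2),
    avg_log_consumption=lambda c: np.log(c).mean(),
)
print(gn_stats)
```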

2.2 Feature Extraction

The derived high spatial resolution features fall into seven broad categories: (1) agricultural land; (2) cars; (3) building density and vegetation; (4) shadows; (5) roads and transportation; (6) roof type; and (7) textural and spectral characteristics. In addition to the satellite features, we use two geographic attributes of the GN Division: whether it is administratively classified as an urban area, and its area in square kilometers.

[1] Sri Lanka classifies sectors as urban, rural, or estate. The estate sector refers to plantation areas of more than 20 acres with 10 or more residential laborers. Except for sample stratification, the estate sector is grouped with the rural sector.
[2] The term welfare is used interchangeably with per capita household consumption.
[3] Consumption aggregates have been spatially deflated using a district level food price index constructed from unit values in the HIES survey by the Department of Census and Statistics.


Table 1. Prediction of local area poverty rates using high resolution satellite features. Dependent variable is the poverty rate at poverty line 1, defined at the 10th percentile of national income. t-statistics in brackets.

Variable | Coef | t-statistic
log Area (square meters) | 0.020* | [2.52]
= 1 if urban | -0.023 | [-1.80]
% of GN area that is agriculture | -0.00025 | [-1.04]
% of GN agriculture that is paddy | -0.00033** | [-2.97]
% of GN agriculture that is plantation | -0.00021** | [-2.84]
% of total GN area that is paddy | -0.00019 | [-0.58]
Total cars divided by total road length | -0.31 | [-1.17]
Total cars divided by total GN area | 29.6 | [0.54]
log number of cars | -0.0059 | [-0.89]
log sum of length of roads | -0.020*** | [-3.64]
Fraction of roads paved | -0.00035*** | [-4.24]
ln length airport roads | -0.0051 | [-1.45]
ln length railroads | 0.00098 | [1.31]
% of area with buildings | -0.0027* | [-2.31]
log of total count of buildings in GN | -0.0090** | [-2.71]
Vegetation Index (NDVI), scale 64 | 0.061* | [2.20]
Vegetation Index (NDVI), scale 8 | -0.064** | [-2.80]
% shadows (building height) | 0.0022* | [2.04]
ln shadow pixels (building height) | 0.016* | [2.51]
Fraction of total roofs that are clay | 0.00077** | [3.35]
Fraction of total roofs that are aluminum | 0.00091*** | [3.63]
Fraction of total roofs that are asbestos | -0.00033 | [-1.08]
Linear Binary Pattern Moments (scale 32m), mean | 0.0021** | [2.91]
Line support regions (scale 8m), mean | -0.66 | [-0.87]
Gabor filter (scale 64m), mean | -0.052 | [-1.53]
Fourier transform, mean | 0.0017** | [3.42]
SURF (scale 16m), mean | -0.0014 | [-0.94]
Constant | -0.32** | [-3.03]

Observations 1291; R-sq 0.610; R-sq Adj. 0.602; Out-of-Sample R-sq 0.588.

The agricultural land indicators are coarse, and consist only of the fraction of GN agriculture identified as paddy (rice cultivation) or plantation (cash crops such as tea). These sum to one hundred percent for GNs with agricultural land, so the excluded category in subsequent regressions is GN Divisions with no agricultural land. We also calculated the fraction of total GN area that is either paddy, plantation, or any agriculture. Three car related variables were calculated: the log total number of cars in a GN, total cars divided by total road length, and cars per square kilometer of the GN. Building density variables include the fraction of an area covered by built-up area and the number of roofs identified. Built-up area captures any human settlements (buildings, homes, etc.) regardless of use or condition. These are grouped with two measures of the normalized difference vegetation index (NDVI), a measure of vegetation greenness [12] that indicates a lack of building presence in urban areas. The fifth category comprises two indicators that capture shadows: the log of the number of pixels classified as shadow and the fraction of shadows covering a valid area in a GN.

2.2.1 Object Classification Details

Object features were classified with the assistance of two technical partners: Orbital Insight and LandInfo. Deep learning based object classification was used to classify the share of the GN division that is built-up (i.e., consists of buildings), the number of cars in the GN, the share of pixels in the GN identified as shadow pixels (the building height proxy), and crop type. The classification method used is similar to Krizhevsky, Sutskever, and Hinton (2012) [3], which utilizes convolutional neural networks (CNN) to build object predictions from raw imagery. Roof type, paved and unpaved roads of different widths, and railroads were classified using a combination of Trimble eCognition and Erdas Imagine software, utilizing a combination of support vector machines and visual identification. The CNN classification algorithm involved four steps:

1. Ingestion/tiling
2. Model development
3. Classifying all pixels using the trained model
4. Aggregating prediction results to the GN division level

The tiling stage splits the large images into many small images, or tiles, to make the modeling computationally scalable, as each tile can be distributed to a different GPU core for greater efficiency. In the model development stage, the classification model was trained and tuned. Model building began by manually classifying, or labeling, a sub-sample of the imagery as a positive or negative value for a given object using a crowdsourced campaign. The labeled data were split into an 80% training set and a 20% testing set, where the training set was used to build the model. This allowed out-of-sample prediction metrics to be calculated using the withheld test set. Training was run for 60,000 iterations using the Nesterov solver method, a variant of stochastic gradient descent.

Figure 1 shows an example of a developed area building classification, with the raw image shown at the top and the CNN classification accuracy shown below. On the bottom panel, true positives are highlighted in green and false positives in red.
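The sketch below illustrates these steps in PyTorch on synthetic data: tiling a scene into small chips, an 80/20 train/test split, and training a small classifier with Nesterov-momentum SGD. The tile size, network architecture, and hyperparameters are placeholders rather than the configuration used by the technical partners.

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

def tile_scene(scene: np.ndarray, tile: int = 64) -> np.ndarray:
    """Split a (bands, H, W) scene into non-overlapping (bands, tile, tile) chips."""
    b, h, w = scene.shape
    scene = scene[:, : h - h % tile, : w - w % tile]
    return (scene.reshape(b, h // tile, tile, w // tile, tile)
                 .transpose(1, 3, 0, 2, 4)        # (rows, cols, bands, tile, tile)
                 .reshape(-1, b, tile, tile))

# Toy 4-band scene; in practice these would be pan-sharpened satellite images.
scene = np.random.rand(4, 2048, 2048).astype("float32")
chips = torch.from_numpy(tile_scene(scene, tile=64))       # (1024, 4, 64, 64)
labels = torch.randint(0, 2, (len(chips),))                # 1 = building present (toy labels)
train_set, test_set = random_split(TensorDataset(chips, labels), [819, 205])  # ~80/20

model = nn.Sequential(                                     # small placeholder CNN
    nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(32, 2),
)
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, nesterov=True)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                                     # the paper reports 60,000 iterations
    for xb, yb in DataLoader(train_set, batch_size=32, shuffle=True):
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()

# Out-of-sample accuracy on the withheld 20% test split.
with torch.no_grad():
    xb, yb = next(iter(DataLoader(test_set, batch_size=len(test_set))))
    accuracy = (model(xb).argmax(dim=1) == yb).float().mean()
    print(f"test accuracy: {accuracy.item():.3f}")
```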

Fig. 3. Model diagnostic plot of predicted against true average GN consumption.

Figure 2 shows a sample car classification. Cars that are positively identified are shown circled in blue. False negatives are most prevalent where there is considerable tree masking of pixels. All classifiers achieve an accuracy of greater than 90%. The example ROC curve for buildings is shown in supplemental figure S6. Example road and roof classifications are shown in supplemental figures S7 and S8.

2.2.2 Textural and Spectral Extraction Details

Textural and spectral features were created using a non-overlapping block approach with block sizes of eight pixels and scales of 8, 16, 32, and 64 meters. This resulted in an output image comprised of 165 bands at a spatial resolution of 12.8 m or 16 m, depending on the native multispectral resolution of each sensor. Additional spectral features calculated were simply the means of the four individual bands: red, blue, green, and near-infrared. Once the spatial and spectral features were calculated, the mean, standard deviation, and sum were determined for each GN Division. The full list of textural and spectral features extracted includes: Histogram of oriented gradients (HOG) [13]; PanTex [14]; Line support regions (LSR) [15]; Local binary patterns moments (LBPM) [16]; Fourier transform (FT) [17]; Gabor [18]; Speeded Up Robust Features (SURF) [19]; and the normalized difference vegetation index [20]. An example PanTex classification is available in figure S9.
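As a rough illustration of the non-overlapping block approach, the snippet below computes block means of two bands and an NDVI-style spectral feature at several block sizes, then summarizes the resulting feature surface with the mean, standard deviation, and sum, as is done per GN Division. It is a simplified stand-in for the full HOG/PanTex/LSR/LBPM/FT/Gabor/SURF pipeline; the array shapes and scales are illustrative.

```python
import numpy as np

def block_reduce_mean(band: np.ndarray, block: int) -> np.ndarray:
    """Mean of each non-overlapping block x block window (a simple spectral feature)."""
    h, w = band.shape
    band = band[: h - h % block, : w - w % block]
    return band.reshape(h // block, block, w // block, block).mean(axis=(1, 3))

def ndvi(red: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Normalized difference vegetation index, a measure of vegetation greenness."""
    return (nir - red) / (nir + red + 1e-9)

# Toy multispectral scene: red and near-infrared bands.
rng = np.random.default_rng(0)
red, nir = rng.random((2, 512, 512))

# Block sizes in pixels stand in for the 8, 16, 32, and 64 m scales in the paper.
for scale_px in (8, 16, 32, 64):
    feature = ndvi(block_reduce_mean(red, scale_px), block_reduce_mean(nir, scale_px))
    # Per-GN summaries used as regression inputs: mean, standard deviation, sum.
    print(scale_px, feature.mean(), feature.std(), feature.sum())
```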

3. RESULTS

We estimated separate least squares linear models using as dependent variables average GN consumption and the headcount poverty rates defined at the two poverty lines. LASSO [21] was used to select variables in a first stage (over the full set of HRSF extracted variables), followed by estimating an OLS model over the set of non-zero coefficients, a procedure referred to as "Post-LASSO" [22]. Table 1 presents coefficient estimates for the model of poverty rates defined at poverty line 1. Many extracted satellite features have high explanatory power, including agriculture type, length of roads, fraction of roads paved, number and density of buildings, NDVI, roof type, shadows (the building height proxy), and two spatial features, LBPM and FT. Table S3 presents coefficients for all three models estimated with the three separate dependent variables.

The model's explanatory power is high, summarized by the in-sample R-squared of 0.610. Out-of-sample R-squared, estimated using ten-fold cross-validation [23], is 0.588, indicating the model is not likely to be overfit. In other words, a simple linear model that includes only the geographic size of the GN Division, whether it is urban, and remotely sensed information explains 61 percent of the variation across GNs in headcount poverty rates.
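A minimal sketch of the Post-LASSO procedure and the ten-fold cross-validation, using scikit-learn and statsmodels on synthetic data; the feature names and the data-generating process are placeholders, not the actual GN-level variables.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 1,291 GNs with 40 hypothetical satellite-derived features.
rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(1291, 40)),
                 columns=[f"feature_{i}" for i in range(40)])
y = 0.3 + 0.05 * X["feature_0"] - 0.04 * X["feature_1"] + rng.normal(0, 0.05, 1291)

# Stage 1: LASSO (with cross-validated penalty) selects a sparse set of predictors.
lasso = LassoCV(cv=10).fit(StandardScaler().fit_transform(X), y)
selected = X.columns[lasso.coef_ != 0]

# Stage 2 ("Post-LASSO"): re-estimate plain OLS on the selected columns only.
ols = sm.OLS(y, sm.add_constant(X[selected])).fit()
print(ols.params)

# Out-of-sample fit: ten-fold cross-validated R-squared of the selected model.
cv_r2 = cross_val_score(LinearRegression(), X[selected], y, cv=10, scoring="r2")
print("out-of-sample R-sq:", cv_r2.mean().round(3))
```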

Figure 3 plots predicted against true average GN consumption, with colors assigned by the province in which the GN is located. A LOWESS smoothing line is shown with its associated confidence interval. The model has a tendency to under-predict for GNs with very high average incomes. While there is noise, the predictions tend to straddle the 45° line, indicating a high degree of agreement between the predicted and true welfare values. Figure S10 presents a map of predicted and true values for a sample area.

3.1 Decomposition of Feature Explanatory Power

The results above do not distinguish the degree to which individual indicators improve the model's predictive power. To address this question, we use a Shapley decomposition [24] of the model's explanatory power. The Shapley procedure calculates the marginal R-squared of a set of explanatory variables as the amount by which R-squared declines when that set is removed from the full set of variables. The results (table S4) confirm that measures of building density (built-up area, number of buildings, shadow pixels, and, to a lesser extent, vegetation) are powerful contributors to predictive power. Collectively, these three sets of variables account for 37 to 40 percent of the model's explanatory power. NTL variables, including the average, its square and cube, and the average standard deviation of NTL, explain between 7 and 12 percent of the variance in per capita consumption or poverty, suggesting that HRSF capture approximately an additional 90 percent of the on-the-ground variation in poverty or income relative to NTL.
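The sketch below implements a Shapley (Shorrocks) decomposition of R-squared over groups of regressors: each group's contribution is its marginal change in R-squared averaged over all orderings of the groups [24]. The group definitions and data are illustrative, not the variable sets of table S4.

```python
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.linear_model import LinearRegression

def r2(X, y, cols):
    """In-sample R-squared of an OLS fit on the given columns (0.0 if no columns)."""
    if not cols:
        return 0.0
    return LinearRegression().fit(X[:, cols], y).score(X[:, cols], y)

def shapley_r2(X, y, groups):
    """Shapley/Shorrocks decomposition of R-squared across named column groups."""
    names = list(groups)
    n = len(names)
    contrib = dict.fromkeys(names, 0.0)
    for g in names:
        others = [o for o in names if o != g]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                base = [c for o in subset for c in groups[o]]
                gain = r2(X, y, base + groups[g]) - r2(X, y, base)
                contrib[g] += weight * gain
    return contrib

# Toy example with three hypothetical feature groups.
rng = np.random.default_rng(2)
X = rng.normal(size=(1291, 6))
y = 0.4 * X[:, 0] + 0.2 * X[:, 3] + rng.normal(size=1291)
groups = {"built_up": [0, 1], "roof_type": [2, 3], "ntl": [4, 5]}
print(shapley_r2(X, y, groups))   # contributions sum to the full-model R-squared
```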

3.2 Model Performance at Varying Income Levels

To verify that the model performs well for the very poor, we divide the sample of GN Divisions into quintiles based on the mean predicted per capita consumption of census households, and re-estimate the main model for log per capita consumption on the subsamples comprising the bottom 80, 60, 40, and 20 percent of the distribution. Model performance across these subsamples is shown in Table 2. Overall, the model continues to predict well within the poorest subsamples, as the adjusted R-squared declines only mildly from 0.60 in the full sample to 0.579 when only considering the bottom decile. Given that the poorest decile of GNs has an average welfare of $4.67 per day, a little more than double the international poverty line, this suggests that this approach for estimating welfare from high-resolution satellite images is accurate for fairly poor contexts.
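The subsample exercise behind Table 2 can be sketched as below: restrict the sample to GNs below a given quantile of consumption and re-estimate the model, tracking cross-validated fit. The data, feature names, and the use of cross-validated R-squared (rather than the paper's exact in-sample and out-of-sample statistics) are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic GN-level data: log per capita consumption and hypothetical features.
rng = np.random.default_rng(5)
features = pd.DataFrame(rng.normal(size=(1291, 5)),
                        columns=[f"feature_{i}" for i in range(5)])
log_pc_cons = 1.7 + 0.15 * features["feature_0"] + rng.normal(0, 0.2, 1291)

# Re-estimate the model on the bottom X percent of the consumption distribution.
for share in (0.2, 0.4, 0.6, 0.8, 1.0):
    subset = log_pc_cons <= log_pc_cons.quantile(share)
    r2 = cross_val_score(LinearRegression(), features[subset], log_pc_cons[subset],
                         cv=10, scoring="r2").mean()
    print(f"bottom {share:.0%}: out-of-sample R-sq = {r2:.3f}")
```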

4. APPLICATIONS

4.1 Estimating Poverty Using a Reduced Census

The standard small area estimation technique used to model poverty combines a smaller household survey with a population census. Can we combine satellite features with a smaller household survey alone to produce sufficiently precise small area estimates? To assess this, we examine whether the predictive power of satellite imagery remains when it is calibrated using a census extract of approximately the size of the Household Income and Expenditure Survey, rather than a full census.

We produce several simulations of the dependent variable (either per capita consumption or the GN poverty rate) using subsamples of the census intended to mimic the size of a household survey. For each subsample, only 20% of GNs are sampled. We then vary the number of households within each sampled GN that are "surveyed" (i.e., used to produce the training set's poverty statistic), sampling either 5% or 100% of the actual households in that GN. For each simulated sample, we build a model of poverty using HRSF, producing estimates of poverty that we can then compare to actual estimates.
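A stylized version of this simulation is sketched below: sample 20% of GNs, sample a fraction of households within them, recompute the training poverty rates, refit a linear HRSF model, and compare predictions with the census-based rates in and out of the subsample. The data, column names, and the single linear model are placeholders for the actual census, features, and specification.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

def simulate_once(census, features, true_rate, line, gn_frac=0.2, hh_frac=0.05, seed=0):
    """One draw of the reduced-census experiment: sample GNs, then households."""
    rng = np.random.default_rng(seed)
    sampled_gns = rng.choice(features.index.to_numpy(),
                             size=int(gn_frac * len(features)), replace=False)

    # Recompute the training poverty rate from the sampled households only.
    hh = census[census["gn"].isin(sampled_gns)].sample(frac=hh_frac, random_state=seed)
    train_rate = hh.groupby("gn")["pc_consumption"].apply(lambda c: (c < line).mean())

    model = LinearRegression().fit(features.loc[train_rate.index], train_rate)
    pred = pd.Series(model.predict(features), index=features.index)

    held_out = ~features.index.isin(sampled_gns)
    return (r2_score(true_rate[~held_out], pred[~held_out]),   # in-sample GNs
            r2_score(true_rate[held_out], pred[held_out]))     # out-of-sample GNs

# Toy inputs: 50 GNs with 40 census households each and 3 hypothetical features.
rng = np.random.default_rng(3)
features = pd.DataFrame(rng.normal(size=(50, 3)), columns=["built_up", "roads", "ndvi"],
                        index=[f"GN-{i:03d}" for i in range(50)])
gn_mean = np.exp(1.6 + 0.4 * features["built_up"])             # GN-level mean consumption
census = pd.DataFrame({
    "gn": np.repeat(features.index, 40),
    "pc_consumption": np.repeat(gn_mean.to_numpy(), 40) * rng.lognormal(0, 0.4, 50 * 40),
})
line = census["pc_consumption"].quantile(0.10)
true_rate = census.groupby("gn")["pc_consumption"].apply(
    lambda c: (c < line).mean()).reindex(features.index)

print(simulate_once(census, features, true_rate, line, hh_frac=0.05))
print(simulate_once(census, features, true_rate, line, hh_frac=1.00))
```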


Table 2. Performance of high resolution spatial feature models of poverty rates, at different quintiles of average GN Division income.

                              Bottom 10%  Bottom 20%  Bottom 40%  Bottom 60%  Bottom 80%
N                                    130         259         517         775        1033
R-sq                               0.616       0.541       0.450       0.475       0.516
Adjusted R-sq                      0.558       0.509       0.431       0.462       0.507
Out-of-sample R-sq                 0.579       0.473       0.406       0.453       0.483
Mean Absolute Error               0.0554      0.0648      0.0791      0.0901       0.115
GN mean welfare (2011 PPP $)       4.667       5.078       5.647       6.176       6.793
Max welfare (2011 PPP $)           5.282       5.747        6.66       7.806       10.04

Fig. 4. Model explanatory power with artificially reduced sample size. 20% of GNs sampled to estimate the model. Households sampled as shown.

The poverty rate of a GN in the training data becomes less precise the fewer households are sampled per GN, although survey costs increase with the number of households surveyed. Figure 4 plots the results of the simulation exercise: R-squared values between predicted and true welfare rates, both in-sample (GNs within the subsample) and out-of-sample (GNs excluded from the subsample), and mean absolute error. Average R-squared values between predicted and true values do not deteriorate significantly when using the sample consisting of 20% of GNs and 5% of households within those GNs.

4.2 Estimates of Poverty via Geographic Extrapolation

A major motivation for using satellite imagery is to extrapolate poverty estimates into areas where survey data on economic well-being do not exist. While most of the data deprivation that characterizes the developing world occurs at the country level, it is also common for surveys to omit selected regions due to political turmoil, violence, animosity towards the central government, or prohibitive expense. For example, from 2002 through 2009/10, Sri Lanka's HIES failed to cover certain districts in the North and East of the country due to civil conflict, and Pakistan's HIES excludes the Federally Administered Tribal Areas and Jammu and Kashmir.

To assess how well a model "travels" to a different geographic area, we fit a series of models, in each of which we exclude a single Divisional Secretariat (DS), a larger administrative area, and use the estimated model to predict into that excluded area. This is a form of "leave-one-out cross-validation" (LOOCV), a common method used to infer statistical out-of-sample performance [25]. We estimate both linear models and random forest models[4] to determine whether more flexible model specifications perform better out-of-sample.

Table S5 shows model performance when predicting into adjacent areas, comparing normalized root mean squared error, normalized mean absolute error, and the correlation between predicted and true welfare rates, using both random forest and linear models to fit the HRSF models. The adjacent prediction error rates are larger than when predicting randomly out of sample using cross-validation. Normalized error rates divide average error by the average value of welfare, so they can be interpreted as fractions of average welfare. Using linear models to estimate and predict into adjacent areas, mean absolute error is 2.5% of log household consumption, 45% of the average poverty rate at the lower poverty line, and 30% of the average poverty rate at the higher poverty line. The error rates are lower when using random forests: mean absolute error declines to 1.5% of log household consumption, 38% of the average poverty rate at the lower poverty line, and 25% at the higher poverty line. While these error rates imply that adjacent predictions will be too imprecise to produce welfare measures intended as official statistics, they may be sufficient for generating a rank ordering of villages by poverty or income. We conclude from these results that HRSF cannot yet be used to predict accurately into adjacent areas for official statistics, but the accuracy may be acceptable for targeting or other applications.
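The sketch below shows the leave-one-DS-out procedure with both a linear model and a random forest, reporting normalized mean absolute error (error divided by average welfare) and the correlation between predicted and true rates. The synthetic data, feature names, and forest settings are placeholders (the paper's footnote reports 1,000 trees).

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

def leave_one_ds_out(features, y, ds_labels, make_model):
    """Fit on all but one DS Division, predict into the held-out DS, repeat for each DS."""
    preds = pd.Series(index=y.index, dtype=float)
    for ds in ds_labels.unique():
        held_out = ds_labels == ds
        model = make_model().fit(features[~held_out], y[~held_out])
        preds[held_out] = model.predict(features[held_out])
    # Normalized MAE: average error as a fraction of average welfare.
    return mean_absolute_error(y, preds) / y.mean(), np.corrcoef(y, preds)[0, 1]

# Toy stand-in: 1,291 GNs in 47 DS Divisions with 5 hypothetical features.
rng = np.random.default_rng(4)
features = pd.DataFrame(rng.normal(size=(1291, 5)),
                        columns=["built_up", "roads", "roofs", "ndvi", "shadows"])
y = (0.2 + 0.05 * features["built_up"] + rng.normal(0, 0.05, 1291)).clip(lower=0)
ds_labels = pd.Series(rng.integers(0, 47, size=1291))

for name, make_model in [
    ("linear", LinearRegression),
    ("random forest", lambda: RandomForestRegressor(n_estimators=200, random_state=0)),
]:
    nmae, corr = leave_one_ds_out(features, y, ds_labels, make_model)
    print(f"{name}: normalized MAE = {nmae:.3f}, correlation = {corr:.3f}")
```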

5. DISCUSSION

Traditionally, given the prohibitive cost of conducting surveys sufficiently large to provide accurate statistics for small areas, generating small area poverty estimates requires pairing a welfare survey with a census or intercensal survey. Census and intercensal data are expensive to collect and therefore produced relatively infrequently. They are also usually disseminated with a lag, making it difficult to rapidly assess changes in local living standards. The results above show that indicators derived from high spatial resolution imagery, when paired with survey data, generate accurate predictions of local level poverty and welfare, and that by and large the conditional correlations are of sensible signs and magnitudes. Furthermore, predictions based on specific features accurately predict mean per capita consumption throughout the welfare distribution. While the welfare consequences of more frequent measures of poverty and inequality are unknown, they may be large given the many applications of frequent local measures of economic well-being, ranging from impact evaluation to budget allocation to social transfers.

These findings raise questions for further work, and contribute to an ongoing discussion regarding the use of predictive methods in public policy [26].

[4] For each random forest model we use 1000 decision trees, sampling 13 of the predictors with replacement.


The most immediate of these is whether satellite indicators can substitute for census data in different contexts and for different indicators. Second, it is important to better understand the extent to which these results generalize to different social and ecological environments, such as Africa, the Middle East, and other parts of Asia. There is no guarantee that the predictive power of building density, shadows, and other features documented above will hold in all environments.

A further line of research could explore whether changes in satellite imagery could be used to forecast changes in economic well-being across space and time. Poverty surveys are typically collected every three years and the most recent global estimates are produced with a three-year lag. Therefore, the ability to "nowcast" measures of economic well-being by combining frequently updated satellite imagery with the most recent survey-based measures of poverty has great potential. Additional research can shed light on the best way of predicting into adjacent areas not covered by surveys. Overall, the inevitable increase in the availability of imagery and feature identification algorithms, in conjunction with the encouraging results from this study, implies that satellite imagery will become an increasingly valuable tool to help governments and stakeholders better understand the spatial nature of poverty.

6. ACKNOWLEDGMENTS

This project benefited greatly from discussions with Susan Athey, Sarah Antos, Ana Areias, Marianne Baxter, Sam Bazzi, Azer Bestavros, Kristen Butcher, John Byers, Francisco Ferreira, Ray Fisman, Alex Guzey, Klaus-Peter Hellwig, Kristen Himelein, John Hoddinott, Tariq Khokhar, Trevor Monroe, Dilip Mookherjee, Pierre Perron, Bruno Sánchez-Andrade Nuño, Kiwako Sakamoto, David Shor, Benjamin Stewart, Andrew Whitby, Nat Wilcox, Nobuo Yoshida, and seminar participants at the Boston University Development Reading Group, Chapman University, The World Bank, and the Department of Census and Statistics of Sri Lanka. All remaining errors in this paper remain the sole responsibility of the authors. Sarah Antos, Benjamin Stewart, and Andrew Copenhaver provided assistance with texture feature classification. Object imagery classification was assisted by James Crawford, Jeff Stein, and Nitin Panjwani at Orbital Insight, and Nick Hubing, Jacqlyn Ducharme, and Chris Lowe at Land Info, who also oversaw imagery pre-processing. Hafiz Zainudeen helped validate roof classifications in Colombo. Colleen Ditmars and her team at DigitalGlobe facilitated imagery acquisition, Dung Doan and Dilhanie Deepawansa developed and shared the census-based poverty estimates, and we thank Dr. Amare Satharasinghe for authorizing the use of the Sri Lankan census data. Liang Xu provided research assistance. Zubair Bhatti, Benu Bidani, Christina Malmberg-Calvo, Adarsh Desai, Nelly Obias, Dhusynanth Raju, Martin Rama, and Ana Revenga provided additional support and encouragement. The authors gratefully acknowledge financial support from the Strategic Research Program and World Bank Big Data for Innovation Challenge Grant, and the Hariri Institute at Boston University. The views expressed here do not necessarily reflect the views of the World Bank Group or its executive board, and should not be interpreted as such.

7. REFERENCES

[1] Serajuddin, U., Uematsu, H., Wieser, C., Yoshida, N., & Dabalen, A. (2015). Data deprivation: another deprivation to end. World Bank Policy Research Working Paper, (7252).
[2] Donaldson, D., & Storeygard, A. (2016). The view from above: applications of satellite data in economics. Journal of Economic Perspectives, 30(4), 171-198.
[3] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
[4] Elvidge, C. D., Imhoff, M. L., Baugh, K. E., Hobson, V. R., Nelson, I., Safran, J., ... & Tuttle, B. T. (2001). Night-time lights of the world: 1994-1995. ISPRS Journal of Photogrammetry and Remote Sensing, 56(2), 81-99.
[5] Graesser, J., Cheriyadat, A., Vatsavai, R. R., Chandola, V., Long, J., & Bright, E. (2012). Image based characterization of formal and informal neighborhoods in an urban landscape. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 5(4), 1164-1176.
[6] Engstrom, R., Sandborn, A., Yu, Q., Burgdorfer, J., Stow, D., Weeks, J., & Graesser, J. (2015). Mapping slums using spatial features in Accra, Ghana. Joint Urban Remote Sensing Event Proceedings (JURSE), Lausanne, Switzerland, 10.1109/JURSE.2015.7120494.
[7] Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., & Ermon, S. (2016). Combining satellite imagery and machine learning to predict poverty. Science, 353(6301), 790-794.
[8] Sri Lanka Department of Census and Statistics. (2012). Census of population and housing. Available at: http://www.statistics.gov.lk/PopHouSat/CPH2011/Pages/Activities/Reports/SriLanka.pdf
[9] Elbers, C., Lanjouw, J. O., & Lanjouw, P. (2003). Micro-level estimation of poverty and inequality. Econometrica, 71(1), 355-364.
[10] Department of Census and Statistics and World Bank. (2015). The spatial distribution of poverty in Sri Lanka. Available at: http://www.statistics.gov.lk/poverty/SpatialDistributionOfPoverty2012_13.pdf
[11] Foster, J., Greer, J., & Thorbecke, E. (1984). A class of decomposable poverty measures. Econometrica, 52(3), 761-766.
[12] Rouse Jr, J. W., et al. (1974). Monitoring vegetation systems in the Great Plains with ERTS.
[13] Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition, 2005 (CVPR 2005), IEEE Computer Society Conference on (Vol. 1, pp. 886-893). IEEE.
[14] Pesaresi, M., Gerhardinger, A., & Kayitakire, F. (2008). A robust built-up area presence index by anisotropic rotation-invariant textural measure. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 1(3), 180-192.
[15] Yu, W. P., Chu, G. W., & Chung, M. J. (1999). A robust line extraction method by unsupervised line clustering. Pattern Recognition, 32(4), 529-546.
[16] He, D. C., & Wang, L. (1991). Texture features based on texture spectrum. Pattern Recognition, 24(5), 391-399.
[17] Smith, S. W. (1997). The scientist and engineer's guide to digital signal processing.
[18] Gabor, D. (1946). Theory of communication. Part 1: The analysis of information. Journal of the Institution of Electrical Engineers - Part III: Radio and Communication Engineering, 93(26), 429-441.
[19] Bay, H., Tuytelaars, T., & Van Gool, L. (2006). SURF: Speeded up robust features. Computer Vision - ECCV 2006, 404-417.
[20] Tucker, C. J., Elgin Jr, J. H., McMurtrey III, J. E., & Fan, C. J. (1979). Monitoring corn and soybean crop development with hand-held radiometer spectral data. Remote Sensing of Environment, 8(3), 237-248.
[21] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 267-288.
[22] Belloni, A., & Chernozhukov, V. (2013). Least squares after model selection in high-dimensional sparse models. Bernoulli, 19(2).


[23] Efron, B., & Gong, G. (1983). A leisurely look at the bootstrap, the jackknife, and cross-validation. The American Statistician, 37(1), 36-48.
[24] Shorrocks, A. F. (2013). Decomposition procedures for distributional analysis: a unified framework based on the Shapley value. Journal of Economic Inequality, 1-28.
[25] Gentle, J. E., Härdle, W. K., & Mori, Y. (Eds.). (2012). Handbook of computational statistics: concepts and methods. Springer Science & Business Media.
[26] Athey, S. (2017). Beyond prediction: Using big data for policy problems. Science, 355(6324).
