Mission Impossible? Exploring the Promise of Multiple Imputation for Predicting Missing GPS-Based Land Area Measures in Household Surveys

WB Working Paper


Research has provided robust evidence that GPS technology is the scalable gold standard for land area measurement in household surveys. Nonetheless, facing budget constraints, survey agencies often seek to measure with GPS only those plots within a given radius of dwelling locations. Consequently, it is common for significant shares of plots not to be measured, and research has highlighted the selection biases that result from using incomplete data. This study relies on nationally representative, multi-topic household survey data from Malawi and Ethiopia that exhibit near-negligible missingness in GPS-based plot areas, and validates the accuracy of a multiple imputation model for predicting missing GPS-based plot areas in household surveys. The analysis (i) randomly creates missingness among plots beyond two operationally relevant distance measures from the dwelling locations; (ii) conducts multiple imputation under each distance scenario for each artificially created data set; and (iii) compares the distributions of the imputed plot-level outcomes, namely area and agricultural productivity, with the known distributions. In Malawi, multiple imputation can produce imputed yields that are statistically indistinguishable from the true distributions with up to 82 percent missingness in plot areas among plots located more than 1 kilometer from the dwelling location. The comparable figure in Ethiopia is 56 percent. These rates correspond to overall rates of missingness of 23 percent in Malawi and 13 percent in Ethiopia. The study highlights the promise of multiple imputation for reliably predicting missing GPS-based plot areas, and provides recommendations for optimizing fieldwork activities to capture the minimum required data.
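The three-step validation design described above (create missingness among distant plots, impute repeatedly, compare with the known values) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the authors' model: the data are synthetic, the only predictor is a hypothetical self-reported area, the imputation step is a simple bootstrap-based stochastic regression, and the comparison is limited to the mean of log plot area rather than the full distributions of area and yield.

```python
import math
import random
import statistics


def simulate_plots(n, rng):
    """Synthetic plots (illustrative only): distance to dwelling, log self-reported area, log GPS area."""
    plots = []
    for _ in range(n):
        dist_km = rng.expovariate(1.5)             # most plots lie close to the dwelling
        gps_area = rng.lognormvariate(-1.0, 0.6)   # "true" GPS area, in hectares
        reported = gps_area * math.exp(rng.gauss(0.0, 0.3))  # noisy farmer estimate
        plots.append({"dist": dist_km,
                      "log_sr": math.log(reported),
                      "log_gps": math.log(gps_area)})
    return plots


def mask_far_plots(plots, threshold_km, share, rng):
    """Step (i): randomly drop GPS area for a given share of plots beyond the threshold."""
    far = [i for i, p in enumerate(plots) if p["dist"] > threshold_km]
    return set(rng.sample(far, round(share * len(far))))


def fit_ols(xs, ys):
    """Simple one-predictor least squares; returns intercept, slope, residuals."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return intercept, slope, resid


def impute_once(plots, missing, rng):
    """Step (ii), one draw: refit on a bootstrap sample, then predict plus a resampled residual."""
    observed = [p for i, p in enumerate(plots) if i not in missing]
    boot = rng.choices(observed, k=len(observed))
    a, b, resid = fit_ols([p["log_sr"] for p in boot], [p["log_gps"] for p in boot])
    return [a + b * p["log_sr"] + rng.choice(resid) if i in missing else p["log_gps"]
            for i, p in enumerate(plots)]


def pooled_mean(plots, missing, m, rng):
    """Combine m completed data sets with Rubin's rules for the mean of log GPS area."""
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(plots, missing, rng)
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / len(completed))
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    total = within + (1 + 1 / m) * between
    return statistics.fmean(estimates), math.sqrt(total)


rng = random.Random(0)
plots = simulate_plots(2000, rng)
missing = mask_far_plots(plots, threshold_km=1.0, share=0.82, rng=rng)
est, se = pooled_mean(plots, missing, m=20, rng=rng)

# Step (iii): compare the pooled estimate with the known full-data value.
truth = statistics.fmean([p["log_gps"] for p in plots])
print(f"overall missingness: {len(missing) / len(plots):.1%}")
print(f"pooled mean: {est:.4f} (SE {se:.4f}), full-data mean: {truth:.4f}")
```

The bootstrap refit is what makes this multiple rather than single imputation: each completed data set reflects uncertainty in the regression coefficients as well as residual noise, and Rubin's rules carry that between-imputation variance into the pooled standard error. A fuller validation in the spirit of the study would repeat the masking step many times, use a richer set of predictors, and compare whole distributions of area and yield rather than a single mean.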


  • Talip Kilic
  • Ismael Yacoubou Djima
  • Calogero Carletto