Background: “Zero Hunger” is the second Sustainable Development Goal (SDG) of the United Nations (UN) [1]. One indicator for this goal is the Prevalence of Undernourishment (PoU), defined [2] as an estimate of the proportion (%) of the population whose habitual food consumption is insufficient to provide the dietary energy levels required to maintain a normal, active and healthy life. In a 2021 report on world hunger, the Food and Agriculture Organization (FAO) identifies three major factors contributing to PoU: conflict, economic shocks and weather extremes [3]. In this work, we collect data on these factors to generate yearly country-level PoU forecasts.
Results: The PoU data for different years are not independent and identically distributed, so a five-fold time series split, which trains on earlier years and validates on later ones, was used to cross-validate the findings. A range of models was evaluated, and the random forest regressor performed best [Figure 3], with an R²-value of 0.80.
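The time series split can be sketched as follows. This is a minimal illustration, not the study's code: the function name `time_series_folds` and the fold boundaries are assumptions, chosen to follow the same expanding-window scheme that scikit-learn's `TimeSeriesSplit` uses by default. Each fold trains only on years that precede its validation years.

```python
def time_series_folds(years, n_splits=5):
    """Yield (train_years, test_years) pairs with an expanding training window.

    Hypothetical helper: the fold sizes are an assumption, not taken from the study.
    """
    years = sorted(years)
    test_size = len(years) // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough years for the requested number of splits")
    first_test = len(years) - n_splits * test_size
    for start in range(first_test, len(years), test_size):
        # Train on everything before `start`, validate on the next block of years.
        yield years[:start], years[start:start + test_size]


folds = list(time_series_folds(range(2001, 2019), n_splits=5))
for train, test in folds:
    print(f"train {train[0]}-{train[-1]} -> validate {test[0]}-{test[-1]}")
```

Unlike a shuffled k-fold split, no fold here validates on a year that is earlier than any of its training years.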
The final dataset had 18 years' worth of independent-variable data (X = 2001-18; y = 2002-19) for 155 countries [Figure 4], with each year's variables paired with the following year's PoU. With the random forest regressor, predictions were made with a root mean squared error of 5.65 and an R²-value of 0.78 [Figure 5]. However, overfitting (high variance) was observed, as the R²-value on the training data was 0.98.
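The one-year-ahead pairing and the two reported metrics can be illustrated with a short sketch. The helper names and all numbers below are hypothetical toy values, not data or results from the study. The sketch assumes features and PoU values are keyed by (country, year).

```python
import math


def one_year_ahead_pairs(features, pou):
    """Pair the features of year t with the PoU of year t + 1 (hypothetical helper)."""
    pairs = []
    for (country, year), x in sorted(features.items()):
        target = pou.get((country, year + 1))
        if target is not None:  # skip rows with no PoU value for the following year
            pairs.append((country, year, x, target))
    return pairs


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for a single made-up country "A".
features = {("A", 2001): [1.0], ("A", 2002): [2.0], ("A", 2018): [3.0]}
pou = {("A", 2002): 10.0, ("A", 2003): 12.0, ("A", 2019): 9.0}
pairs = one_year_ahead_pairs(features, pou)

y_true = [target for _, _, _, target in pairs]
y_pred = [11.0, 12.0, 8.0]  # made-up predictions
print(rmse(y_true, y_pred), r2_score(y_true, y_pred))
```

Computing `r2_score` separately on the training and the held-out years gives the kind of gap (0.98 versus 0.78) that indicates overfitting.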
Constructing dataset:
Different datasets are combined to construct a new, unique dataset that is tailored to this specific forecasting problem.
For conflicts, casualties corresponding to events of organized violence [Data 1] were considered. Only recorded incidents are included, so the counts may differ from estimates of the true total casualties, especially for wars. Where a country and year had no entry, 0 casualties were assumed.
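The zero-fill rule for missing conflict records could look like the sketch below. The function name, the dictionary layout and the example values are assumptions for illustration only.

```python
def fill_missing_casualties(recorded, countries, years):
    """Build a complete (country, year) table, using 0 where no record exists.

    Hypothetical helper: `recorded` maps (country, year) to a casualty count.
    """
    return {(c, y): recorded.get((c, y), 0) for c in countries for y in years}


# Toy records: country "B" has no recorded events at all.
recorded = {("A", 2001): 120, ("A", 2003): 15}
table = fill_missing_casualties(recorded, ["A", "B"], [2001, 2002, 2003])
print(table)
```

Note that this treats "no record" and "no casualties" as the same thing, which is the simplification described above.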
For weather, the total precipitation per year, the average temperature [Data 2] and the Normalized Difference Vegetation Index (NDVI) [Data 3] were considered. NDVI [4] is an indicator of vegetation density. Because temperature and precipitation are aggregated per year, they do not capture seasonal variations.
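A small sketch shows why yearly aggregates hide seasonality. It assumes, purely for illustration, that the underlying weather data are monthly; the function name and values are made up.

```python
def yearly_summary(monthly_precip_mm, monthly_temp_c):
    """Return (total precipitation, mean temperature) for one year of monthly values."""
    return sum(monthly_precip_mm), sum(monthly_temp_c) / len(monthly_temp_c)


# Two toy climates: one uniform, one with a dry cool half and a wet hot half.
uniform = yearly_summary([50] * 12, [20] * 12)
seasonal = yearly_summary([0] * 6 + [100] * 6, [10] * 6 + [30] * 6)
print(uniform, seasonal)
```

Both toy climates produce identical yearly features, so a model trained on them cannot distinguish a six-month dry spell from evenly spread rainfall.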
For economic data, the Gross Domestic Product (GDP), the Gross National Income (GNI) and the Food Production Index (FPI) were considered [Data 4]. FPI and GDP were excluded from the final model. GDP has a correlation of 1 with GNI [Figure 1], so the two carry redundant information; of the pair, GDP was dropped because the model's ‘feature importance’ ranked it as much less relevant for the output. FPI was excluded because it ranked lowest [Figure 2].
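The correlation check behind the GDP/GNI decision can be sketched as below. The helper names, the 0.95 threshold and the toy columns are assumptions; the study's feature-importance ranking is not reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def redundant_pairs(columns, threshold=0.95):
    """List feature pairs whose absolute correlation reaches the threshold."""
    names = sorted(columns)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson(columns[a], columns[b])) >= threshold
    ]


# Toy columns: GNI is an exact multiple of GDP, FPI is unrelated.
columns = {
    "GDP": [1.0, 2.0, 3.0, 4.0],
    "GNI": [2.0, 4.0, 6.0, 8.0],
    "FPI": [3.0, 1.0, 4.0, 2.0],
}
print(redundant_pairs(columns))
```

Once a redundant pair is flagged, a separate criterion is still needed to decide which member to keep, such as the feature importance used above.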
Future work: This work can be extended in several directions: 1) more granular datasets (days or months instead of years); 2) population, drought and flood data as additional variables; and 3) stronger neural network architectures.