Figure 1. 2022 Actual PDSI
Figure 2. 2022 Predicted PDSI
Figure 3. 2023 Actual PDSI
Figure 4. 2023 Predicted PDSI
Figure 5. 2024-2050 Predicted PDSI (animated)
Result
We predict the U.S. Palmer Drought Severity Index (PDSI) from 2024 to 2050 using the Scikit-Learn library. Our model indicates that the southwestern regions are at increased risk of drought.

Years - Training Data Period: 2012 - 2023

Years - Prediction Period: 2024 - 2050

Prediction Model: Support Vector Machines (sklearn)

Country: U.S.A.

Number of Climate Regions: 355

Contributing Authors
Micropub

This study introduces a machine learning-driven approach to predicting drought severity across the United States, using the Scikit-Learn library with a particular focus on the Support Vector Machine (SVM) model. SVM was selected after cross-validation testing, where it outperformed other machine learning models. Data for this analysis were acquired through Google Earth Engine, drawing on five distinct datasets, and were then processed and transformed to mitigate skewness and improve predictive performance. The climate data, aggregated monthly and regionally (355 climate regions in total), span January 2010 to December 2023, with additional climate projections extending from January 2024 to December 2050.
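The model-selection step described above can be sketched as follows. This is a minimal illustration, not the study's actual code: the data are synthetic, the candidate models other than SVM are assumptions (the text does not name them), and the Yeo-Johnson `PowerTransformer` is only one plausible way to reduce skewness.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer
from sklearn.svm import SVR

# Synthetic stand-in data (hypothetical): 300 region-months, 6 right-skewed features.
rng = np.random.default_rng(0)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(300, 6))
y = np.log(X[:, 0]) - 0.5 * np.log(X[:, 1]) + rng.normal(scale=0.3, size=300)

# Candidate models; the non-SVM choices are assumptions for illustration.
# PowerTransformer (Yeo-Johnson, standardized output) reduces skewness before fitting.
candidates = {
    "ridge": make_pipeline(PowerTransformer(), Ridge()),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "svr": make_pipeline(PowerTransformer(), SVR(kernel="rbf", C=10.0)),
}

# 5-fold cross-validation, scored by RMSE (scikit-learn reports it negated).
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_rmse = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=cv,
                             scoring="neg_root_mean_squared_error")
    cv_rmse[name] = -scores.mean()

# The model with the lowest mean cross-validated RMSE is kept.
best_model = min(cv_rmse, key=cv_rmse.get)
print(cv_rmse, best_model)
```

Putting the transformer inside the pipeline means it is refitted on each training fold, so no information from the held-out fold leaks into preprocessing.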

Key predictive features included minimum and maximum temperature, precipitation, minimum and maximum humidity, and wind speed. To add temporal context, 3-, 6-, and 12-month average values of these variables were calculated for each geographic region. Data from 2011 to 2021 were used for training, while 2022 and 2023 were reserved for testing, so the model was evaluated only on years it had not seen during training.
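A minimal pandas sketch of the rolling-average features and the year-based split described above. The column names (`region_id`, `date`, `precip`, `tmax`) and the synthetic values are hypothetical, and the window is assumed to be a trailing one that includes the current month, which the text does not specify.

```python
import numpy as np
import pandas as pd

# Hypothetical layout: one row per (region_id, month), two regions for brevity.
months = pd.date_range("2010-01-01", "2023-12-01", freq="MS")
rng = np.random.default_rng(0)
frames = []
for region_id in (1, 2):
    frames.append(pd.DataFrame({
        "region_id": region_id,
        "date": months,
        "precip": rng.gamma(2.0, 30.0, size=len(months)),
        "tmax": rng.normal(20.0, 8.0, size=len(months)),
    }))
df = pd.concat(frames, ignore_index=True).sort_values(["region_id", "date"])

# Trailing 3-, 6-, and 12-month means, computed separately within each region.
for col in ("precip", "tmax"):
    for window in (3, 6, 12):
        df[f"{col}_avg_{window}m"] = (
            df.groupby("region_id")[col]
              .transform(lambda s, w=window: s.rolling(w, min_periods=w).mean())
        )

# Rows without a full 12-month history (January to November 2010) are dropped.
df = df.dropna().reset_index(drop=True)

# Split by calendar year: 2011-2021 for training, 2022-2023 for testing.
year = df["date"].dt.year
train = df[year.between(2011, 2021)]
test = df[year.isin([2022, 2023])]
print(len(train), len(test))
```

Splitting by year rather than at random keeps the test set strictly later than the training set, which matches how the model is used for forecasting.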

Model performance was assessed primarily using Root Mean Square Error (RMSE), with SVM achieving the lowest RMSE among the candidate models, approximately 1.82 (for reference, the target values span a range of 17.1). Evaluation on the entire test dataset yielded an RMSE of around 1.13, which decreased to approximately 1.06 following hyperparameter optimization. The SVM model reached an RMSE as low as 0.71 on the test dataset, underscoring its strong fit to recent data.
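To put these figures in context, the sketch below expresses each reported RMSE as a share of the 17.1 target range and then shows one way hyperparameter optimization could be run. The grid values, the use of `GridSearchCV`, and the synthetic data are assumptions for illustration; the text does not state which search method or parameter ranges were used.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))


# RMSE values reported in the text, as a fraction of the target range (17.1).
target_range = 17.1
reported_rmse = [1.82, 1.13, 1.06, 0.71]
share_of_range = [round(r / target_range, 3) for r in reported_rmse]

# Synthetic stand-in data (hypothetical), split into train and test parts.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]

# Grid over the main SVR hyperparameters; these ranges are illustrative only.
pipe = Pipeline([("scale", StandardScaler()), ("svr", SVR(kernel="rbf"))])
param_grid = {
    "svr__C": [0.1, 1.0, 10.0],
    "svr__gamma": ["scale", 0.1],
    "svr__epsilon": [0.01, 0.1],
}
search = GridSearchCV(pipe, param_grid, cv=5,
                      scoring="neg_root_mean_squared_error")
search.fit(X_train, y_train)

# The refitted best estimator is evaluated once on the held-out test part.
test_rmse = rmse(y_test, search.predict(X_test))
print(share_of_range, search.best_params_, test_rmse)
```

Read this way, the reported RMSE values correspond to roughly 4% to 11% of the target range.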

Through feature importance analysis, adjustments to feature selection were made, improving the model’s performance. These included the removal of less informative features and the addition of 18- and 24-month average climate data, which together reduced RMSE to 0.97. Tests with Principal Component Analysis (PCA) were conducted to assess potential benefits of dimensionality reduction, though PCA was not found to enhance model accuracy.
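The two checks described above can be sketched as follows. Because an RBF-kernel SVM exposes no built-in feature importances, this example uses permutation importance, which is an assumption: the text does not say which importance method was applied. The feature names and data are hypothetical, and the 95% explained-variance threshold for PCA is likewise illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.inspection import permutation_importance
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (hypothetical): the target depends on the first two
# features only, so the last two should rank as uninformative.
rng = np.random.default_rng(0)
feature_names = ["precip_avg_12m", "tmax_avg_12m", "wind_avg_3m", "noise"]
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=300)
X_train, X_val, y_train, y_val = X[:200], X[200:], y[:200], y[200:]

# Permutation importance: shuffle one feature at a time on held-out data and
# measure how much the RMSE-based score degrades.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
svr.fit(X_train, y_train)
result = permutation_importance(
    svr, X_val, y_val,
    scoring="neg_root_mean_squared_error", n_repeats=10, random_state=0,
)
ranking = [feature_names[i] for i in np.argsort(result.importances_mean)[::-1]]

# PCA check: compare cross-validated RMSE with and without a PCA step that
# keeps enough components to explain 95% of the variance.
with_pca = make_pipeline(
    StandardScaler(), PCA(n_components=0.95, svd_solver="full"),
    SVR(kernel="rbf", C=10.0),
)
without_pca = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
scoring = "neg_root_mean_squared_error"
rmse_pca = -cross_val_score(with_pca, X, y, cv=5, scoring=scoring).mean()
rmse_plain = -cross_val_score(without_pca, X, y, cv=5, scoring=scoring).mean()
print(ranking, rmse_pca, rmse_plain)
```

Features at the bottom of `ranking` are candidates for removal, and PCA is worth keeping only if `rmse_pca` comes out lower than `rmse_plain`.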

A final visual assessment was performed by mapping actual and predicted Palmer Drought Severity Index (PDSI) values across the United States for 2022 and 2023 (Figures 1 to 4). Additionally, the predicted PDSI values from 2024 to 2050 were animated (Figure 5) to enhance engagement and illustrate future drought trends dynamically.

These results underscore the potential of machine learning in environmental monitoring, offering an accessible and scalable tool for drought forecasting. The approach has practical implications for proactive water resource management and policy-making, helping decision makers act early to mitigate future drought impacts.

References
  1. Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly Media, 2019.
  2. https://cartographyvectors.com/map/1290-us-climate-regions
  3. https://developers.google.com/earth-engine/tutorials/community/time-series-visualization-with-altair
  4. https://www.kaggle.com/code/akshayasrinivasan2/drought-prediction-using-ml-algorithms
Protocols
  1. Regression analysis with Support Vector Machines
  2. Visualization on US map
  3. Source code available under the GitHub repository (https://github.com/mertalpaydin/USA-Drought-Projection-with-Scikit-Learn)
Data sets
Permanently stored in the library at arweave.org