Himawari-8-derived diurnal variations in ground-level PM2.5 pollution across China using the fast space-time Light Gradient Boosting Machine (LightGBM)

Wei, J., Z. Li, R.T. Pinker, J. Wang, L. Sun, W. Xue, R. Li, and M. Cribb (2021), Himawari-8-derived diurnal variations in ground-level PM2.5 pollution across China using the fast space-time Light Gradient Boosting Machine (LightGBM), Atmos. Chem. Phys., 21, 7863-7880, doi:10.5194/acp-21-7863-2021.
Abstract

Fine particulate matter with a diameter of less than 2.5 µm (PM2.5) has been used as an important atmospheric environmental parameter mainly because of its impact on human health. PM2.5 is affected by both natural and anthropogenic factors that usually have strong diurnal variations. Such information helps toward understanding the causes of air pollution, as well as our adaptation to it. Most existing PM2.5 products have been derived from polar-orbiting satellites. This study exploits the use of the next-generation geostationary meteorological satellite Himawari-8/AHI (Advanced Himawari Imager) to document the diurnal variation in PM2.5. Given the huge volume of satellite data, based on the idea of gradient boosting, a highly efficient tree-based Light Gradient Boosting Machine (LightGBM) method by involving the spatiotemporal characteristics of air pollution, namely the space-time LightGBM (STLG) model, is developed. An hourly PM2.5 dataset for China (i.e., ChinaHighPM2.5) at a 5 km spatial resolution is derived based on Himawari-8/AHI aerosol products with additional environmental variables. Hourly PM2.5 estimates (number of data samples = 1 415 188) are well correlated with ground measurements in China (cross-validation coefficient of determination, CV-R² = 0.85), with a root-mean-square error (RMSE) and mean absolute error (MAE) of 13.62 and 8.49 µg m⁻³, respectively. Our model captures well the PM2.5 diurnal variations showing that pollution increases gradually in the morning, reaching a peak at about 10:00 LT (GMT+8), then decreases steadily until sunset. The proposed approach outperforms most traditional statistical regression and tree-based machine-learning models with a much lower computational burden in terms of speed and memory, making it most suitable for routine pollution monitoring.
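
The abstract reports three accuracy statistics (R², RMSE, and MAE) for estimated versus ground-measured PM2.5. As a minimal sketch of how such statistics are commonly computed from paired values, the Python below may help. It is not the authors' code: the function name `regression_metrics` is hypothetical, the sample values are made up for illustration, and R² is taken here as 1 − SS_res/SS_tot, which is an assumption, since the abstract does not state which definition the study uses.

```python
import math


def regression_metrics(observed, estimated):
    """Return (r2, rmse, mae) for paired observed and estimated values.

    Assumption: r2 is computed as 1 - SS_res / SS_tot. The paper may use a
    different definition (for example, the squared correlation coefficient).
    """
    if len(observed) != len(estimated) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    n = len(observed)
    mean_obs = sum(observed) / n

    # Residual = ground measurement minus model estimate.
    residuals = [o - e for o, e in zip(observed, estimated)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; r2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)               # same units as the input
    mae = sum(abs(r) for r in residuals) / n   # same units as the input
    return r2, rmse, mae


# Made-up illustrative values in µg m^-3, not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 39.0]
r2, rmse, mae = regression_metrics(observed, estimated)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

In a cross-validation setting, the same function would be applied to the held-out samples only, so that the statistics reflect estimates the model did not see during training.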

Research Program
Applied Sciences Program (ASP)
Modeling Analysis and Prediction Program (MAP)
Atmospheric Composition
Atmospheric Composition Modeling and Analysis Program (ACMAP)