A Team From the Hong Kong University of Science and Technology Proposed a Spatiotemporal Interpolation and Prediction Model for Global Water Health Diagnosis, Enabling Accurate Prediction of the Spatiotemporal Distribution of Chlorophyll a Along Coastal Areas

Nutrient input from land and active hydrodynamics make coastal waters some of the most productive marine ecosystems in the world, but they also bring risks such as severe eutrophication and hypoxia. Predicting the spatiotemporal distribution of chlorophyll a (Chl_a) is an important way to diagnose the health of coastal ecosystems.

However, existing tools do not yet adequately support analyses based on predicting the spatiotemporal distribution of chlorophyll a. Traditional coupled hydrodynamic-biogeochemical methods have difficulty analyzing nutrient transfer in marine ecosystems, and factors such as energy flux and biomass are hard to incorporate into the calculation; data-driven prediction methods tend to accumulate errors during the long-term integration of nonlinear systems.

Against this backdrop, a research team from the Hong Kong University of Science and Technology developed STIMP, an artificial intelligence-driven spatiotemporal imputation (interpolation) and prediction model, to predict chlorophyll a in the coastal ocean. By integrating purpose-built modules, STIMP addresses the problems caused by incomplete data, non-stationary temporal variation, and spatial heterogeneity, providing a new paradigm for predicting marine chlorophyll a under temporal and spatial constraints.

The relevant research results were published in Nature Communications under the title "Spatiotemporal Imputation and Prediction Model".

Research highlights:

* Developed the STIMP model and proposed a two-stage "interpolation + prediction" architecture that mitigates high missing-measurement rates and the loss of spatiotemporal patterns, and quantifies forecast uncertainty;

* Integrated the spatiotemporal denoising diffusion model (STDDM), the temporal linear transformer (TLT), and the heterogeneous spatial graph neural network (HSGNN) to address the three major challenges of incomplete data, non-stationary temporal variation, and spatial heterogeneity;

* Completed empirical studies in four representative areas (the Pearl River Estuary, the Yangtze River Estuary, the northern Gulf of Mexico, and the Chesapeake Bay), verifying that STIMP can predict the spatiotemporal distribution of chlorophyll a across different regions of the world.

Paper address:

https://go.hyper.ai/BjOR5

MODIS Chlorophyll a Measured Dataset

This study used a measured chlorophyll-a dataset from the waters off Hong Kong and a remote sensing reflectance dataset from the Himawari satellite to construct chlorophyll-a inversion models at three different depths. The MODIS Chl-a data used in the study have been publicly released through the Moderate Resolution Imaging Spectroradiometer (MODIS) Aqua project, and the processed MODIS Chl-a data are available on Zenodo.

Zenodo website:

https://doi.org/10.5281/zenodo.14638405

STIMP method based on deep learning

The research team feeds chlorophyll a observations from the coastal ocean, together with a spatial map containing the geographic coordinates of the observations, into the deep learning-based STIMP architecture. The model first produces a complete chlorophyll a dataset and then uses it to estimate and predict chlorophyll a in the coastal ocean.

Chlorophyll a observation information in representative coastal areas around the world

Two-stage architecture of the STIMP model

STIMP decomposes the prediction of chlorophyll a into two consecutive steps: interpolation and prediction. In the interpolation step, spatiotemporal embedding modules capture spatial structure and temporal dynamics at the same time and reconstruct multiple plausible complete spatiotemporal chlorophyll a distributions from partial observations. In the prediction step, STIMP makes a forecast from each reconstructed distribution and, following Rubin's rules, averages the results of the multiple interpolation and prediction runs to obtain the final chlorophyll a prediction.
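The pooling step can be illustrated with a minimal sketch of Rubin's rules for a single pixel. It assumes each imputed dataset yields one forecast plus a variance for that forecast; the function name `pool_rubin` and the numbers are hypothetical, and the article only states that the final value is an average, so the variance part shows the standard rule rather than STIMP's confirmed implementation.

```python
from statistics import mean, variance


def pool_rubin(estimates, within_variances):
    """Pool M per-imputation estimates with Rubin's rules.

    estimates: one forecast per imputed dataset (M >= 2).
    within_variances: the variance attached to each forecast.
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need M >= 2 estimates with matching variances")
    pooled = mean(estimates)                # point estimate: plain average
    within = mean(within_variances)         # average within-imputation variance
    between = variance(estimates)           # sample variance across imputations
    total = within + (1 + 1 / m) * between  # Rubin's total variance
    return pooled, total


# Hypothetical Chl-a forecasts for one pixel from M = 4 imputed datasets.
preds = [2.0, 2.4, 1.8, 2.2]
pred_vars = [0.04, 0.05, 0.03, 0.04]
estimate, total_var = pool_rubin(preds, pred_vars)
print(f"pooled forecast = {estimate:.3f}, total variance = {total_var:.3f}")
```

The between-imputation term grows when the reconstructions disagree, so pixels with many missing observations naturally end up with wider uncertainty.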

Overview of the Two-Phase Architecture of the STIMP Model

Three core integrated modules of the STIMP model

The rapid development of satellite remote sensing provides opportunities to develop data-driven, large-scale methods for spatiotemporal chlorophyll-a prediction. However, satellite data also bring challenges such as incompleteness, non-stationary temporal variation, and spatial heterogeneity. To address them, STIMP integrates three core modules:

* Spatiotemporal Denoising Diffusion Model (STDDM): Used in the interpolation stage, STDDM reconstructs the complete spatiotemporal distribution when the missing rate is high. It decomposes a complex task into simpler ones, gradually improving the signal-to-noise ratio to move from incomplete observations to complete data.

* Temporal Linear Transformer (TLT): Used to capture non-stationary temporal variation patterns. It computes dependencies across the entire time series through a self-attention mechanism that takes every element of the series into account, retaining key information about the dynamic changes of chlorophyll a and making its temporal pattern easier to model.

* Heterogeneous Spatial Graph Neural Network (HSGNN): Used to address spatial heterogeneity. It draws on a parameter pool to generate location-specific parameters, keeping the model sensitive to regional differences between geographical environments.
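To make the signal-to-noise idea behind the diffusion module concrete, here is a small sketch based on the standard DDPM-style forward process, in which the signal fraction after step t is the running product of (1 - beta). The linear schedule and its values are illustrative assumptions, not STDDM's actual settings.

```python
def snr_schedule(num_steps=5, beta_start=0.1, beta_end=0.5):
    """Signal-to-noise ratio after each forward (noising) step.

    Uses a linear beta schedule; alpha_bar is the fraction of signal
    variance left, so SNR = alpha_bar / (1 - alpha_bar).
    """
    snrs = []
    alpha_bar = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        alpha_bar *= 1.0 - beta
        snrs.append(alpha_bar / (1.0 - alpha_bar))
    return snrs


forward = snr_schedule()   # noising: SNR falls step by step
reverse = forward[::-1]    # denoising visits the same levels in reverse
for step, snr in enumerate(reverse):
    print(f"denoising step {step}: SNR = {snr:.3f}")
```

Read in reverse, each denoising step only has to raise the signal-to-noise ratio by a modest amount, which is the sense in which a hard reconstruction is split into simpler sub-tasks.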

These three integrated modules ensure that the STIMP model can achieve robust estimation and prediction in the interpolation and prediction stages in the face of incomplete data, complex temporal dynamics and significant spatial differences.
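One common way to realize a parameter pool, shown here as an assumption rather than as HSGNN's confirmed design, is to give every location a small embedding and use it to mix a shared set of weight matrices. All names and values below are hypothetical.

```python
def location_weights(embedding, pool):
    """Mix the K pooled weight matrices using one location's embedding."""
    rows, cols = len(pool[0]), len(pool[0][0])
    return [
        [sum(e * basis[r][c] for e, basis in zip(embedding, pool))
         for c in range(cols)]
        for r in range(rows)
    ]


def apply_weights(features, weights):
    """Row vector times matrix: in_dim features -> out_dim outputs."""
    return [
        sum(f * weights[r][c] for r, f in enumerate(features))
        for c in range(len(weights[0]))
    ]


# Hypothetical pool of K = 2 weight matrices, each 2 inputs x 1 output.
pool = [
    [[1.0], [0.0]],  # basis 0 passes the first feature through
    [[0.0], [1.0]],  # basis 1 passes the second feature through
]
# Hypothetical embeddings for an estuary pixel and an offshore pixel.
embeddings = {"estuary": [0.9, 0.1], "offshore": [0.2, 0.8]}
features = [3.0, 1.0]  # identical input features at both locations

outputs = {
    name: apply_weights(features, location_weights(emb, pool))
    for name, emb in embeddings.items()
}
print(outputs)
```

The same input produces different outputs at the two locations, while the number of learned parameters grows with the pool size and embedding size rather than with one full weight matrix per pixel.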

Validation of STIMP performance

STIMP's spatiotemporal interpolation performance

This study demonstrates the effectiveness of the STIMP model for spatiotemporal interpolation, using the Pearl River Estuary as an example. The researchers selected observational data from the entire Pearl River Estuary from February 7, 2015, to February 2, 2016, and reconstructed chlorophyll a distribution using STIMP and baseline methods including the Data Interpolating Empirical Orthogonal Function (DINEOF), Masked Autoencoder (MaskedAE), and Linear Interpolation (Lin-ITP).

Experiments show that when the average missing measurement rate in the Pearl River Estuary reaches 50.29%, the STIMP model reduces the mean absolute error (MAE) by 45.90% to 77.35% compared with DINEOF in a one-year interpolation task, and further reduces it by 10.20% to 40.38% compared with the second-best model. STIMP effectively preserves spatial relationships during the interpolation process, producing larger values near the coastline and similar values across most areas. Even when the missing data rate is high, STIMP can effectively reconstruct complete data.
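The quantities quoted above (missing rate, MAE, and percentage MAE reduction) can be computed as in the following sketch. The evaluation protocol is an assumption: observed values are hidden, reconstructed, and compared with the hidden truth. The data are made up for illustration and are not the paper's.

```python
import math

NAN = float("nan")


def missing_rate(grid):
    """Fraction of NaN cells in a 2-D grid of observations."""
    cells = [v for row in grid for v in row]
    return sum(math.isnan(v) for v in cells) / len(cells)


def mae(truth, estimate):
    """Mean absolute error over positions where the truth is known."""
    pairs = [(t, e) for t, e in zip(truth, estimate) if not math.isnan(t)]
    return sum(abs(t - e) for t, e in pairs) / len(pairs)


def relative_reduction(baseline_mae, model_mae):
    """Percentage by which model_mae is lower than baseline_mae."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Hypothetical snapshot: half of the pixels are unobserved.
snapshot = [[1.0, NAN], [NAN, 3.0]]
rate = missing_rate(snapshot)

# Hypothetical held-out values and two reconstructions of them.
held_out = [1.0, 2.0, 4.0, 3.0]
baseline_fill = [1.5, 3.0, 3.0, 2.5]
model_fill = [1.2, 2.2, 3.6, 3.2]

baseline_mae = mae(held_out, baseline_fill)
model_mae = mae(held_out, model_fill)
reduction = relative_reduction(baseline_mae, model_mae)
print(f"missing rate {rate:.0%}, MAE reduction {reduction:.2f}%")
```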

Measured and estimated chlorophyll-a distribution in the Pearl River Estuary

Furthermore, STIMP effectively preserves temporal relationships during interpolation. When interpolating five single locations from February 7, 2015, to September 22, 2022, STIMP incorporates more fluctuations than simple linear interpolation.
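For comparison, the linear interpolation baseline can be sketched for a single site as follows. This is a generic implementation written for illustration, not the code used in the study, and the series values are hypothetical.

```python
import math

NAN = float("nan")


def linear_interpolate(series):
    """Fill interior NaN gaps with straight lines between observed neighbours.

    Gaps before the first or after the last observation stay NaN.
    """
    filled = list(series)
    observed = [i for i, v in enumerate(series) if not math.isnan(v)]
    for left, right in zip(observed, observed[1:]):
        span = right - left
        delta = series[right] - series[left]
        for i in range(left + 1, right):
            filled[i] = series[left] + (i - left) * delta / span
    return filled


# Hypothetical Chl-a series at one site with two gaps.
series = [1.0, NAN, NAN, 4.0, NAN, 2.0]
filled = linear_interpolate(series)
print(filled)
```

Every gap is bridged by a straight segment, so any variation that happened between two observations is lost. That is the limitation the comparison above points to.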

Chlorophyll a estimated by STIMP at five sites

The study also validated STIMP's effectiveness in coastal oceans worldwide. In the Yangtze River Estuary, STIMP's MAE decreased by 68.31% to 90.92% compared to DINEOF, and by 15.62% to 42.67% compared to the next-best AI method. In the northern Gulf of Mexico, STIMP's MAE decreased by 69.42% to 74.88% compared to DINEOF; and in the Chesapeake Bay, STIMP's MAE decreased by 62.08% to 75.63% compared to DINEOF. Overall, STIMP maintains stable performance across different missing rates and can still reconstruct the true spatiotemporal structure when the missing rate is high.

Spatiotemporal prediction performance of STIMP

The researchers also verified STIMP's long-term forecasting performance through forecasting experiments. Compared to the baseline methods, STIMP's mean absolute error (MAE) decreased by 6.54% to 13.68% for one-year forecasts, by 13.68% to 32.25% for two-year forecasts, and by 13.77% to 32.01% for three-year forecasts, outperforming other forecasting methods.

MAE performance of STIMP and baseline models in 1-year, 2-year, and 3-year forecasts

In addition, filling in the missing data significantly improves STIMP's prediction of the distribution. In areas with a high rate of missing data, STIMP's predictions tend to improve more over those of PredRNN, which indicates that filling in data before prediction helps STIMP capture the spatial distribution and seasonal signals of chlorophyll a.

MAE of actual values and PredRNN, STIMP without interpolation, and STIMP predicted values
Relationship between performance improvement brought by interpolation and missing data rate

Taking the Pearl River Estuary as an example, STIMP achieves a markedly lower forecast MAE than the numerical model CMOMS and the deep learning method PredRNN. The MAE is reduced by 6.54% to 13.68% for one-year forecasts, by 13.68% for two-year forecasts, and by 13.77% to 32.01% for three-year forecasts. At individual locations, STIMP improves MAE by 53.78% to 74.63% compared with CMOMS and by 1.83% to 30.28% compared with PredRNN.

In the Yangtze River Estuary, the northern Gulf of Mexico, and the Chesapeake Bay, STIMP's overall prediction performance also improved significantly over PredRNN, and it better preserves the periodicity of the data. Overall, the results demonstrate the effectiveness and robustness of STIMP's two-stage architecture in dealing with incomplete spatiotemporal observation data.

"AI+Ocean" cross-disciplinary research and team

The research interests of the team led by Yang Can and Gan Jianping from the Hong Kong University of Science and Technology span multiple fields including mathematics, statistics, artificial intelligence and physical oceanography.

Among them, Yang Can, Professor of Mathematics at the Hong Kong University of Science and Technology and Deputy Director of the Big Data Biointelligence Laboratory, is dedicated to innovative research in statistical learning and artificial intelligence methodologies, focusing on the application of cutting-edge methods such as deep learning, generative models, and graph neural networks to modeling and predicting high-dimensional complex data. In recent years, he has expanded his research focus to marine science and public health, actively promoting interdisciplinary research in "AI + Ocean." Previously, Yang Can's team developed the BOOST/GBOOST accelerated GWAS analysis tools, proposed a multi-phenotype risk prediction method for LEP, and designed the VGrow generative framework to facilitate the translation of genetic data into facial features for non-European and American populations.

Gan Jianping, Chair Professor and Head of the Department of Ocean Science at the Hong Kong University of Science and Technology, has long been committed to studying the circulation dynamics of nearshore and shelf oceans and their coupling processes with ecosystems, focusing on research areas such as coastal ecological health, pollution control, and regional climate sustainable development. In the field of physical oceanography, Gan Jianping's team has developed the WavyOcean 2.0 regional ocean digital twin platform, which can integrate ocean process simulation, GIS, BIM, and digital twin technologies to achieve three-dimensional coupled modeling of the ocean-land-atmosphere system, supporting dynamic visualization and interactive analysis of ocean flows, biogeochemical evolution, precipitation, and pollution diffusion, covering the Greater Bay Area and the coast of China. Through field observations and model simulations, the team revealed for the first time that the South China Sea region has a bi-layer alternately rotating circulation structure, correcting the structural deviations of previous ocean models.

