satellIte phytoplaNkton Drivers In the Global Ocean over 1998-2015 (INDIGO Benchmark dataset)

Date 2022-11
Temporal extent 1998-01-01 - 2015-12-31
Author(s) Roussillon Joana (1), Fablet Ronan (2), Gorgues Thomas (1), Drumetz Lucas (2), Littaye Jean (1), Martinez Elodie (1)
Affiliation(s) 1 : UMR6523 Laboratoire d'Oceanographie Physique et Spatiale (LOPS), France
2 : IMT Atlantique, Lab-STICC, UMR CNRS 6285, France
DOI 10.17882/91910
Publisher SEANOE
Keyword(s) phytoplankton physical drivers, satellite ocean color, time-series regression, global scale, deep learning, benchmark
Abstract

This benchmark dataset contains the physical data used as predictors to reconstruct global chlorophyll-a concentrations (Chl, a proxy for phytoplankton biomass) in Roussillon et al., as well as the reference satellite Chl target fields. The nine physical predictors (shortwave radiation, sea surface temperature, sea level anomaly, zonal and meridional surface currents, zonal and meridional surface wind stress, bathymetry, and a binary continental mask) were extracted from publicly available datasets over [1998-2015] and resampled to the same spatio-temporal resolution as Chl, i.e. monthly on a 1°x1° grid between 50°N and 50°S. Missing values were gap-filled using the heat diffusion equation. Each variable was normalized by subtracting its mean from the original values and dividing by its standard deviation over [1998-2015].
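The gap-filling and normalization steps can be illustrated with a minimal NumPy sketch. It is not the authors' code. The function names `diffusion_fill` and `standardize`, the iteration count, the edge-padded boundary, and the use of a single mean and standard deviation per variable (over all months and grid points) are assumptions made for illustration. The input is synthetic.

```python
import numpy as np


def diffusion_fill(field, n_iter=500):
    """Fill NaN gaps in a 2-D field by relaxing a discrete heat equation.

    Observed cells are held fixed; each gap cell is repeatedly replaced by
    the mean of its four neighbours (Jacobi iteration). Hypothetical helper.
    """
    missing = np.isnan(field)
    # Initialise the gaps with the mean of the observed values.
    filled = np.where(missing, np.nanmean(field), field)
    for _ in range(n_iter):
        # Edge padding approximates a no-flux boundary (a simplification:
        # a global grid would rather wrap around in longitude).
        p = np.pad(filled, 1, mode="edge")
        neighbours = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0
        # Only gap cells are updated; observations stay untouched.
        filled = np.where(missing, neighbours, filled)
    return filled


def standardize(values):
    """Subtract the mean and divide by the standard deviation.

    Assumption: one scalar mean/std per variable over the whole record.
    """
    return (values - values.mean()) / values.std()


# Synthetic stand-in for one predictor, shaped (time, lat, lon).
rng = np.random.default_rng(0)
demo = rng.normal(size=(24, 10, 20))
demo[:, 4:6, 7:9] = np.nan  # artificial gap

gap_free = np.stack([diffusion_fill(month) for month in demo])
normalized = standardize(gap_free)
```

If the statistics were instead computed per grid cell over time, `values.mean(axis=0)` and `values.std(axis=0)` would replace the scalar versions; the description above does not say which convention was used.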

This dataset was used to train and validate the Multi-Mode Convolutional Neural Network (CNNMM8) introduced in Roussillon et al.; reconstructed monthly Chl fields over the [2012-2015] test period are also provided here.
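As a rough illustration of how the test period maps onto a monthly time axis, the sketch below computes the indices for 2012-2015 and scores a reconstruction against a reference. It assumes the arrays are ordered by month starting in January 1998 with time as the first axis, which is not stated here. The helper names, the toy grid size, and the choice of RMSE as a metric are also assumptions, and the data are synthetic.

```python
import numpy as np

FIRST_YEAR, LAST_YEAR = 1998, 2015
N_MONTHS = (LAST_YEAR - FIRST_YEAR + 1) * 12  # 18 years of monthly fields


def month_index(year, month):
    """Zero-based position of a calendar month, counting from January 1998."""
    return (year - FIRST_YEAR) * 12 + (month - 1)


def rmse(pred, ref):
    """Root-mean-square error between two arrays of identical shape."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)))


test_start = month_index(2012, 1)      # first test month
test_stop = month_index(2015, 12) + 1  # exclusive upper bound

# Synthetic stand-ins on a toy grid: (time, lat, lon).
rng = np.random.default_rng(1)
reference = rng.normal(size=(N_MONTHS, 10, 36))
reference_test = reference[test_start:test_stop]

# Hypothetical reconstruction: the reference plus small noise.
reconstruction = reference_test + 0.1 * rng.normal(size=reference_test.shape)

score = rmse(reconstruction, reference_test)
```

Under these assumptions the test slice holds 48 monthly fields out of 216. How the remaining 168 months were divided between training and validation is not specified in this record.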

We hope this benchmark dataset will help improve existing methods and encourage new ideas, since building datasets is sometimes more time-consuming than implementing the machine learning tools themselves. It should also facilitate the quantitative comparison of model performance on exactly the same data.

Licence CC-BY
Data
File | Size | Format | Processing | Access
Normalized physical input data over 1998-2015 | 945 MB | NumPy array | Processed data | Open access
Reference target satellite Chl over 1998-2015 | 105 MB | NumPy array | Processed data | Open access
Reconstructed Chl over 2012-2015 test period | 23 MB | NumPy array | Processed data | Open access

How to cite 

Roussillon Joana, Fablet Ronan, Gorgues Thomas, Drumetz Lucas, Littaye Jean, Martinez Elodie (2022). satellIte phytoplaNkton Drivers In the Global Ocean over 1998-2015 (INDIGO Benchmark dataset). SEANOE. https://doi.org/10.17882/91910