
Linking satellites to genes with machine learning to estimate phytoplankton community structure from space

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10361484
Description:
General description
The datasets in this repository were used to develop a new ocean color algorithm that derives, at the global scale, the relative cell abundance of seven phytoplankton groups (output of algorithm #1, called SOMRCA) and their contribution to total chlorophyll a (ChlaPG, output of algorithm #2, called SOMChlF) using an omics-based marker: psbO. The outputs of SOMChlF were compared to the HPLC-based definition of phytoplankton groups. Full details of the study are given in El Hourany, R., Pierella Karlusich, J., Zinger, L., Loisel, H., Levy, M., and Bowler, C.: Linking satellites to genes with machine learning to estimate phytoplankton community structure from space, Ocean Sci., 20, 217–239, https://doi.org/10.5194/os-20-217-2024, 2024.
"Tara_Oceans_psbO_dataset_Final.xlsx" contains the Tara Oceans psbO metagenomic counts converted into relative cell abundance and chlorophyll a contribution for seven phytoplankton groups, alongside satellite matchups. This dataset was used for algorithm development. "Assets HPLC_SOMChlF.xlsx" contains the HPLC database used for comparison with satellite-derived ChlaPG. Each data file includes a description sheet. The datasets used in this study are described below.

Tara Oceans psbO metagenomic abundances
The psbO gene is a single-copy gene in most eukaryotes and prokaryotes. We used psbO reads from the metagenomes generated by the Tara Oceans expedition as a proxy for phytoplankton relative cell abundance (see more details in Pierella Karlusich et al., 2023 Mol Ecol Res; https://doi.org/10.1111/1755-0998.13592). Metagenomes were sampled at 145 of the 210 Tara Oceans stations between 2009 and 2013, covering ocean regimes from oligotrophic to eutrophic waters (Chl a from 0.01 to 10 mg m−3, median 0.3 mg m−3). Seawater samples were filtered to separate five planktonic size fractions (0.22–3, 0.8–5, 5–20, 20–180, and 180–2000 µm).
We retrieved the psbO read abundances for each Tara Oceans size-fractionated seawater sample from the Supplementary Material of Pierella Karlusich et al., 2023 Mol Ecol Res (https://www.ebi.ac.uk/biostudies/files/S-BSST761/psbO_mapping_against_Tara_Oceans_metagenomes.tsv). We used the psbO data to taxonomically differentiate seven phytoplankton groups: diatoms, dinoflagellates, green algae, haptophytes, pelagophytes, cryptophytes, and prokaryotes (cyanobacteria). The psbO read abundances of these seven groups are expressed as relative phytoplankton cell abundance (%) and as their contribution to chlorophyll a (ChlaPG, in mg m−3). Phytoplankton not assigned to any of these seven groups (unclassified) represented less than 5 % of the total relative cell abundance across all size classes. To obtain a single value of relative cell abundance per station, we pooled the five size fractions into a single aggregated sample. To estimate Chl a content, we applied a conversion based on size-dependent weights (see formula 1 in El Hourany et al., 2024).
The molecular dataset therefore provides two levels of information: the relative abundance of psbO reads, used as a proxy for relative cell abundance, and the fraction of Chl a that each group represents. The two have different implications. Chl a is often used as a proxy for biomass, which is a relevant parameter for energy and matter fluxes (e.g., food webs, biogeochemical cycles). Cell abundance, in turn, corresponds to species abundance for unicellular organisms, which is an important measure for inferring community assembly processes.

Satellite matchups
We used ocean color products from the GlobColour project (R2019, full archive reprocessed, 2020) to retrieve satellite matchups for the psbO-derived abundances.
These products were constructed by merging data from various satellite sensors: Sea-viewing Wide Field-of-view Sensor (SeaWiFS), Moderate Resolution Imaging Spectroradiometer (MODIS), Visible Infrared Imaging Radiometer Suite (VIIRS), Medium Resolution Imaging Spectrometer (MERIS), and Ocean and Land Colour Instrument (OLCI). We used 16 GlobColour products as inputs to retrieve the phytoplankton community structure: chlorophyll a concentration (Chl a, product name: CHL1-AVW), remote sensing reflectances (Rrs) at 11 wavelengths (412, 443, 469, 490, 510, 531, 547, 555, 620, 645, and 670 nm), light attenuation coefficient at 490 nm (Kd490), photosynthetically available radiation (PAR), normalized fluorescence line height (NFLH), and particulate backscattering at 443 nm (bbp). These products have a daily temporal resolution and a 4 km spatial resolution. In addition, we used the Climate Change Initiative (CCI) sea surface temperature (SST) product at 4 km resolution and daily frequency, distributed through the Copernicus Marine Environment Monitoring Service (CMEMS) portal.

HPLC datasets
To compare the distribution of satellite-derived phytoplankton group Chl a fractions (outputs of the SOMChlF algorithm) with more conventional products based on diagnostic pigment analysis (DPA), we compiled a global HPLC dataset gathering 12 000 HPLC observations from several HPLC datasets collected between 1997 and 2014. This HPLC dataset was collocated with the SOMChlF-based ChlaPG. It reports the abundance of the pigments most widely used to identify major phytoplankton groups: fucoxanthin (Fuco), peridinin (Perid), alloxanthin (Allo), zeaxanthin (Zea), chlorophyll b (Chl b), 19-hexanoyloxyfucoxanthin (19HF), and 19-butanoyloxyfucoxanthin (19BF). Diagnostic pigments were used to estimate the Chl a fraction for each phytoplankton group, namely diatoms, dinoflagellates, haptophytes, green algae, cryptophytes, pelagophytes, and prokaryotes.
The Chl a fraction per group is expressed as HPLC-based ChlaPG = Chla in situ · (DP · α) / Σ (DP · α), where α is the coefficient associated with the diagnostic pigment (DP) of a specific phytoplankton group (PG), and the sum runs over all diagnostic pigments. Full details are given in El Hourany et al. (2024), Ocean Sci., 20, 217–239, https://doi.org/10.5194/os-20-217-2024.
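The HPLC-based partition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The pigment-to-group pairing follows the usual diagnostic pigment convention and is an assumption here, since the description lists pigments and groups without pairing them explicitly. The function name `hplc_chla_pg`, the dictionary keys, and the α values (all set to 1.0 as placeholders) are hypothetical; the coefficients actually used are given in El Hourany et al. (2024).

```python
from typing import Dict

# Assumed pairing of each group with one diagnostic pigment (conventional
# DPA pairing; not stated explicitly in the dataset description).
GROUP_PIGMENT = {
    "diatoms": "Fuco",
    "dinoflagellates": "Perid",
    "cryptophytes": "Allo",
    "prokaryotes": "Zea",
    "green_algae": "Chl_b",
    "haptophytes": "19HF",
    "pelagophytes": "19BF",
}


def hplc_chla_pg(
    chla_in_situ: float,
    pigments: Dict[str, float],
    alpha: Dict[str, float],
) -> Dict[str, float]:
    """Split in situ Chl a among groups: Chla * (DP * alpha) / sum(DP * alpha)."""
    # Weighted diagnostic pigment concentration for each group.
    weighted = {
        group: pigments[pig] * alpha[pig] for group, pig in GROUP_PIGMENT.items()
    }
    total = sum(weighted.values())
    if total <= 0:
        raise ValueError("Sum of weighted diagnostic pigments must be positive")
    # Each group receives its share of the measured Chl a.
    return {group: chla_in_situ * w / total for group, w in weighted.items()}


# Illustrative pigment concentrations (made-up values, same units as Chl a).
pigments = {"Fuco": 0.20, "Perid": 0.0, "Allo": 0.0, "Zea": 0.10,
            "Chl_b": 0.0, "19HF": 0.10, "19BF": 0.0}
alpha = {name: 1.0 for name in pigments}  # placeholder coefficients

chla_pg = hplc_chla_pg(0.8, pigments, alpha)
for group, value in chla_pg.items():
    print(f"{group}: {value:.3f}")
```

By construction the group values sum to the in situ Chl a, so the formula partitions the measured concentration rather than estimating each group independently.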
Created:
2024-07-25