Data

Gene content of seawater microbes is a strong predictor of water chemistry across the Great Barrier Reef

James Cook University
Terzin, Marko; Gruber, Renee K. (data creator); Robbins, Steven J. (external advisor); Bell, Sara C. (data creator); Webster, Nicole S. (external advisor); Bourne, David; Laffy, Patrick; Yeoh, Yun Kit
Publisher: Australian Institute of Marine Science (AIMS), Australia's Integrated Marine Observing System (IMOS)

Related publication: https://link.springer.com/article/10.1186/s40168-024-01972-0#Fun

Language: English

Subjects: Coral reefs; Seawater microbiome; Synechococcus; Prochlorococcus; Microbial loop; Metagenomics; Environmental monitoring; Microbial indicators; Great Barrier Reef; Microbial ecology; Microbiology; BIOLOGICAL SCIENCES; Microbiology not elsewhere classified; Ecosystem function; Ecological applications; ENVIRONMENTAL SCIENCES; Environmental assessment and monitoring; Environmental management; Ecological impacts of climate change and ecological adaptation; Climate change impacts and adaptation; Bioinformatics and computational biology not elsewhere classified; Bioinformatics and computational biology; Measurement and assessment of marine water quality and condition; Marine systems and management; ENVIRONMENTAL MANAGEMENT; Expanding knowledge in the biological sciences; Expanding knowledge; EXPANDING KNOWLEDGE; Expanding knowledge in the environmental sciences

Licence & Rights:

Open Licence
CC-BY-SA

CC BY-SA: Attribution-Share Alike 3.0 AU
http://creativecommons.org/licenses/by-sa/3.0/au

The dataset is copyright and is licensed under the Creative Commons Attribution 3.0 Australia Licence. You are free to copy, distribute, and adapt the work, provided you attribute the source as specified by the data owners.

Access:

Open: free access under license

Full description

This dataset consists of paired environmental measurements and metagenomic sequence data from surface seawater collected at 48 offshore reef sites across the Great Barrier Reef (GBR) during four research trips (November 2019 – July 2020). It is a read-based metagenomic exploration contributing to the IMOS Great Barrier Reef Microbial Genomics Database (GBR-MGD), aiming to characterise the relationships between seawater microbial community composition/function and key physico-chemical water quality parameters.

What the data consists of:

Metagenomic sequencing data: Raw Illumina NovaSeq reads (191 samples, ~150 bp paired-end) from microbial communities in seawater pre-filtered through 5 µm and captured on 0.22 µm Sterivex filters.

Physicochemical measurements: 17 variables measured for each site, including:

  • Nutrients: Ammonium (NH₄⁺), Nitrite (NO₂⁻), Nitrate (NO₃⁻), Phosphate (PO₄³⁻), Silicate (Si), Total Dissolved Nitrogen (TDN), Total Dissolved Phosphorus (TDP), Dissolved Organic Carbon (DOC), Particulate Organic Carbon (POC), Particulate Nitrogen (PN), Particulate Phosphorus (PP).
  • Other parameters: Temperature, Salinity, Chlorophyll a (Chl-a), Phaeophytin (Phaeo), Total Suspended Solids (TSS), and in-situ Chl-a fluorescence.

Sample metadata: Site coordinates (latitude/longitude), collection date, voyage identifier (Trips 1-4), and replicate information (3-4 replicates per site for water chemistry and metagenomics, respectively).

Process of analysis applied:

  • Metagenomic processing: Raw reads were quality-checked with FastQC and quality-filtered with Trimmomatic, taxonomically and functionally annotated using DIAMOND against the NCBI nr database, and profiled in MEGAN. Outputs include genus-level taxonomic counts and Gene Ontology (GO) term counts per sample.
  • Data normalisation: Microbial abundance data (taxonomic and functional) were centered log-ratio (CLR) transformed to account for the compositional nature of sequencing data.
  • Integration & stability analysis: Multivariate INTegrative Sparse Partial Least Squares (MINT sPLS) was used to identify microbial taxa and GO terms consistently associated with physicochemical variables across sampling trips. Leave-One-Group-Out Cross-Validation (LOGOCV) was applied to assign stability scores to these microbial indicators.
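The normalisation and cross-validation steps above can be illustrated with a minimal Python sketch. It is not the authors' code: the published analysis used the R packages listed further down (microbiome for CLR, mixOmics for MINT sPLS), and the function names, the pseudocount of 1.0, the example counts, and the trip labels here are all assumptions made for illustration. The sketch shows only the CLR formula (log of each count minus the mean log across features in a sample) and how leave-one-group-out folds hold out one sampling trip at a time.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def clr(counts: Sequence[float], pseudocount: float = 1.0) -> List[float]:
    """Centered log-ratio transform of one sample's count vector.

    A pseudocount is added so that zero counts have a defined logarithm.
    The value 1.0 is an assumption, not taken from the dataset record.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]


def leave_one_group_out(
    groups: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_group, train_indices, test_indices), one fold per group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Hypothetical genus-level counts: three samples (rows) by three taxa (columns).
samples = [[120, 30, 0], [80, 45, 5], [200, 10, 2]]
# Hypothetical trip label for each sample.
trips = ["Trip1", "Trip2", "Trip1"]

clr_samples = [clr(row) for row in samples]
folds = list(leave_one_group_out(trips))
```

Each CLR-transformed sample sums to zero by construction, which is a quick sanity check on any implementation. In a stability analysis of the kind described, a model would be refitted on each fold's training indices and the features selected in every fold would be counted; that selection step is omitted here because it depends on the modelling package.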

Software/equipment used to create/collect the data:

Seawater collection and filtration (metagenomics & physicochemical):

  • Niskin bottles
  • Divers (for some samples)
  • Sterivex™-GP Pressure Filters (0.22 µm pore size, Millipore)
  • Minisart® syringe prefilters (5 µm pore size, Sartorius)
  • Peristaltic pump

Research vessels:

  • RV Solander
  • RV Cape Ferguson (AIMS Long-Term Monitoring Program vessels)

In-situ sensor data (physicochemical – underway systems):

  • SBE 38 digital oceanographic thermometer (Sea-Bird Scientific)
  • SBE 21 SeaCAT Thermosalinograph (Sea-Bird Scientific)
  • ECO-FLNTU-RT fluorometer (WET Labs) — part of IMOS Ships of Opportunity Sensors

Laboratory processing:

Water Chemistry (physicochemical):

  • Filtration: Whatman GF/F filters (0.7 µm), Polycarbonate filters (0.4 µm), Sartorius Minisart syringe filters (0.45 µm)
  • Analyzers: Seal AA3 segmented flow analyzer (for NH₄⁺, NO₂⁻, NO₃⁻, PO₄³⁻, Si)
  • Carbon/Nitrogen analyzer: Shimadzu TOC-L carbon analyzer with SSM-5000A solid sample module (POC) and TNM-L nitrogen module (PN)
  • Fluorometer: Turner Designs 10AU fluorometer (for Chl-a and Phaeo)
  • Digestion: Hot acid persulfate digestion (for PP analysis)
  • Gravimetric analysis: For Total Suspended Solids (TSS)

Metagenomics:

  • DNA extraction: lysozyme (100 mg/mL) in lysis buffer, followed by phenol:chloroform:isoamyl alcohol extraction and ethanol precipitation
  • NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific)
  • Qubit 3 fluorometer (Thermo Fisher Scientific)

Sequencing facility: Microba Life Sciences Ltd. (Brisbane, QLD)

Sequencing platform: Illumina NovaSeq with Nextera Flex library preparation

Software/equipment used to manipulate/analyse the data:

  • Quality control & read processing: FastQC v0.11.3, Trimmomatic v0.38
  • Metagenomic annotation: DIAMOND alignment tool v2.0.9 (vs. NCBI nr database)
  • Taxonomic/functional profiling: MEGAN Community Edition v6.23.0
  • Statistical analysis & visualization environment: R v4.3.2 / RStudio

R packages:

  • phyloseq v1.46.0 (microbiome data handling)
  • mixOmics v6.26.0 (PCA, sPLS, MINT sPLS)
  • vegan v2.6-4 (PERMANOVA, Mantel tests, Bray-Curtis)
  • tidyverse v2.0.0 (data wrangling)
  • ggplot2 v3.5.1 (graphics)
  • pairwiseAdonis (PERMANOVA pairwise tests)
  • microbiome v1.24.0 (CLR normalization)
  • ggspatial, sf, raster, dataaimsr, gisaimsr (mapping)

General software:

  • Inkscape v0.92.5 (figure finalization)

Analysis code repository: https://github.com/mterzin/IMOS_GBR_MGD_read-centric_analysis

Created: 2024-10-24

Data time period: 23 November 2019 to 17 July 2020

This dataset is part of a larger collection

dcmiPoint: east=147.29475; north=-18.62552; projection=WGS84

dcmiPoint: east=147.43341; north=-18.62166; projection=WGS84

dcmiPoint: east=147.55211; north=-18.73423; projection=WGS84

dcmiPoint: east=147.72155; north=-18.73288; projection=WGS84

dcmiPoint: east=147.5709; north=-18.6065; projection=WGS84

dcmiPoint: east=147.57555; north=-18.57226; projection=WGS84

dcmiPoint: east=147.71591; north=-18.65388; projection=WGS84

dcmiPoint: east=147.38445; north=-18.2575; projection=WGS84

dcmiPoint: east=147.07521; north=-18.61726; projection=WGS84

dcmiPoint: east=146.87271; north=-18.4731; projection=WGS84

dcmiPoint: east=147.05325; north=-18.47656; projection=WGS84

dcmiPoint: east=147.0545; north=-18.43503; projection=WGS84

dcmiPoint: east=146.98581; north=-18.42411; projection=WGS84

dcmiPoint: east=146.99845; north=-18.46205; projection=WGS84

dcmiPoint: east=146.56712; north=-17.81295; projection=WGS84

dcmiPoint: east=146.53186; north=-17.7896; projection=WGS84

dcmiPoint: east=146.38676; north=-17.57812; projection=WGS84

dcmiPoint: east=146.40219; north=-17.46753; projection=WGS84

dcmiPoint: east=146.47807; north=-17.20274; projection=WGS84

dcmiPoint: east=146.47433; north=-17.22383; projection=WGS84

dcmiPoint: east=146.23148; north=-16.8474; projection=WGS84

dcmiPoint: east=146.19447; north=-16.77893; projection=WGS84

dcmiPoint: east=146.1028; north=-16.63941; projection=WGS84

dcmiPoint: east=146.0159; north=-16.50097; projection=WGS84

dcmiPoint: east=145.86415; north=-16.04517; projection=WGS84

dcmiPoint: east=151.90408; north=-23.1775; projection=WGS84

dcmiPoint: east=152.50674; north=-21.86969; projection=WGS84

dcmiPoint: east=152.58116; north=-21.94489; projection=WGS84

dcmiPoint: east=152.65945; north=-21.99904; projection=WGS84

dcmiPoint: east=152.46706; north=-21.99809; projection=WGS84

dcmiPoint: east=152.31361; north=-21.96005; projection=WGS84

dcmiPoint: east=151.9245; north=-23.2657; projection=WGS84

dcmiPoint: east=151.77; north=-23.504; projection=WGS84

dcmiPoint: east=151.72284; north=-23.53219; projection=WGS84

dcmiPoint: east=152.25743; north=-23.76276; projection=WGS84

dcmiPoint: east=152.2836; north=-23.80801; projection=WGS84

dcmiPoint: east=152.35922; north=-23.86598; projection=WGS84

dcmiPoint: east=145.84686; north=-16.07261; projection=WGS84

dcmiPoint: east=144.87929; north=-14.20223; projection=WGS84

dcmiPoint: east=144.09326; north=-13.8559; projection=WGS84

dcmiPoint: east=144.24026; north=-13.92542; projection=WGS84

dcmiPoint: east=144.44693; north=-13.96599; projection=WGS84

dcmiPoint: east=143.75682; north=-12.38134; projection=WGS84

dcmiPoint: east=143.8674; north=-12.3457; projection=WGS84

dcmiPoint: east=143.29448; north=-11.59805; projection=WGS84

dcmiPoint: east=143.32346; north=-11.35312; projection=WGS84

dcmiPoint: east=143.26387; north=-11.12721; projection=WGS84

dcmiPoint: east=143.24783; north=-11.0198; projection=WGS84
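The site locations above use the DCMI Point text encoding, in which `east` is longitude and `north` is latitude in decimal degrees. A minimal Python sketch for turning these lines into coordinate pairs follows; the function names are hypothetical and the two input lines are copied from the list above.

```python
from typing import Dict, List, Tuple


def parse_dcmi_point(text: str) -> Dict[str, str]:
    """Split a 'key=value; key=value' DCMI Point string into a dict."""
    # Drop the leading 'dcmiPoint:' label if it is present.
    if text.lower().startswith("dcmipoint:"):
        text = text.split(":", 1)[1]
    fields: Dict[str, str] = {}
    for part in text.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


def to_lon_lat(text: str) -> Tuple[float, float]:
    """Return (longitude, latitude) from one DCMI Point line."""
    fields = parse_dcmi_point(text)
    return float(fields["east"]), float(fields["north"])


lines: List[str] = [
    "dcmiPoint: east=147.29475; north=-18.62552; projection=WGS84",
    "dcmiPoint: east=143.24783; north=-11.0198; projection=WGS84",
]
points = [to_lon_lat(line) for line in lines]
```

The resulting `(longitude, latitude)` tuples can be passed to most mapping libraries, which generally expect x before y; check the expected axis order of whichever library is used.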

Identifiers
  • Local : researchdata.jcu.edu.au//published/c260f760097e11f1ac7ecf6b4103ada8