Full description
This dataset consists of paired environmental measurements and metagenomic sequence data from surface seawater collected at 48 offshore reef sites across the Great Barrier Reef (GBR) during four research trips (November 2019 – July 2020). It is a read-based metagenomic exploration contributing to the IMOS Great Barrier Reef Microbial Genomics Database (GBR-MGD), aiming to characterise the relationships between seawater microbial community composition/function and key physico-chemical water quality parameters.
What the data consists of:
Metagenomic sequencing data: Raw Illumina NovaSeq reads (191 samples, ~150 bp paired-end) from microbial communities in seawater pre-filtered through 5 µm and captured on 0.22 µm Sterivex filters.
Physicochemical measurements: 17 variables measured for each site, including:
- Nutrients: Ammonium (NH₄⁺), Nitrite (NO₂⁻), Nitrate (NO₃⁻), Phosphate (PO₄³⁻), Silicate (Si), Total Dissolved Nitrogen (TDN), Total Dissolved Phosphorus (TDP), Dissolved Organic Carbon (DOC), Particulate Organic Carbon (POC), Particulate Nitrogen (PN), Particulate Phosphorus (PP).
- Other parameters: Temperature, Salinity, Chlorophyll a (Chl-a), Phaeophytin (Phaeo), Total Suspended Solids (TSS), and in-situ Chl-a fluorescence.
Sample metadata: Site coordinates (latitude/longitude), collection date, voyage identifier (Trips 1-4), and replicate information (3 replicates per site for water chemistry and 4 for metagenomics).
Process of analysis applied:
- Metagenomic processing: Raw reads were quality-checked with FastQC, quality-filtered with Trimmomatic, taxonomically and functionally annotated using DIAMOND against the NCBI nr database, and profiled in MEGAN. Outputs include genus-level taxonomic counts and Gene Ontology (GO) term counts per sample.
- Data normalisation: Microbial abundance data (taxonomic and functional) were centred log-ratio (CLR) transformed to account for the compositional nature of sequencing data.
- Integration & stability analysis: Multivariate INTegrative Sparse Partial Least Squares (MINT sPLS) was used to identify microbial taxa and GO terms consistently associated with physicochemical variables across sampling trips. Leave-One-Group-Out Cross-Validation (LOGOCV) was applied to assign stability scores to these microbial indicators.
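The normalisation and cross-validation steps above can be illustrated with a minimal Python sketch. It is not the analysis code for this dataset (that was run in R with the packages listed below); it only shows the arithmetic of a CLR transform and how leave-one-group-out folds are formed when each sampling trip is one group. The pseudocount of 1, the function names, the toy counts, and the definition of a stability score as the fraction of folds in which a feature is selected are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def clr(counts, pseudocount=1.0):
    """Centred log-ratio transform of one sample's feature counts.

    The pseudocount (an assumption here) keeps log() defined for zero counts.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices), one fold per group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


def stability_scores(selected_per_fold):
    """Fraction of folds in which each feature was selected (assumed definition)."""
    tally = defaultdict(int)
    for selected in selected_per_fold:
        for feature in set(selected):
            tally[feature] += 1
    n_folds = len(selected_per_fold)
    return {feature: hits / n_folds for feature, hits in tally.items()}


# Hypothetical toy data: counts for one sample and trip labels for six samples.
clr_values = clr([120, 30, 0, 50])
trips = ["Trip1", "Trip1", "Trip2", "Trip3", "Trip4", "Trip4"]
folds = list(leave_one_group_out(trips))

# Hypothetical features selected by a sparse model fitted on each of four folds.
scores = stability_scores([["A", "B"], ["A"], ["A", "C"], ["A", "B"]])
```

CLR values for a sample sum to zero by construction, which is a quick sanity check after transformation. Holding out a whole trip at a time, rather than random samples, tests whether an association found on three trips still holds on the fourth.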
Software/equipment used to create/collect the data:
Seawater collection and filtration (metagenomics & physicochemical):
- Niskin bottles
- Divers (for some samples)
- Sterivex™-GP Pressure Filters (0.22 µm pore size, Millipore)
- Minisart® syringe prefilters (5 µm pore size, Sartorius)
- Peristaltic pump
Research vessels:
- RV Solander
- RV Cape Ferguson (AIMS Long-Term Monitoring Program vessels)
In-situ sensor data (physicochemical – underway systems):
- SBE 38 digital oceanographic thermometer (Sea-Bird Scientific)
- SBE 21 SeaCAT Thermosalinograph (Sea-Bird Scientific)
- ECO-FLNTU-RT fluorometer (WET Labs) — part of IMOS Ships of Opportunity Sensors
Laboratory processing:
Water Chemistry (physicochemical):
- Filtration: Whatman GF/F filters (0.7 µm), Polycarbonate filters (0.4 µm), Sartorius Minisart syringe filters (0.45 µm)
- Analyzers: Seal AA3 segmented flow analyzer (for NH₄⁺, NO₂⁻, NO₃⁻, PO₄³⁻, Si)
- Carbon/Nitrogen analyzer: Shimadzu TOC-L carbon analyzer with SSM-5000A solid sample module (POC) and TNM-L nitrogen module (PN)
- Fluorometer: Turner Designs 10AU fluorometer (for Chl-a and Phaeo)
- Digestion: Hot acid persulfate digestion (for PP analysis)
- Gravimetric analysis: For Total Suspended Solids (TSS)
Metagenomics:
- DNA extraction: Lysozyme (100 mg/mL) in lysis buffer, followed by phenol:chloroform:isoamyl alcohol extraction and ethanol precipitation
- NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific)
- Qubit 3 fluorometer (Thermo Fisher Scientific)
Sequencing facility: Microba Life Sciences Ltd. (Brisbane, QLD)
Sequencing platform: Illumina NovaSeq with Nextera Flex library preparation
Software/equipment used to manipulate/analyse the data:
- Quality control & read processing: FastQC v0.11.3, Trimmomatic v0.38
- Metagenomic annotation: DIAMOND alignment tool v2.0.9 (vs. NCBI nr database)
- Taxonomic/functional profiling: MEGAN Community Edition v6.23.0
- Statistical analysis & visualization environment: R v4.3.2 / RStudio
R packages:
- phyloseq v1.46.0 (microbiome data handling)
- mixOmics v6.26.0 (PCA, sPLS, MINT sPLS)
- vegan v2.6-4 (PERMANOVA, Mantel tests, Bray-Curtis)
- tidyverse v2.0.0 (data wrangling)
- ggplot2 v3.5.1 (graphics)
- pairwiseAdonis (PERMANOVA pairwise tests)
- microbiome v1.24.0 (CLR normalization)
- ggspatial, sf, raster, dataaimsr, gisaimsr (mapping)
- General software: Inkscape v0.92.5 (figure finalization)
Analysis code repository: https://github.com/mterzin/IMOS_GBR_MGD_read-centric_analysis
Created: 2024-10-24
Data time period: 23 November 2019 to 17 July 2020
dcmiPoint: east=147.29475; north=-18.62552; projection=WGS84
dcmiPoint: east=147.43341; north=-18.62166; projection=WGS84
dcmiPoint: east=147.55211; north=-18.73423; projection=WGS84
dcmiPoint: east=147.72155; north=-18.73288; projection=WGS84
dcmiPoint: east=147.5709; north=-18.6065; projection=WGS84
dcmiPoint: east=147.57555; north=-18.57226; projection=WGS84
dcmiPoint: east=147.71591; north=-18.65388; projection=WGS84
dcmiPoint: east=147.38445; north=-18.2575; projection=WGS84
dcmiPoint: east=147.07521; north=-18.61726; projection=WGS84
dcmiPoint: east=146.87271; north=-18.4731; projection=WGS84
dcmiPoint: east=147.05325; north=-18.47656; projection=WGS84
dcmiPoint: east=147.0545; north=-18.43503; projection=WGS84
dcmiPoint: east=146.98581; north=-18.42411; projection=WGS84
dcmiPoint: east=146.99845; north=-18.46205; projection=WGS84
dcmiPoint: east=146.56712; north=-17.81295; projection=WGS84
dcmiPoint: east=146.53186; north=-17.7896; projection=WGS84
dcmiPoint: east=146.38676; north=-17.57812; projection=WGS84
dcmiPoint: east=146.40219; north=-17.46753; projection=WGS84
dcmiPoint: east=146.47807; north=-17.20274; projection=WGS84
dcmiPoint: east=146.47433; north=-17.22383; projection=WGS84
dcmiPoint: east=146.23148; north=-16.8474; projection=WGS84
dcmiPoint: east=146.19447; north=-16.77893; projection=WGS84
dcmiPoint: east=146.1028; north=-16.63941; projection=WGS84
dcmiPoint: east=146.0159; north=-16.50097; projection=WGS84
dcmiPoint: east=145.86415; north=-16.04517; projection=WGS84
dcmiPoint: east=151.90408; north=-23.1775; projection=WGS84
dcmiPoint: east=152.50674; north=-21.86969; projection=WGS84
dcmiPoint: east=152.58116; north=-21.94489; projection=WGS84
dcmiPoint: east=152.65945; north=-21.99904; projection=WGS84
dcmiPoint: east=152.46706; north=-21.99809; projection=WGS84
dcmiPoint: east=152.31361; north=-21.96005; projection=WGS84
dcmiPoint: east=151.9245; north=-23.2657; projection=WGS84
dcmiPoint: east=151.77; north=-23.504; projection=WGS84
dcmiPoint: east=151.72284; north=-23.53219; projection=WGS84
dcmiPoint: east=152.25743; north=-23.76276; projection=WGS84
dcmiPoint: east=152.2836; north=-23.80801; projection=WGS84
dcmiPoint: east=152.35922; north=-23.86598; projection=WGS84
dcmiPoint: east=145.84686; north=-16.07261; projection=WGS84
dcmiPoint: east=144.87929; north=-14.20223; projection=WGS84
dcmiPoint: east=144.09326; north=-13.8559; projection=WGS84
dcmiPoint: east=144.24026; north=-13.92542; projection=WGS84
dcmiPoint: east=144.44693; north=-13.96599; projection=WGS84
dcmiPoint: east=143.75682; north=-12.38134; projection=WGS84
dcmiPoint: east=143.8674; north=-12.3457; projection=WGS84
dcmiPoint: east=143.29448; north=-11.59805; projection=WGS84
dcmiPoint: east=143.32346; north=-11.35312; projection=WGS84
dcmiPoint: east=143.26387; north=-11.12721; projection=WGS84
dcmiPoint: east=143.24783; north=-11.0198; projection=WGS84
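For reuse in mapping tools, the site coordinates listed above can be read into numeric longitude/latitude pairs. The following is a minimal Python sketch, assuming every line follows the `east=...; north=...; projection=...` pattern shown in this record; the function names and the regular expression are illustrative and not part of the dataset.

```python
import re

# Pattern for lines such as:
# "dcmiPoint: east=147.29475; north=-18.62552; projection=WGS84"
POINT_PATTERN = re.compile(
    r"dcmiPoint:\s*east=(-?\d+(?:\.\d+)?);\s*north=(-?\d+(?:\.\d+)?);\s*projection=(\w+)"
)


def parse_dcmi_point(line):
    """Return (longitude, latitude, projection) from one dcmiPoint line."""
    match = POINT_PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a dcmiPoint line: {line!r}")
    east, north, projection = match.groups()
    return float(east), float(north), projection


def bounding_box(points):
    """Return (min_lon, min_lat, max_lon, max_lat) for parsed points."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return min(lons), min(lats), max(lons), max(lats)


# Two of the points listed above, copied verbatim.
lines = [
    "dcmiPoint: east=147.29475; north=-18.62552; projection=WGS84",
    "dcmiPoint: east=143.24783; north=-11.0198; projection=WGS84",
]
points = [parse_dcmi_point(line) for line in lines]
box = bounding_box(points)
```

Here `east` is longitude and `north` is latitude in decimal degrees, so southern-hemisphere latitudes are negative.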
- Local : researchdata.jcu.edu.au//published/c260f760097e11f1ac7ecf6b4103ada8
