Full description
This dataset consists of prokaryotic metagenome-assembled genomes (pMAGs) and paired environmental data from 48 offshore reefs of the Great Barrier Reef (GBR), used to compare microbial communities between No-Take Marine Reserves (NTMRs) and fished zones across the Great Barrier Reef Marine Park (GBRMP). The data were collected to assess how marine zoning influences seawater microbiome structure, function, and ecological networks in relation to reef health indicators.
What the data consists of:
Metagenomic assemblies: 5,283 prokaryotic metagenome-assembled genomes (pMAGs) from hybrid (Illumina-Nanopore) and short-read assemblies, dereplicated at 95% Average Nucleotide Identity (ANI) to 876 “species-resolved” pMAGs.
Environmental variables: 54 continuous variables including:
- Physicochemical variables: Ammonium (NH₄⁺), Nitrite (NO₂⁻), Nitrate (NO₃⁻), Phosphate (PO₄³⁻), Silicate (Si), Total Dissolved Nitrogen (TDN), Total Dissolved Phosphorus (TDP), Dissolved Organic Carbon (DOC), Particulate Organic Carbon (POC), Particulate Nitrogen (PN), Particulate Phosphorus (PP), Temperature, Salinity, Chlorophyll a (Chl-a), Phaeophytin (Phaeo), Total Suspended Solids (TSS), and in-situ Chl-a fluorescence.
- Benthic cover:
- Substrate: Abiotic, Sand, Unknown
- Algae: crustose coralline algae, macroalgae, turf algae
- Hard corals: Acropora groups (bottlebrush, branching, digitate, encrusting, submassive, tabulate), non-Acropora morphologies (branching, encrusting, foliose, massive, submassive), and other corals, including mushroom corals and the fire coral Millepora (a hydrozoan)
- Soft corals: arborescent, encrusting, arborescent and encrusting, capitate, lobate, massive
- Other biota: sponges, zoanthids, and other organisms
- Fish data:
- Overall fish biomass
- Abundances aggregated by trophic group: carnivore, corallivore, detritivore, herbivore, invertivore, piscivore
Sample metadata: Reef zoning status (NTMR vs. fished), geographic sector, sampling date, and replicate information.
Process of analysis applied:
- Genome assembly & annotation: Hybrid assembly with Aviary pipeline, taxonomic classification with GTDB-Tk, functional annotation with anvi’o and KEGG modules.
- Indicator analysis: Multivariate INTegrative Sparse Partial Least Squares Discriminant Analysis (MINT sPLS-DA) to identify microbial MAGs indicative of reef zoning. Multivariate INTegrative Sparse Partial Least Squares (MINT sPLS) to correlate microbial abundance data with continuous environmental variables.
- Network analysis: Co-occurrence network connectedness and cohesion metrics to compare positive/negative microbial interactions between zones. Regression of microbial genomic traits (genome size, GC content, and metabolic pathway completeness scores) against microbial network properties.
- Predictive modeling: Random Forest models to predict environmental variables from microbial abundances; microbial niche modeling (RO method) to infer tolerance ranges.
- Environmental comparisons: Generalized linear mixed models (GLMMs) to test for differences in environmental variables (physicochemical, benthic cover, and fish abundance/biomass) between zones.
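The niche modeling step infers tolerance ranges for each pMAG. As a minimal sketch, assuming the tolerance range is read as the abundance-weighted quartiles of an environmental variable, the calculation could look like the following. The function names and the temperature and abundance values are hypothetical, and the published analysis may differ in detail.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    # Samples where the taxon is absent (weight 0) do not inform its niche.
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    total = sum(w for _, w in pairs)
    if total == 0:
        raise ValueError("all weights are zero")
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative / total >= q:
            return value
    return pairs[-1][0]


def niche_quartiles(env_values, abundances):
    """Return (Q1, Q2, Q3) of an environmental variable, weighted by abundance."""
    return tuple(
        weighted_quantile(env_values, abundances, q) for q in (0.25, 0.5, 0.75)
    )


# Hypothetical example: temperature (degrees C) and the abundance of one pMAG
# across six samples. These numbers are illustrative only.
temperature = [24.0, 25.0, 26.0, 27.0, 28.0, 29.0]
abundance = [0, 1, 4, 4, 1, 0]

q1, q2, q3 = niche_quartiles(temperature, abundance)
print(q1, q2, q3)
```

Here Q2 serves as the estimated optimum and the Q1 to Q3 interval as the tolerance range under the stated assumption.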
Software/equipment used to create/collect the data:
Seawater collection and filtration (metagenomics & physicochemical):
- Niskin bottles
- Divers (for some samples)
- Sterivex™-GP Pressure Filters (0.22 µm pore size, Millipore)
- Minisart® syringe prefilters (5 µm pore size, Sartorius)
- Peristaltic pump
Research vessels:
- RV Solander
- RV Cape Ferguson (AIMS Long-Term Monitoring Program vessels)
In-situ sensors (part of the IMOS Ships of Opportunity underway systems):
- SBE 38 digital oceanographic thermometer (Sea-Bird Scientific)
- SBE 21 SeaCAT Thermosalinograph (Sea-Bird Scientific)
- ECO-FLNTU-RT fluorometer (WET Labs)
Standardised AIMS Long-Term Monitoring Program (LTMP) protocols:
- Benthic cover assessment (in-situ reef surveys):
- SCUBA surveys along permanently marked transects (5 × 50 m per site)
- Digital underwater cameras for benthic imagery
- Fixed-point (quincunx pattern) digital overlay for substrate classification
- Specialised point image classifier software for benthic cover analysis
- Fish abundance and biomass estimation (in-situ reef surveys):
- Visual belt transect surveys:
- 5 m width for large-bodied mobile fishes (250 m² area)
- 1 m width for small-bodied fishes (50 m² area)
- Species identification and size estimation using predefined size bins
- FishBase-derived length-weight relationships for fish biomass calculation
Laboratory processing:
- Water Chemistry (physicochemical):
- Filtration: Whatman GF/F filters (0.7 µm), Polycarbonate filters (0.4 µm), Sartorius Minisart syringe filters (0.45 µm)
- Analyzers: Seal AA3 segmented flow analyzer (for NH₄⁺, NO₂⁻, NO₃⁻, PO₄³⁻, Si)
- Carbon/Nitrogen analyzer: Shimadzu TOC-L carbon analyzer with SSM-5000A solid sample module (POC) and TNM-L nitrogen module (PN)
- Fluorometer: Turner Designs 10AU fluorometer (for Chl-a and Phaeo)
- Digestion: Hot acid persulfate digestion (for PP analysis)
- Gravimetric analysis: For Total Suspended Solids (TSS)
- Hybrid metagenomics:
- DNA extraction: phenol:chloroform:isoamyl alcohol extraction and ethanol precipitation, with lysozyme (100 mg/mL) in the lysis buffer
- NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific)
- Qubit 3 fluorometer (Thermo Fisher Scientific)
- Sequencing facility: Microba Life Sciences Ltd. (Brisbane, QLD)
- Sequencing platform: Illumina NovaSeq with Nextera Flex library preparation
- Long-read sequencing (subset of samples): Oxford Nanopore PromethION with R9.4 chemistry
Software/equipment used to manipulate/analyse the data:
Metagenome assembly, binning and taxonomic classification:
- Long-read processing: Guppy v5.0.16 (superaccuracy basecalling), Porechop (adapter/barcode trimming)
- Assembly pipeline: Aviary v0.3.3 for hybrid (Illumina-Nanopore) and short-read assemblies
- Binning & dereplication: CoverM v0.6 for 95% ANI dereplication
- Taxonomic classification: Genome Taxonomy Database Toolkit (GTDB-Tk, release R214)
- Read mapping: minimap2 v2.18 (via CoverM) for abundance estimation
- Functional annotation and metabolic profiling: anvi’o v8 with Prodigal v2.6.3 (ORF prediction) and HMMER v3.3.2, with genes annotated against the KEGG Orthology (KO) database via anvi-run-kegg-kofams
- Metabolic pathway analysis: anvi-estimate-metabolism for KEGG module completeness
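To illustrate what a module completeness score expresses, here is a minimal sketch that assumes a pathwise definition: for each alternative route through a module, take the fraction of its KOs found in the genome, then keep the best route. The KO labels, the module layout, and the function names are hypothetical placeholders, and anvi-estimate-metabolism applies more detailed rules than this.

```python
def path_completeness(path, present_kos):
    """Fraction of the KOs in one path that were annotated in the genome."""
    return sum(1 for ko in path if ko in present_kos) / len(path)


def module_completeness(paths, present_kos):
    """Completeness of a module, taken as the best score over its paths."""
    return max(path_completeness(path, present_kos) for path in paths)


# Hypothetical module with two alternative routes (placeholder KO labels,
# not real KEGG identifiers).
module_paths = [
    ["KO_A", "KO_B", "KO_C"],
    ["KO_A", "KO_D"],
]
# Hypothetical set of KOs annotated in one pMAG.
genome_kos = {"KO_A", "KO_B"}

score = module_completeness(module_paths, genome_kos)
print(round(score, 3))
```

Scores of this kind range from 0 to 1 per module and per genome.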
Environmental Data processing:
- Benthic cover analysis: Specialised point image classifier software for digital image analysis
- Fish biomass calculation: FishBase-derived length-weight relationships, custom R scripts for biomass aggregation
- Physicochemical data was obtained from the AIMS Water Quality team
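The fish biomass calculation relies on length-weight relationships, which conventionally take the form W = a × L^b (weight in grams, length in centimetres). The sketch below shows that conversion and a per-area standardisation using the 5 m × 50 m belt described for large-bodied fishes. The coefficients, the size classes, and the counts are hypothetical, and representing each size bin by a single length is an assumption of this example.

```python
def fish_weight_g(length_cm, a, b):
    """Weight in grams from the length-weight relationship W = a * L**b."""
    return a * length_cm ** b


def biomass_density(counts_by_length, a, b, transect_area_m2):
    """Biomass per unit area (g per square metre) for one species on one transect.

    counts_by_length maps a representative length (cm) for each size bin
    to the number of individuals counted in that bin.
    """
    total_g = sum(
        count * fish_weight_g(length_cm, a, b)
        for length_cm, count in counts_by_length.items()
    )
    return total_g / transect_area_m2


# Hypothetical coefficients; real values are species-specific (e.g. from FishBase).
a, b = 0.02, 3.0
# Hypothetical counts: four fish of about 10 cm and one of about 20 cm.
counts = {10.0: 4, 20.0: 1}
# Belt area for large-bodied fishes: 5 m wide x 50 m long = 250 square metres.
belt_area_m2 = 5 * 50

density = biomass_density(counts, a, b, belt_area_m2)
print(round(density, 2))
```

Dividing by belt area puts counts from the 5 m and 1 m belts on a common per-square-metre scale before aggregation.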
Microbiome and environmental data integration (Analysis environment: R v4.3.2 / RStudio):
- Microbiome data handling: phyloseq v1.46.0, microbiome v1.24.0 (CLR transformation)
- Multivariate analysis: mixOmics v6.26.0 (PCA, sPLS-DA, MINT sPLS-DA, MINT sPLS)
- Generalized linear mixed models: glmmTMB v1.1.10, DHARMa v0.4.7 for diagnostics
- Machine learning: randomForest v4.7-1.2 for predictive modeling of environmental metrics from microbiomes
- Co-occurrence networks: Custom R implementation of connectedness/cohesion metrics (following methods of Hernandez et al. 2021 and Herren and McMahon 2017)
- Microbial niche modeling: Robust optimum (RO) method for computing niche tolerance ranges (Q1, Q2, Q3)
- Network modularity: Calculated from correlation-based co-occurrence networks
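The centred log-ratio (CLR) transformation listed above can be illustrated with a short sketch: each value is log-transformed and the mean of the logs for that sample is subtracted. The pseudocount used here to handle zeros is an assumption of this example and is not necessarily how the microbiome package treats them.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centred log-ratio transform of one sample's abundance vector.

    A pseudocount is added before taking logs so that zero counts are
    defined (an assumption of this sketch).
    """
    logs = [math.log(count + pseudocount) for count in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical read counts for three pMAGs in one sample.
sample = [0, 9, 99]
transformed = clr(sample)
print([round(value, 3) for value in transformed])
```

CLR values within a sample sum to zero, so each value expresses abundance relative to the sample's geometric mean rather than as a raw proportion.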
Data wrangling & visualization: tidyverse v2.0.0, ggplot2 v3.5.1
Spatial mapping: ggspatial, sf, raster, dataaimsr, gisaimsr
General software for figure finalization: Inkscape v0.92.5
Analysis code availability: https://github.com/mterzin/fishy_microbes
Created: 2025-09-25
Data time period: 23 November 2019 to 17 July 2020
dcmiPoint: east=147.29475; north=-18.62552; projection=WGS84
dcmiPoint: east=147.43341; north=-18.62166; projection=WGS84
dcmiPoint: east=147.55211; north=-18.73423; projection=WGS84
dcmiPoint: east=147.72155; north=-18.73288; projection=WGS84
dcmiPoint: east=147.5709; north=-18.6065; projection=WGS84
dcmiPoint: east=147.57555; north=-18.57226; projection=WGS84
dcmiPoint: east=147.71591; north=-18.65388; projection=WGS84
dcmiPoint: east=147.38445; north=-18.2575; projection=WGS84
dcmiPoint: east=147.07521; north=-18.61726; projection=WGS84
dcmiPoint: east=146.87271; north=-18.4731; projection=WGS84
dcmiPoint: east=147.05325; north=-18.47656; projection=WGS84
dcmiPoint: east=147.0545; north=-18.43503; projection=WGS84
dcmiPoint: east=146.98581; north=-18.42411; projection=WGS84
dcmiPoint: east=146.99845; north=-18.46205; projection=WGS84
dcmiPoint: east=146.56712; north=-17.81295; projection=WGS84
dcmiPoint: east=146.53186; north=-17.7896; projection=WGS84
dcmiPoint: east=146.38676; north=-17.57812; projection=WGS84
dcmiPoint: east=146.40219; north=-17.46753; projection=WGS84
dcmiPoint: east=146.47807; north=-17.20274; projection=WGS84
dcmiPoint: east=146.47433; north=-17.22383; projection=WGS84
dcmiPoint: east=146.23148; north=-16.8474; projection=WGS84
dcmiPoint: east=146.19447; north=-16.77893; projection=WGS84
dcmiPoint: east=146.1028; north=-16.63941; projection=WGS84
dcmiPoint: east=146.0159; north=-16.50097; projection=WGS84
dcmiPoint: east=145.86415; north=-16.04517; projection=WGS84
dcmiPoint: east=151.90408; north=-23.1775; projection=WGS84
dcmiPoint: east=152.50674; north=-21.86969; projection=WGS84
dcmiPoint: east=152.58116; north=-21.94489; projection=WGS84
dcmiPoint: east=152.65945; north=-21.99904; projection=WGS84
dcmiPoint: east=152.46706; north=-21.99809; projection=WGS84
dcmiPoint: east=152.31361; north=-21.96005; projection=WGS84
dcmiPoint: east=151.9245; north=-23.2657; projection=WGS84
dcmiPoint: east=151.77; north=-23.504; projection=WGS84
dcmiPoint: east=151.72284; north=-23.53219; projection=WGS84
dcmiPoint: east=152.25743; north=-23.76276; projection=WGS84
dcmiPoint: east=152.2836; north=-23.80801; projection=WGS84
dcmiPoint: east=152.35922; north=-23.86598; projection=WGS84
dcmiPoint: east=145.84686; north=-16.07261; projection=WGS84
dcmiPoint: east=144.87929; north=-14.20223; projection=WGS84
dcmiPoint: east=144.09326; north=-13.8559; projection=WGS84
dcmiPoint: east=144.24026; north=-13.92542; projection=WGS84
dcmiPoint: east=144.44693; north=-13.96599; projection=WGS84
dcmiPoint: east=143.75682; north=-12.38134; projection=WGS84
dcmiPoint: east=143.8674; north=-12.3457; projection=WGS84
dcmiPoint: east=143.29448; north=-11.59805; projection=WGS84
dcmiPoint: east=143.32346; north=-11.35312; projection=WGS84
dcmiPoint: east=143.26387; north=-11.12721; projection=WGS84
dcmiPoint: east=143.24783; north=-11.0198; projection=WGS84
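The site coordinates above follow a simple `key=value` layout separated by semicolons. A minimal sketch for reading them into (longitude, latitude, projection) tuples is given below; the function name is hypothetical, and only two of the listed points are included for brevity.

```python
def parse_dcmi_point(line):
    """Parse one 'dcmiPoint: east=...; north=...; projection=...' line.

    Returns (east, north, projection), i.e. longitude, latitude and the
    projection label as given in the record.
    """
    _, _, body = line.partition(":")
    fields = {}
    for part in body.split(";"):
        key, separator, value = part.partition("=")
        if separator:
            fields[key.strip()] = value.strip()
    return float(fields["east"]), float(fields["north"]), fields.get("projection")


# Two of the points listed in this record.
lines = [
    "dcmiPoint: east=147.29475; north=-18.62552; projection=WGS84",
    "dcmiPoint: east=143.24783; north=-11.0198; projection=WGS84",
]
points = [parse_dcmi_point(line) for line in lines]

# Bounding box of the parsed points: (min_lon, min_lat, max_lon, max_lat).
longitudes = [east for east, _, _ in points]
latitudes = [north for _, north, _ in points]
bbox = (min(longitudes), min(latitudes), max(longitudes), max(latitudes))
print(points)
print(bbox)
```

The same function can be applied to all 48 lines to map the sampled reefs or to join coordinates to the sample metadata.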
- Local : researchdata.jcu.edu.au//published/7136fca0098711f1ac7ecf6b4103ada8
