Data from: Conservation prioritisation of genomic diversity to inform management of a declining mammal species

Published: 2024

The University of Western Australia

von Takach, Brenton ; Cameron, Skye F. ; Cremona, Teigan ; Eldridge, Mark D. B. ; Fisher, Diana O. ; Hohnen, Rosemary ; Jolly, Chris J. ; Kelly, Ella ; Phillips, Ben L. ; Radford, Ian J. ; Rick, Kate ; Spencer, Peter B. S. ; Trewella, Gavin J. ; Umbrello, Linette S. ; Banks, Sam C.

Access the data

Access data via landing page
http://doi.org/10.5281...

Access:

Open

Full description

In our present age of extinction, conservation managers must use limited resources efficiently to conserve species and the genetic diversity within them. To conserve intraspecific variation, we must understand the geographic distribution of the variation and plan management actions that will cost-effectively maximise its retention. Here, we use a genome-wide single-nucleotide polymorphism (SNP) dataset consisting of 12,962 loci and 384 individuals to inform conservation management of the Endangered northern quoll (Dasyurus hallucatus), a carnivorous marsupial distributed patchily across northern Australia. Many northern quoll populations have declined or are currently declining, driven by the range-expanding cane toad (Rhinella marina). We (1) confirm population genomic structure, (2) investigate the contribution of each population to overall diversity, (3) conduct genomic prioritisation analyses at several spatial and hierarchical scales using popular conservation planning algorithms, and (4) investigate patterns of inbreeding. We find that the conservation of a single population, or even several populations, will not prevent the loss of substantial amounts of genomic variation and adaptive capacity. Rather, the conservation of at least eight populations from across the species distribution is necessary to retain 90% of SNP alleles. We also show that more geographically isolated populations, such as those on islands, have very small contributions to overall diversity and show relatively high levels of inbreeding compared to mainland populations. Our study highlights the importance of conserving multiple genetically distinct populations to effectively conserve genetic diversity in species undergoing widespread declines, and demonstrates the importance of using multiple criteria to inform and prioritise conservation management. 
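The idea of counting how many populations are needed to retain a target share of SNP alleles can be illustrated with a small sketch. The description does not name the conservation planning algorithms used, so the greedy complementarity rule below is a stand-in chosen for simplicity, not the authors' method. The function name, the input layout (population name mapped to the set of alleles observed there), and the toy data are all hypothetical.

```python
def greedy_allele_retention(pop_alleles, target=0.9):
    """Greedily pick populations until `target` of all alleles is retained.

    pop_alleles: dict mapping a population name to the set of alleles
    observed in it (hypothetical layout, for illustration only).
    Returns (chosen population names, fraction of alleles retained).
    """
    all_alleles = set().union(*pop_alleles.values())
    if not all_alleles:
        return [], 0.0

    covered, chosen = set(), []
    while len(covered) < target * len(all_alleles):
        # Pick the population adding the most not-yet-covered alleles;
        # ties go to the first population in dictionary order.
        best = max(pop_alleles, key=lambda p: len(pop_alleles[p] - covered))
        if not pop_alleles[best] - covered:
            break  # nothing left to gain
        chosen.append(best)
        covered |= pop_alleles[best]
    return chosen, len(covered) / len(all_alleles)


# Toy data: two SNPs (s1, s2), each with a ref and an alt allele.
toy = {
    "A": {"s1_ref", "s1_alt", "s2_ref"},
    "B": {"s1_ref", "s2_ref", "s2_alt"},
    "C": {"s1_ref", "s2_ref"},
}
chosen, retained = greedy_allele_retention(toy, target=0.9)
```

In the toy data no single population holds 90% of the four alleles, so two populations are selected. A greedy rule is not guaranteed to find the smallest possible set; dedicated planning tools address that and can also account for cost.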
Funding provided by: Australian Research Council (Crossref Funder Registry ID: https://ror.org/05mmh0f86)
Funding provided by: Bioplatforms Australia (Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100023560)

In total, we processed 568 samples (including 36 technical replicates) for DNA extraction and sequencing. Tissue samples were obtained from researchers who were working across the entire distribution range of the northern quoll. Tissue collection spanned nearly three decades, from 1993 to 2021, and included the regional jurisdictions of Queensland, the Northern Territory, and Western Australia. To capture live animals, various trapping designs tailored to local conditions and research aims were used, but generally 10–20 cage traps were arranged in small grids (e.g. 1 ha) at intervals of about 1 km along roadsides or small vehicle tracks. In most cases, our samples consisted of small (2 mm diameter) ear tissue samples collected from live individuals. Individuals were sexed by visual inspection during processing, prior to release. Due to low capture rates of the target species in many areas, we assigned a single a priori population name (sampling locality) to all samples obtained from a given set of grids, even if the samples were collected over several years. Samples were typically stored in 70–100% ethanol, either in a freezer or at room temperature, until being sent to us for analysis.

DNA was extracted from tissue samples in plate format using the standard protocol of the Qiagen DNeasy 96 Blood & Tissue Kit, with an extended lysis. This involved incubation at 56°C for 2 h, followed by a reduction in temperature to 37°C overnight. Following extraction, double-stranded DNA concentrations were quantified and normalised to 200 ng DNA in a total volume of 25 µL.
These samples were then arranged in 96-well plates for double-digest restriction-associated DNA (ddRAD) sequencing at the Australian Genome Research Facility in Melbourne, Victoria. Each plate included several within-plate and among-plate technical replicates (for a total of 36 technical replicates) and a negative control (blank). To determine the optimal combination of two restriction enzymes for ddRAD sequencing, three establishment samples (broadly representative of the species distribution) were used. PstI and NlaIII were deemed the most suitable enzymes for achieving the best genome representation while minimising repetitive sequences. The library preparation protocol included (1) digestion using PstI and NlaIII, (2) ligation with one of 48 unique inline barcoded adapters compatible with the restriction site overhang, (3) manual sample pooling, (4) DNA purification using the QIAquick PCR Purification Kit and SPRIselect paramagnetic beads, (5) size-selection targeting fragments of 280–375 bp in size using the BluePippin from Sage Science, and (6) a PCR amplification step where one of two multiplexing index primers was added (von Takach et al., 2022b). After indexing, libraries were pooled together and loaded onto flow cells for 150 bp single-end or paired-end sequencing (with only single-end reads used for analysis). Sequencing was performed on either an Illumina NextSeq 500 (three plates) or a NovaSeq 6000 (three plates) platform.

Our bioinformatics pipeline used a combination of tools and custom scripts to analyse sequencing data. Raw sequence data (2.271 billion reads) were first processed using the stacks process_radtags function for demultiplexing (Catchen et al., 2013), retaining 95.8% (2.176 billion) of reads.
The demultiplexed files were then mapped to the published northern quoll chromosome-length genome assembly (https://www.dnazoo.org/assemblies/Dasyurus_hallucatus) using the bwa version 0.7.17 mem algorithm (Li, 2013), generating sequence alignment map (SAM) files for each sample (individuals and technical replicates), resulting in 2.25 billion alignments. The SAM files were compressed to binary alignment map (BAM) files, and then each BAM file was filtered for unmapped alignments (retaining 98.8% = 2.223 billion alignments), then sorted and indexed using samtools v1.7-1 (Li et al., 2009). The filtered and sorted BAM files were then used to call single-nucleotide polymorphisms (SNPs) via the angsd (v0.93) software package (Korneliussen et al., 2014). The following filters were applied in angsd: minimum mapping quality of 20 (excluding reads that mapped poorly or mapped to repeat regions of the genome), minimum base quality of 20, minimum call rate of 0.5 (across samples), minimum depth per site of 1,250, maximum depth per site of 53,200, minimum depth per individual of 6, maximum depth per individual of 100, allele balance ratio of 0.2, SNP likelihood p value ≤ 1 × 10⁻⁵, and genotype posterior probability ≥ 0.98 (based on GATK genotype likelihood with a uniform prior). This retained 486,083 SNPs that were genotyped in ≥ 50% of samples.

All subsequent filters and analyses were conducted in a custom R (v4.3.1) (R Core Team, 2023) script that retained loci with < 5% missing data across samples and a minor allele count ≥ 3, as well as loci with observed heterozygosity < 0.6 (to exclude potential erroneously merged reads). Samples with > 35% missing data across SNPs were also excluded, to remove poorly genotyped individuals. To account for bias resulting from linkage disequilibrium (LD), we used the 'SNPRelate' package to prune SNPs in LD (Zheng et al., 2012), setting the LD threshold to 0.5 and the sliding window size to 500 kb.
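The locus and sample thresholds described above can be sketched as follows. This is a minimal Python illustration, not the authors' custom R script: the function names and the genotype layout (one list per sample, alternate-allele counts of 0, 1 or 2, with None for a missing call) are assumptions, and the order in which the original script applied the filters is not stated in this description.

```python
def filter_loci(genotypes, max_missing=0.05, min_mac=3, max_het=0.6):
    """Return indices of loci passing the missing-data, MAC and Ho filters.

    genotypes: list of per-sample lists; each entry is the alternate-allele
    count (0, 1, 2) or None when the genotype is missing (assumed layout).
    """
    n_samples = len(genotypes)
    kept = []
    for j in range(len(genotypes[0])):
        calls = [row[j] for row in genotypes if row[j] is not None]
        missing = 1 - len(calls) / n_samples
        if not calls or missing >= max_missing:
            continue  # keep only loci with < 5% missing data
        alt = sum(calls)
        mac = min(alt, 2 * len(calls) - alt)  # minor allele count
        het = sum(1 for g in calls if g == 1) / len(calls)  # observed Ho
        if mac >= min_mac and het < max_het:
            kept.append(j)
    return kept


def filter_samples(genotypes, max_missing=0.35):
    """Return indices of samples with at most 35% missing genotypes."""
    return [
        i for i, row in enumerate(genotypes)
        if sum(g is None for g in row) / len(row) <= max_missing
    ]


# Toy matrix: 4 samples x 4 loci.
toy = [
    [0, 1, 1, None],
    [1, 1, 1, 0],
    [2, 1, 1, 0],
    [0, 1, 0, 0],
]
kept_loci = filter_loci(toy)
kept_samples = filter_samples(toy)
```

In the toy matrix only the first locus survives: the second and third have observed heterozygosity of 0.6 or more, and the fourth has 25% missing data. LD pruning is left out of the sketch because it depends on genomic positions and on the 'SNPRelate' implementation.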
The gl.filter.sexlinked function in the 'dartR' package (Gruber et al., 2018; Mijangos et al., 2022) was then used to check for sex-linked SNPs. We also applied the filter.sex.linked function of Robledo-Ruiz et al. (2022), which checks for Y-linked loci, sex-biased loci, X-linked loci, and XY gametologs. No sex-linked loci were identified. We then checked similarity between pairs of technical replicates, finding a mean similarity of 99.97%. Close pairing of technical replicates was also checked visually with a hierarchical clustering dendrogram (Supplementary Material Figure S1), after which we removed one individual from each pair of technical replicates, and one individual from each pair of close relatives (relatedness ≥ 0.25), with relatedness calculated using the method-of-moments technique in the beta.dosage function of the 'hierfstat' package (Goudet, 2005; Goudet et al., 2018). This approach estimates kinship values between pairs of individuals relative to the average kinship values of all pairs of individuals in the sampled population (i.e., within each locality). The final dataset contained 12,962 SNPs and 384 individuals.
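The replicate check and the relative kinship estimate can be illustrated with a short sketch. Two assumptions apply. First, the description does not define "similarity", so it is taken here as the proportion of identical genotype calls among loci called in both replicates. Second, the kinship function follows the allele-sharing form of the method-of-moments estimator as we understand it (pairwise matching rescaled by the mean matching of all pairs) for complete dosage data; it is not the 'hierfstat' code, and all function names are hypothetical.

```python
from itertools import combinations


def genotype_similarity(a, b):
    """Share of identical calls among loci genotyped in both samples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")
    return sum(x == y for x, y in shared) / len(shared)


def allele_sharing(a, b):
    """Mean allele sharing between two dosage vectors (values 0, 1, 2)."""
    values = [(1 + (x - 1) * (y - 1)) / 2 for x, y in zip(a, b)]
    return sum(values) / len(values)


def beta_kinship(dosages):
    """Kinship of each pair relative to the average over all pairs.

    dosages: list of per-individual dosage lists with no missing data
    (assumed). Returns a dict keyed by (i, j) index pairs. Undefined when
    every pair shares all alleles, since the denominator is then zero.
    """
    pairs = list(combinations(range(len(dosages)), 2))
    m = {p: allele_sharing(dosages[p[0]], dosages[p[1]]) for p in pairs}
    m_b = sum(m.values()) / len(m)  # average sharing across all pairs
    return {p: (v - m_b) / (1 - m_b) for p, v in m.items()}


# Toy data: individuals 0 and 1 are identical, individual 2 is opposite.
toy = [[0, 2, 0, 2], [0, 2, 0, 2], [2, 0, 2, 0]]
kin = beta_kinship(toy)
flagged = [pair for pair, value in kin.items() if value >= 0.25]
sim = genotype_similarity([0, 1, 2, None], [0, 1, 1, 2])
```

Because the estimate is relative to the average of the sampled pairs, values can be negative for pairs that share fewer alleles than average, which is why the reference set (here, each locality) matters when applying the 0.25 threshold.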

Notes

External Organisations
Curtin University; Australian Wildlife Conservancy; Charles Darwin University; Australian Museum; Queensland University of Technology; Macquarie University; University of Melbourne; Department of Biodiversity, Conservation and Attractions (Western Australia); Murdoch University; Western Australian Museum

Associated Persons
Ben L. Phillips (Contributor); Linette S. Umbrello (Contributor); Brenton von Takach (Contributor); Skye F. Cameron (Contributor); Teigan Cremona (Contributor); Mark D. B. Eldridge (Contributor); Diana O. Fisher (Contributor); Rosemary Hohnen (Contributor); Chris J. Jolly (Contributor); Ella Kelly (Contributor); Ian J. Radford (Contributor); Peter B. S. Spencer (Contributor); Gavin J. Trewella (Contributor); Sam C. Banks (Contributor)

Issued: 2024-02-23

This dataset is part of a larger collection

Identifiers

DOI : 10.5281/ZENODO.10633248
global : 55dd78e3-c29b-4603-9b6f-a8910e57a7f8
