Full description
225 Mangifera indica WGS VCF files

Plant material: We selected 225 Mangifera indica accessions from the gene pool collection of the Australian Mango Breeding Program, managed by the Department of Agriculture and Fisheries at Walkamin Research Station, Queensland (17.1341°S, 145.4271°E). These accessions were originally imported from 24 countries across five geographical regions and grafted onto a uniform polyembryonic rootstock, Kensington Pride. The samples with one parent known are from open-pollination crosses (Dataset S1).

Genotyping: We extracted DNA from young leaves of these 225 mango accessions growing at Walkamin Research Station, following the CTAB (cetyltrimethylammonium bromide) method of Healey et al. (2014) with the following modifications. 5 g of young leaves were ground using a mortar and pestle. 16 mL of nuclear lysis buffer and 4 mL of Sarkosyl solution were added and gently mixed with the leaf tissue. Tubes were incubated for 2 h at 65°C in a water bath with periodic mixing. The RNAse digest (step 5) was moved to day 2 and incubated at room temperature for 1 h. 5 M NaCl was added to a final concentration of 0.25 M and mixed well. 0.35 volumes of 100% EtOH were added and quickly mixed. The samples were incubated on ice for 10 min and centrifuged for 15 min at 10°C at 9,000 rpm. The solution was transferred to a new tube, an equal volume of chloroform was added, and the tube was gently inverted 50 times. Samples were again centrifuged for 15 min at 10°C at 9,000 rpm. The upper phase of the solution was transferred to a new tube. One volume of isopropanol was added, and the tube was inverted to mix and centrifuged for 15 min at 10°C at 13,000 rpm. The solution was removed, 500 μL of 70% EtOH was added, and the tube was centrifuged for 15 min at 10°C at 13,000 rpm. The solution was removed, and the dry pellet was resuspended in TE buffer. DNA quality and quantity were assessed using Qubit, NanoDrop, and a 1% agarose gel.
For the NanoDrop, 260/280 ratios were between 1.7 and 2.0 and 260/230 ratios were above 1.4. We performed whole genome sequencing on these 225 mango samples at the Ramaciotti Centre for Genomics, UNSW, New South Wales, Australia. Sequencing libraries were generated using the Illumina DNA-prep kit and subjected to 150 bp paired-end sequencing on one S4 flow cell of an Illumina NovaSeq 6000 sequencer. Sequencing targeted a minimum depth for each sample: 41 mango samples with an expected coverage of 40X and 184 mango samples with an expected coverage of 15X. The Ramaciotti Centre delivered more raw data than requested.

Alignments and variant calling: To obtain SNPs for downstream analyses, we joint-called SNPs using the GATK4 software package and the best practices developed by the Broad Institute for variant detection. The publicly available GATK pipeline (https://gencore.bio.nyu.edu/variant-calling-pipeline-gatk4/) was heavily modified to include functionality for read processing, deduplication, quality control, parallel processing and joint-calling (Fig. S2). These modifications ensured high-quality mapping in repetitive regions such as centromeres. Trimmed paired-end reads and singletons from 225 re-sequenced M. indica genomes (25X average coverage, 100X maximum coverage) were aligned to the M. indica cv. ‘Alphonso’ reference genome (NCBI GenBank assembly accession: GCA_011075055.1) (Wang et al., 2020a) using BWA MEM v0.7.17 (Li, 2013) with the parameters ‘-v 3 -Y -K 100000000 -M’. This approach produced 44,125,383 SNPs (1 every 9 bases).

Quality filtering: The data were quality filtered using the following parameters in VCFtools v0.1.17 (Danecek et al., 2011): minimum genotype quality of 20; minimum depth per sample of 5; a maximum of 50% missing data per site (initial relaxed threshold); maximum mean depth of 50 for all sites (to remove paralogues); and a maximum of 20% missing data per site (stringent threshold). All accessions had genotypes called at more than 85% of SNPs.
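The VCFtools thresholds above can be illustrated on toy data. The following is a minimal Python sketch of the filtering logic only; it is not the authors' code (which is in the GitHub repository) and is not a substitute for VCFtools. It assumes that genotypes failing the per-genotype thresholds are treated as missing before site-level missingness is evaluated, and that mean depth is the plain average of the per-sample depths at a site. All function names, variable names and toy values are hypothetical.

```python
from typing import List, Optional, Tuple

# Thresholds taken from the quality-filtering description.
MIN_GQ = 20                   # minimum genotype quality
MIN_DP = 5                    # minimum depth per sample
MAX_MEAN_DP = 50              # maximum mean depth per site (paralogue removal)
MAX_MISSING_RELAXED = 0.50    # initial relaxed missing-data threshold
MAX_MISSING_STRINGENT = 0.20  # final stringent missing-data threshold

# One call per sample: (genotype or None if missing, GQ, DP).
Call = Tuple[Optional[str], int, int]


def mask_low_quality(calls: List[Call]) -> List[Optional[str]]:
    """Assumption: genotypes below MIN_GQ or MIN_DP are set to missing."""
    return [
        gt if gt is not None and gq >= MIN_GQ and dp >= MIN_DP else None
        for gt, gq, dp in calls
    ]


def missing_fraction(genotypes: List[Optional[str]]) -> float:
    return sum(g is None for g in genotypes) / len(genotypes)


def mean_depth(calls: List[Call]) -> float:
    return sum(dp for _, _, dp in calls) / len(calls)


def classify_site(calls: List[Call]) -> str:
    """Apply the site-level thresholds in the order they are listed."""
    miss = missing_fraction(mask_low_quality(calls))
    if miss > MAX_MISSING_RELAXED:
        return "removed: relaxed missingness"
    if mean_depth(calls) > MAX_MEAN_DP:
        return "removed: mean depth"
    if miss > MAX_MISSING_STRINGENT:
        return "removed: stringent missingness"
    return "kept"


# Toy sites with five samples each (invented values, not from the dataset).
toy_sites = {
    "clean": [("0/1", 40, 20)] * 5,
    "mostly_missing": [("0/1", 40, 20), ("0/0", 35, 12), ("0/1", 10, 20),
                       ("1/1", 30, 3), (None, 0, 0)],
    "too_deep": [("0/1", 99, 120)] * 5,
    "some_missing": [("0/1", 40, 20), ("0/0", 35, 12), ("0/1", 25, 9),
                     ("1/1", 30, 3), ("0/0", 15, 30)],
}

results = {name: classify_site(calls) for name, calls in toy_sites.items()}
for name, status in results.items():
    print(f"{name}: {status}")
```

In VCFtools itself, these thresholds most plausibly correspond to the `--minGQ`, `--minDP`, `--max-meanDP` and `--max-missing` options. Note that `--max-missing` takes the minimum proportion of non-missing data, so 50% and 20% missing data would be expressed as 0.5 and 0.8.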
We then used GATK VariantFiltration for quality filtering, which removed variants with: QualByDepth < 2; quality score < 30; StrandOddsRatio > 3; FisherStrand > 60; RMSMappingQuality < 40; MappingQualityRankSumTest < -12.5; ReadPosRankSum < -8. Indels were then removed using VCFtools v0.1.17 (Danecek et al., 2011), and contigs that did not align to a chromosome were removed using BCFtools v1.12 (Danecek et al., 2021). All code is available on GitHub (https://github.com/Melanie-Wilkinson/MangoWGS).

Issued: 28 10 2024
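The GATK VariantFiltration thresholds listed in the description can be written as a small lookup table. The following Python sketch is only an illustration of that logic on invented values, not the authors' pipeline. It assumes the standard GATK INFO keys for the named annotations (QD, SOR, FS, MQ, MQRankSum, ReadPosRankSum), and it assumes that a site lacking an annotation is not failed on that annotation. The function name and the toy sites are hypothetical.

```python
import operator
from typing import Callable, Dict, List, Tuple

# (annotation key, comparison that marks a failure, threshold)
# QUAL is the VCF quality column; the rest are INFO annotations.
HARD_FILTERS: List[Tuple[str, Callable[[float, float], bool], float]] = [
    ("QD", operator.lt, 2.0),               # QualByDepth < 2
    ("QUAL", operator.lt, 30.0),            # quality score < 30
    ("SOR", operator.gt, 3.0),              # StrandOddsRatio > 3
    ("FS", operator.gt, 60.0),              # FisherStrand > 60
    ("MQ", operator.lt, 40.0),              # RMSMappingQuality < 40
    ("MQRankSum", operator.lt, -12.5),      # MappingQualityRankSumTest < -12.5
    ("ReadPosRankSum", operator.lt, -8.0),  # ReadPosRankSum < -8
]


def failed_filters(site: Dict[str, float]) -> List[str]:
    """Return the annotation keys on which a site would be removed."""
    failed = []
    for key, fails_if, threshold in HARD_FILTERS:
        value = site.get(key)
        # Assumption: a missing annotation does not fail the site.
        if value is not None and fails_if(value, threshold):
            failed.append(key)
    return failed


# Toy sites (invented values, not from the dataset).
good_site = {"QUAL": 812.4, "QD": 18.2, "SOR": 0.7, "FS": 1.3,
             "MQ": 60.0, "MQRankSum": 0.1, "ReadPosRankSum": 0.4}
bad_site = {"QUAL": 25.0, "QD": 1.1, "SOR": 4.2, "FS": 12.0, "MQ": 60.0}

print(failed_filters(good_site))
print(failed_filters(bad_site))
```

A site is retained only when the returned list is empty; any single failing annotation is enough to remove it.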
Subjects
Agricultural, Veterinary and Food Sciences |
Fruit |
Horticultural Crop Improvement (Incl. Selection and Breeding) |
Horticultural Production |
Multi-tissue plant structure |
Phyllome |
eng |
Other Information
Centromeres are hotspots for chromosomal inversions and breeding traits in mango
local : UQ:e7f18a6
Wilkinson, Melanie J., McLay, Kathleen, Kainer, David, Elphinstone, Cassandra, Dillon, Natalie L., Webb, Matthew, Wijesundara, Upendra K., Ali, Asjad, Bally, Ian S. E., Munyengwa, Norman, Furtado, Agnelo, Henry, Robert J., Hardner, Craig M. and Ortiz‐Barrientos, Daniel (2024). Centromeres are hotspots for chromosomal inversions and breeding traits in mango. New Phytologist, 245 (2), 899-913. doi: 10.1111/nph.20252
Research Data Collections
local : UQ:289097
Identifiers
- Local : RDM ID: c28a32e6-b688-4afe-a6ef-9ce29e7da472
- DOI : 10.48610/1819A7B