Data

Remediation of petroleum contaminants in the Antarctic and subantarctic - pyrosequencing genomic DNA extracts from soil

Australian Antarctic Data Centre
SNAPE, IAN ; SICILIANO, STEVEN ; LAGEREWSKIJ, GREG

Licence & Rights:

This data set conforms to the CCBY Attribution License (http://creativecommons.org/licenses/by/4.0/). Please follow instructions listed in the citation reference provided at http://data.aad.gov.au/aadc/metadata/citation.cfm?entry_id=ASAC_1163_pyrosequencing when using these data.

Access:

Open

These data are publicly available from the Australian Antarctic Data Centre. Because the file size is large, they are stored offline, and made available via FTP on request.

Brief description

This dataset contains information obtained by pyrosequencing genomic DNA extracts from soil with PCR primers targeting the bacterial 16S gene (27F/519R) and fungal ITS region (ITS1/ITS4-B). The data were processed in a pipeline using the freely available 'mothur' software (v1.24.1). The reads were processed in four ways, combining two choices: whether to subsample the data to a number that normalised all but the 10 lowest samples, and whether to exclude operational taxonomic units (OTUs) that occurred only once in the entire dataset. The resulting designations are FULL_READS for the unsubsampled analyses and SUBSAMPLED for the subsampled ones, combined with SINGLETONS_INCLUDED (or SING_INC) for analyses where singleton OTUs were included and SINGLETONS_EXCLUDED (or SING_EXC) for those where dataset-wide singletons were removed. Each analysis produced a .fasta file (sequence information), a .names file (sequence redundancies) and a .groups file (sequence-to-sample assignment), all in the chimera-checked data folders. For each of these combinations an OTU abundance matrix was generated with a .shared extension: a table of OTUs by samples giving the abundance of each OTU in each sample.

Various alpha and beta diversity measures were calculated for each analysis: diversity indices (extension .groups.summary), catchall (various .csv files), rarefaction data for each sample (extension .rarefaction), relative and species abundance data (extensions .rabund and .sabund), and UniFrac community similarity measures. The unifrac folder contains a distance matrix of the sample-by-sample dissimilarity, a .summary list of the dissimilarities, and a neighbour-joining tree of the entire dataset from which the UniFrac measures were calculated. In addition to these diversity measures, taxonomy was defined by Bayesian searching of each OTU sequence against the GreenGenes database (2011 version, McDonald et al 2011, ISME J); this is provided as a .taxonomy and a .summary file in the taxonomy folders. Representative sequences for each OTU are in the OTU_rep folders as .fasta and .names files. The raw data are provided in the preprocessing files folders.
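
As a rough illustration of the singleton exclusion described above, the following Python sketch reads a small OTU abundance table and drops OTUs whose total count across all samples is one or less. It assumes the common mothur .shared layout (tab-delimited, with a header row of label, Group, numOtus, then one column per OTU); the sample names, OTU names and counts are invented for the example, and the functions are hypothetical helpers, not part of mothur or of the pipeline used for this dataset.

```python
import csv
import io

# Invented example data in the assumed .shared layout (not taken from the dataset).
SAMPLE_SHARED = (
    "label\tGroup\tnumOtus\tOtu1\tOtu2\tOtu3\n"
    "0.03\tsoilA\t3\t5\t1\t0\n"
    "0.03\tsoilB\t3\t2\t0\t4\n"
)


def read_shared(handle):
    """Return (otu_names, {sample: {otu: count}}) from a .shared-style table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    otus = header[3:]  # columns after label, Group, numOtus
    table = {}
    for row in reader:
        if not row:
            continue
        # Slice to len(otus) so a trailing tab does not produce an empty field.
        counts = [int(v) for v in row[3:3 + len(otus)]]
        table[row[1]] = dict(zip(otus, counts))
    return otus, table


def drop_dataset_singletons(otus, table):
    """Keep only OTUs whose summed abundance over all samples exceeds 1."""
    totals = {otu: sum(counts[otu] for counts in table.values()) for otu in otus}
    kept = [otu for otu in otus if totals[otu] > 1]
    filtered = {
        sample: {otu: counts[otu] for otu in kept}
        for sample, counts in table.items()
    }
    return kept, filtered


otus, table = read_shared(io.StringIO(SAMPLE_SHARED))
kept, filtered = drop_dataset_singletons(otus, table)
print(kept)
print(filtered)
```

To try this on one of the .shared files from the archive, replace the io.StringIO wrapper with an opened file handle; whether the header row is present in files written by mothur v1.24.1 should be checked first.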

Issued: 2012-09-26

Data time period: 2005-01-01 to 2011-12-31

This dataset is part of a larger collection

text: northlimit=-66.0; southlimit=-67.0; westlimit=110.0; eastLimit=111.0; projection=WGS84

text: northlimit=-54.62; southlimit=-54.621; westlimit=158.861; eastLimit=158.862; projection=WGS84
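
The two coverage strings above are simple key=value lists, so they can be turned into numeric bounds with a few lines of code. The sketch below is one possible way to do that in Python; parse_bounds and centre are hypothetical helper names, and keys are lower-cased because the strings mix "eastLimit" with "northlimit".

```python
def parse_bounds(text):
    """Split 'key=value; key=value' coverage text into a dict with lower-case keys."""
    fields = {}
    for part in text.split(";"):
        key, _, value = part.strip().partition("=")
        if key:
            fields[key.lower()] = value
    return fields


def centre(bounds):
    """Return the (longitude, latitude) midpoint of the bounding box."""
    lon = (float(bounds["westlimit"]) + float(bounds["eastlimit"])) / 2
    lat = (float(bounds["northlimit"]) + float(bounds["southlimit"])) / 2
    return lon, lat


coverage = ("northlimit=-66.0; southlimit=-67.0; westlimit=110.0; "
            "eastLimit=111.0; projection=WGS84")
bounds = parse_bounds(coverage)
print(centre(bounds))
```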

Other Information
Identifiers