Data

De novo transcriptome assemblies for a set of 40 diverse native Australian plants species

Commonwealth Scientific and Industrial Research Organisation
Andrew, Sam ; Coppin, Chris ; Mokany, Karel
Identifier: https://doi.org/10.25919/6mq7-ya76
Edition: v1
Subjects: Transcriptomics; genomics; Australia; plants; temperature stress; climate change; Genomics and transcriptomics; Bioinformatics and computational biology; BIOLOGICAL SCIENCES; Evolutionary ecology; Evolutionary biology

Licence & Rights:

Open Licence
CC-BY

Creative Commons Attribution 4.0 International Licence
https://creativecommons.org/licenses/by/4.0/

Data is accessible online and may be reused in accordance with licence conditions

All Rights (including copyright) CSIRO 2024.

Access:

Open

Accessible for free

Brief description

The authors produced RNA-seq data for two experiments that examined transcriptomic responses to temperature stress in native Australian plants. The study species were sourced from a range of Australian environments. From these temperature stress experiments, a single RNA-seq library per species was used for each of the 40 de novo assemblies.
The “De_novo_metadata_SCA.xlsx” file provides additional data about the reference transcriptomes from the “Acacia” and “LOTE” experiments.
The names of the input RNA-seq libraries are included in the metadata file. The “Acacia” sequence data are publicly available on CSIRO’s Data Access Portal (part 1 = https://doi.org/10.25919/wa3k-0x90 and part 2 = https://doi.org/10.25919/ryte-pk64). The RNA-seq library metadata, processed gene expression data, gene annotations, and additional data for published analyses are also available, together with all Rmarkdown scripts, on CSIRO’s Data Access Portal (https://doi.org/10.25919/mpk5-xr92). The “LOTE” raw sequence data and metadata are also publicly available on CSIRO’s Data Access Portal (https://doi.org/10.25919/p00h-p990).

Lineage: The following method was used to prepare de novo transcriptome assemblies for selected samples. Quality assessment and filtering of reads were first done with fastp using default settings; on average, 98% of reads were retained after filtering. For each species, an RNA-seq library with a high number of reads was selected for de novo transcriptome assembly with Trinity (version 2.11.0) and associated packages. A single individual was used per assembly, because merging data from multiple individuals introduces genetic variation that can cause problems for transcript assembly. Prior to the Trinity assembly, the paired-end reads were further trimmed with Trimmomatic using the settings “HEADCROP:13 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:50”. This step removes base calls at the start and end of the reads, which have a higher probability of being errors and can therefore be highly detrimental to the assembly process.
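To make the quoted Trimmomatic settings concrete, the following is a minimal Python sketch of what each step does to a single read. It is an illustration only, not Trimmomatic's implementation: the function name `trim_read` is hypothetical, quality scores are assumed to be already decoded to integer Phred values, and the sliding-window rule is simplified to "cut at the first window whose mean quality is below the threshold".

```python
from typing import List, Optional, Tuple


def trim_read(
    seq: str,
    quals: List[int],
    headcrop: int = 13,   # HEADCROP:13
    leading: int = 3,     # LEADING:3
    trailing: int = 3,    # TRAILING:3
    window: int = 4,      # SLIDINGWINDOW:4:15 (window size)
    window_q: int = 15,   # SLIDINGWINDOW:4:15 (required mean quality)
    minlen: int = 50,     # MINLEN:50
) -> Optional[Tuple[str, List[int]]]:
    """Simplified illustration of the trimming steps, applied in the quoted order.

    Returns the trimmed (sequence, qualities), or None if the read is dropped.
    """
    # HEADCROP: remove a fixed number of bases from the start of the read.
    seq, quals = seq[headcrop:], quals[headcrop:]

    # LEADING: remove bases from the start while their quality is below the threshold.
    start = 0
    while start < len(quals) and quals[start] < leading:
        start += 1
    seq, quals = seq[start:], quals[start:]

    # TRAILING: remove bases from the end while their quality is below the threshold.
    end = len(quals)
    while end > 0 and quals[end - 1] < trailing:
        end -= 1
    seq, quals = seq[:end], quals[:end]

    # SLIDINGWINDOW (simplified assumption): scan from the start and cut the read
    # at the first window whose mean quality falls below window_q.
    for i in range(len(quals) - window + 1):
        if sum(quals[i:i + window]) / window < window_q:
            seq, quals = seq[:i], quals[:i]
            break

    # MINLEN: drop reads that are too short after trimming.
    if len(seq) < minlen:
        return None
    return seq, quals
```

Under these assumptions, a uniformly high-quality 100-base read loses only its first 13 bases, while a 60-base read falls below the 50-base minimum after the head crop and is discarded.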
Trinity assemblies were assessed with BUSCO (version 5.1.2), using the “fabales_odb10” reference for the “Acacia” study and the “viridiplantae_odb10” reference for the “LOTE” study, with standard settings to check assembly completeness (see the metadata file “De_novo_metadata_SCA.xlsx” for results). The assembled transcripts were annotated using TransDecoder (version 5.5.0) with Trinotate (version 3.2.1). With Trinotate, the sequences were used to search for transcript functional annotations using blastp, blastx, domtbl, and signalp (see .tsv files for results).
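The steps named in the lineage can be sketched as a sequence of command lines. The Python below only builds the argument lists; it does not run anything. It is a sketch under stated assumptions, not the authors' scripts: all file names, the `trimmomatic` wrapper name, the CPU and memory values, and the location of `Trinity.fasta` inside the output directory are hypothetical. Only the Trimmomatic settings and the BUSCO lineage datasets are taken from the record, and the Trinotate step is omitted because the record does not describe its configuration.

```python
import shlex
from typing import List

# Trimmomatic settings exactly as quoted in the lineage.
TRIM_STEPS = ["HEADCROP:13", "LEADING:3", "TRAILING:3", "SLIDINGWINDOW:4:15", "MINLEN:50"]

# BUSCO lineage dataset per study, as stated in the lineage.
BUSCO_LINEAGE = {"Acacia": "fabales_odb10", "LOTE": "viridiplantae_odb10"}


def pipeline_cmds(r1: str, r2: str, sample: str, study: str, cpu: int = 8) -> List[List[str]]:
    """Build illustrative commands for one sample (file names are hypothetical)."""
    f1, f2 = f"{sample}_R1.fastp.fq.gz", f"{sample}_R2.fastp.fq.gz"
    p1, u1 = f"{sample}_R1.paired.fq.gz", f"{sample}_R1.unpaired.fq.gz"
    p2, u2 = f"{sample}_R2.paired.fq.gz", f"{sample}_R2.unpaired.fq.gz"
    out_dir = f"trinity_{sample}"          # Trinity expects "trinity" in the output name
    fasta = f"{out_dir}/Trinity.fasta"     # assumed output location

    return [
        # 1. Read filtering with fastp, default settings.
        ["fastp", "-i", r1, "-I", r2, "-o", f1, "-O", f2],
        # 2. Further trimming with Trimmomatic in paired-end mode.
        ["trimmomatic", "PE", f1, f2, p1, u1, p2, u2, *TRIM_STEPS],
        # 3. De novo assembly with Trinity (memory value is an assumption).
        ["Trinity", "--seqType", "fq", "--left", p1, "--right", p2,
         "--CPU", str(cpu), "--max_memory", "50G", "--output", out_dir],
        # 4. Completeness check with BUSCO in transcriptome mode.
        ["busco", "-i", fasta, "-l", BUSCO_LINEAGE[study],
         "-m", "transcriptome", "-o", f"busco_{sample}", "-c", str(cpu)],
        # 5. Coding-region prediction with TransDecoder.
        ["TransDecoder.LongOrfs", "-t", fasta],
        ["TransDecoder.Predict", "-t", fasta],
    ]


if __name__ == "__main__":
    for cmd in pipeline_cmds("sample_R1.fq.gz", "sample_R2.fq.gz", "sample", "Acacia"):
        print(shlex.join(cmd))
```

Keeping the study-to-lineage mapping in one dictionary makes the only per-study difference described in the record (the BUSCO reference) explicit.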

Available: 2024-07-13

Data time period: 2019-08-08 to 2021-07-30