Data

An assessment of the complexity of 3' UTRs relative to that of protein-coding sequences: models selected using two procedures

Queensland University of Technology
Algama, Manjula ; Oldmeadow, Christopher ; Tasker, Edward ; Mengersen, Kerrie ; Keith, Jonathan
DOI: 10.4225/09/585740a0f07ac
Date: 2014
Subjects: Genome evolution; Biostatistics; Genetics; Molecular biology; Comparative genomics; Markov models; Sequencing techniques; Bayes theorem; Genome complexity; Molecular biology techniques; MATHEMATICAL SCIENCES; Probability theory; Genome analysis; Sequence analysis; Computational biology

Licence & Rights:

Open Licence
CC-BY

Creative Commons Attribution 3.0
http://creativecommons.org/licenses/by/3.0/au/

© 2014 Algama et al.

Access:

Other

Contact Information

Postal Address:
Kerrie Mengersen

k.mengersen@qut.edu.au

Full description

The dataset comes from a study which assessed the complexity of 3′ UTRs (three prime untranslated regions) relative to that of protein-coding sequences, by comparing the extent to which segmental substructures can be detected within these two genomic fractions based on sequence composition and conservation.

For the dataset, two different procedures were applied to select the number of classes for each alignment: examining Deviance Information Criterion V (DICV) values (Procedure 1) and examining the stability of the classes (Procedure 2). The number of classes selected for each sequence by each procedure is summarised.
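As a rough illustration of how an information-criterion procedure such as Procedure 1 might pick a class count, the sketch below scores each candidate number of classes and keeps the lowest score. It assumes DICV takes the variance-based form of DIC (mean posterior deviance plus half the variance of the deviance); this record does not state the formula, so treat that as an assumption. The function names and the deviance samples are hypothetical and are not taken from the dataset.

```python
from statistics import mean, pvariance


def dic_v(deviance_samples):
    """Assumed variance-based DIC: mean deviance + half the deviance variance."""
    # Population variance is used here for simplicity; a sample variance
    # would be an equally reasonable choice in this sketch.
    return mean(deviance_samples) + 0.5 * pvariance(deviance_samples)


def select_num_classes(deviances_by_k):
    """Return (best_k, scores), where best_k has the lowest assumed DICV."""
    scores = {k: dic_v(samples) for k, samples in deviances_by_k.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Hypothetical posterior deviance samples for models with 12, 13 and 14
# segment classes. These values are invented for illustration only.
example = {
    12: [1010.0, 1012.0, 1008.0, 1010.0],
    13: [1000.0, 1002.0, 998.0, 1000.0],
    14: [999.0, 1009.0, 989.0, 999.0],
}

best_k, scores = select_num_classes(example)
print(best_k, scores)
```

In this invented example the 14-class model has the lowest mean deviance but is penalised for its high deviance variance, so the 13-class model is selected.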

The data indicate that twelve to fourteen segment classes with distinct character frequencies can be distinguished in each of the three coding sequence alignments, using either Procedure 1 or Procedure 2.

Data time period: 2013 to 31 December 2013

This dataset is part of a larger collection


159.25553,-9.21982 112.92145,-9.21982 112.92145,-54.77722 159.25553,-54.77722 159.25553,-9.21982

136.0884895,-31.99852

Identifiers