Data
Bacillus Carbohydrate Metabolism Protvec model

Creators: James G Mitchell, Jody C. McKerral, Robert Edwards, Susie Grigson
Publisher: Flinders University
Date: 2022
Identifier: https://doi.org/10.25451/flinders.19770379.v4
Type: dataset
Language: English
Subjects: Protvec; Sequence embedding; Bioinformatics; Bioinformatics and computational biology not elsewhere classified

Licence & Rights:

Reusable for any purpose (CC-BY)

Full description

Protvec model trained using 8,743 sequences from the Genome Taxonomy Database (GTDB). Before training, sequences containing 'X', sequences shorter than 30 amino acids, and sequences longer than 1024 amino acids were removed.
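The filtering rules above can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: the function name `keep_sequence`, the case-insensitive check for 'X', and the choice to keep sequences of exactly 30 or exactly 1024 amino acids are assumptions drawn from the wording of the description.

```python
def keep_sequence(seq, min_len=30, max_len=1024):
    """Return True if a protein sequence passes the three filters.

    Assumptions (not confirmed by the record): lengths of exactly
    min_len and max_len are kept, and 'x' is treated the same as 'X'.
    """
    seq = seq.upper()
    return "X" not in seq and min_len <= len(seq) <= max_len


# Toy sequences for illustration; these are not real GTDB records.
sequences = [
    "MKT" * 10,   # 30 residues, no 'X': kept
    "MKX" * 10,   # contains 'X': removed
    "MKT",        # shorter than 30 residues: removed
    "A" * 1025,   # longer than 1024 residues: removed
]
filtered = [s for s in sequences if keep_sequence(s)]
```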

 

Training used a vector size of 100 and a context size of 25, producing a dictionary object that maps each 3-mer present in the training data to a 100-dimensional vector.
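As a rough sketch of how a 3-mer dictionary of this shape might be used, the example below splits a sequence into overlapping 3-mers and sums their vectors into one 100-dimensional embedding. The record does not say how the model should be applied, so the summing strategy, the helper names `kmers` and `embed`, the skipping of unknown 3-mers, and the toy dictionary are all assumptions made for illustration.

```python
def kmers(seq, k=3):
    """Return the overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def embed(seq, model, dim=100):
    """Sum the vectors of all 3-mers in seq that are present in model.

    3-mers missing from the dictionary are skipped (an assumption;
    other strategies such as a shared 'unknown' vector are possible).
    """
    total = [0.0] * dim
    for kmer in kmers(seq):
        vec = model.get(kmer)
        if vec is None:
            continue
        total = [t + v for t, v in zip(total, vec)]
    return total


# Toy stand-in for the published dictionary: 3-mer -> 100-dimensional vector.
toy_model = {"MKT": [1.0] * 100, "KTA": [0.5] * 100}
embedding = embed("MKTA", toy_model)
```

Summing is only one option; averaging the vectors instead would make embeddings of sequences with different lengths more directly comparable.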


The model is stored as a .pkl file, which can be imported using the Python pickle module.
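A minimal sketch of loading such a file with the standard `pickle` module follows. The actual file name is not given in this record, so the example writes and reads a small stand-in file (`toy_protvec.pkl`, a hypothetical name) to stay runnable; `load_model` is likewise an illustrative helper. Unpickling can execute arbitrary code, so only load .pkl files from a source you trust.

```python
import pickle
import tempfile
from pathlib import Path


def load_model(path):
    """Load a pickled 3-mer -> vector dictionary from disk."""
    with open(path, "rb") as handle:
        model = pickle.load(handle)
    if not isinstance(model, dict):
        raise TypeError(f"expected a dict, got {type(model).__name__}")
    return model


# Stand-in for the downloaded file; replace toy_path with the real .pkl path.
with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy_protvec.pkl"  # hypothetical file name
    with open(toy_path, "wb") as handle:
        pickle.dump({"MKT": [0.0] * 100}, handle)
    model = load_model(toy_path)

n_kmers = len(model)                     # number of 3-mers in the dictionary
dim = len(next(iter(model.values())))    # vector length for one 3-mer
```

Checking `n_kmers` and `dim` after loading is a quick way to confirm that the file matches the description (one 100-dimensional vector per 3-mer).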

Issued: 2022-05-16

Created: 2022-05-21

This dataset is part of a larger collection

Subjects


Identifiers