Data

Matlab code for Bayesian sparse factor analysis of mutation accumulation lines

The University of Queensland
Dr Emma Hine (Aggregated by)
DOI: 10.14264/uql.2018.121
Licence URL: https://creativecommons.org/licenses/by_nc_sa/3.0/deed.en
Subjects: Quantitative Genetics (incl. Disease and Trait Mapping Genetics); Biological Sciences; Genetics; Biological Adaptation; Evolutionary Biology
Type: dataset
Language: English

Licence & Rights:

Non-Commercial Licence

Access:

Open

Contact Information

[email protected]

Full description

The compressed folder contains two folders and the file 'readme.txt', which contains the following information.

The input data for these analyses can be downloaded at doi:10.14264/uql.2017.783

The folder ‘BSF_matlab_code’ contains the code that actually samples from the prior distributions and saves posterior samples etc. These files will not need to be edited.

The folder ‘setup_files’ contains the code used to specify starting values and hyperparameters, and to randomise data where appropriate, for the different types of analyses. These files may need to be edited to change the names of directories.

- Each .m file described below took the same 82x16925 data file as its input. Each calls functions in the folder BSF_matlab_code, whose path needs to be specified in the Matlab environment.
- prior_runs.m, prior_continue.m, random_runs.m and shuffle_runs.m were run on high-performance compute clusters (Euramoo, now decommissioned, and Awoonga). The scripts random_runs.m and shuffle_runs.m were run on Awoonga, a new cluster that was relatively unstable when we were conducting these analyses, so we set up 140 runs initially to ensure we ended up with at least 100 sets of results.
- obs_continue_1.m and obs_continue_2.m were run on a MacBook Pro.

prior_runs.m - This script was run five times (with different random seeds) for each of the nine hyperparameter combinations listed in Table S2, with burn=200000, thin=100, sp=1000.

prior_continue.m - This script was run to continue the chains from prior_runs.m, clearing the previously retained posterior samples and setting burn=0, thin=100, sp=1000.

obs_continue_1.m - This script continued one particular chain from the set of continued prior runs (the ‘observed’ run) for a further 100,000 samples (discarding the previously retained posterior samples).

obs_continue_2.m - This script continued the observed chain, discarding the previously retained posterior samples and setting burn=0, thin=100 and sp=200 (iterated through 5 times). This resulted in the final 1000 posterior samples (at a thinning rate of 100 and a burn-in period of 500,000).

random_runs.m - This script randomised the data (type 1 randomisation) then applied the BSF model to collect 1000 posterior samples with thinning rate 100 after a burn-in period of 200,000.

shuffle_runs.m - This script randomised the data (type 2 randomisation) then applied the BSF model to collect 1000 posterior samples with thinning rate 100 after a burn-in period of 200,000.
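
The stated burn-in of 500,000 for the final observed samples can be checked by tallying the iterations of each stage. The sketch below is in Python rather than the Matlab used by the dataset, and it is not part of the deposited code. It assumes that one call runs burn + thin × sp iterations, that sp is the number of samples retained per call, and that the "further 100,000 samples" of obs_continue_1.m means 100,000 iterations. The function and variable names are hypothetical.

```python
def iterations(burn, thin, sp, repeats=1):
    """Iterations run by one stage, assuming burn + thin * sp per call (an assumption)."""
    return burn + thin * sp * repeats

# Stages whose retained samples were discarded before the final run.
discarded_stages = {
    "prior_runs.m": iterations(burn=200_000, thin=100, sp=1000),   # 300,000
    "prior_continue.m": iterations(burn=0, thin=100, sp=1000),     # 100,000
    "obs_continue_1.m": 100_000,  # assumed to mean 100,000 iterations
}

# Everything run before obs_continue_2.m acts as burn-in for the final samples.
effective_burn_in = sum(discarded_stages.values())

# obs_continue_2.m: sp=200 retained samples per pass, iterated 5 times.
final_samples = 200 * 5
final_iterations = iterations(burn=0, thin=100, sp=200, repeats=5)

# random_runs.m and shuffle_runs.m: a single chain per randomised data set.
randomised_run_iterations = iterations(burn=200_000, thin=100, sp=1000)

print(effective_burn_in, final_samples, final_iterations, randomised_run_iterations)
```

Under these assumptions the tally agrees with the description: 500,000 iterations precede the final 1000 retained samples, and each randomised run takes 300,000 iterations in total.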

Issued: 2018

This dataset is part of a larger collection


Other Information
The nature and extent of mutational pleiotropy in gene expression of male Drosophila serrata

local : UQ:329791

McGuigan, Katrina, Collet, Julie M., McGraw, Elizabeth A., Ye, Yixin H., Allen, Scott L., Chenoweth, Stephen F. and Blows, Mark W. (2014). The nature and extent of mutational pleiotropy in gene expression of male Drosophila serrata. Genetics, 196 (3), 911-921. doi: 10.1534/genetics.114.161232

Uneven distribution of mutational variance across the transcriptome of Drosophila serrata revealed by high-dimensional analysis of gene expression

local : UQ:a42903d

Hine, Emma, Runcie, Daniel E., McGuigan, Katrina and Blows, Mark W. (2018). Uneven distribution of mutational variance across the transcriptome of Drosophila serrata revealed by high-dimensional analysis of gene expression. Genetics, 209 (4), 1319-1328. doi: 10.1534/genetics.118.300757

Research Data Collections

local : UQ:289097

Identifiers