Home
Welcome to the LmTag wiki!
A. The size of the simulated SNP array
In general, the maximum number of tag SNPs that LmTag selects at saturation (that is, when no remaining SNP satisfies the algorithm's conditions) is expected to be comparable to that of TagIt, since both are greedy algorithms. In theory, the model parameters are sensitive to the density of the simulated array. We therefore recommend that, in simulation studies, users sub-sample k SNPs as the tag SNP set, where k is based on the maximum number of tag SNPs selected by TagIt. A wrapper pipeline and instructions for running TagIt are available at: https://github.com/datngu/TagSNP_evaluation
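The sub-sampling step can be sketched as follows. This is a minimal illustration, not part of LmTag or the TagIt pipeline: it assumes uniform random sampling without replacement, and the function name `subsample_tag_snps`, the SNP IDs, and both input lists are hypothetical placeholders for your own data.

```python
import random


def subsample_tag_snps(candidate_snps, k, seed=0):
    """Draw k SNP IDs uniformly at random, without replacement.

    Uniform sampling is an assumption of this sketch; the seed is fixed
    only to make the simulated array reproducible.
    """
    if k > len(candidate_snps):
        raise ValueError("k exceeds the number of candidate SNPs")
    rng = random.Random(seed)
    return rng.sample(candidate_snps, k)


# Hypothetical inputs: all candidate SNPs in the region, and the tag SNPs
# that TagIt selected at saturation for the same data.
candidate_snps = [f"rs{i}" for i in range(1, 1001)]
tagit_tag_snps = [f"rs{i}" for i in range(1, 1001, 4)]

# k is taken from the number of tag SNPs selected by TagIt.
k = len(tagit_tag_snps)
simulated_array = subsample_tag_snps(candidate_snps, k)
```

Fixing the seed keeps the simulated array identical across runs, which makes it easier to compare model parameters fitted on the same array.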
B. Beam-width parameter K setting
We tried various K settings in our study. Considering the trade-offs among functional enrichment, imputation accuracy, and computation time, we recommend setting K in the range of 200-500. This is a compromise that provides high functional enrichment and superior imputation accuracy while keeping the computational cost low. However, researchers can set K according to what they want to optimize.
C. Selecting top tag SNPs
Because the target array size is chosen by the researcher, it can be smaller than the number of tag SNPs selected by LmTag. In that case, we recommend using the "sum_score" field of the LmTag output to rank the tag SNPs: keep the SNPs with the highest scores and remove those with the lowest scores first.
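The ranking step might look like the sketch below. It assumes the LmTag output has already been loaded into a list of records; only the "sum_score" field comes from LmTag, while the `snp_id` key, the example values, and the function name `select_top_tag_snps` are hypothetical.

```python
def select_top_tag_snps(tag_snps, array_size):
    """Keep the array_size tag SNPs with the highest sum_score."""
    # Sort from highest to lowest score so the lowest scores are dropped first.
    ranked = sorted(tag_snps, key=lambda rec: float(rec["sum_score"]), reverse=True)
    return ranked[:array_size]


# Hypothetical records; in practice these would be read from the LmTag output.
lmtag_output = [
    {"snp_id": "rs1", "sum_score": "2.5"},
    {"snp_id": "rs2", "sum_score": "7.0"},
    {"snp_id": "rs3", "sum_score": "4.1"},
    {"snp_id": "rs4", "sum_score": "0.9"},
]

# Target array holds only 2 SNPs, fewer than the 4 that were selected.
kept = select_top_tag_snps(lmtag_output, array_size=2)
kept_ids = [rec["snp_id"] for rec in kept]
```

Scores are converted with `float` because values read from a text file arrive as strings, and sorting strings would order them lexicographically rather than numerically.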