Home
This panel shows the cumulative probability and density of pLI for two parameter combinations (blue line: s=0.1,h=0.9; red line: s=0.9,h=0.1) for a hypothetical gene with 225 PTV mutational opportunities and a mutation rate of 1.5e-8 under a constant population size of 100,000 individuals. To simulate these parameters:
PTV_count_simulations 1.5e-8 225 0.1 0.9 1000000 0 100000
PTV_count_simulations 1.5e-8 225 0.9 0.1 1000000 0 100000
This panel is identical to 1B, except that instead of a constant population size, the demographic model of Schiffels & Durbin was used. To simulate these parameters:
PTV_count_simulations 1.5e-8 225 0.1 0.9 1000000 1 0
PTV_count_simulations 1.5e-8 225 0.9 0.1 1000000 1 0
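The positional arguments in the commands above appear to follow the order: mutation rate, number of PTV mutational opportunities, s, h, number of replicates, a demography flag (0 for constant size, 1 for the Schiffels & Durbin model), and the constant population size (0 when the demographic model is used). That ordering is inferred from the examples on this page, not taken from the program's documentation. Under that assumption, a small hypothetical Python helper (`build_command` is not part of the repository) can assemble the command lines:

```python
def build_command(mu, sites, s, h, replicates, use_demography, pop_size=0):
    """Return one PTV_count_simulations command line as a string.

    The argument order is an assumption inferred from the examples on this page.
    `mu` is passed as a string so that "1.5e-8" is not reformatted by Python.
    """
    args = [
        "PTV_count_simulations",
        str(mu),                          # per-site mutation rate
        str(sites),                       # PTV mutational opportunities
        str(s),                           # selection coefficient
        str(h),                           # dominance coefficient
        str(replicates),                  # number of simulation replicates
        "1" if use_demography else "0",   # assumed demography flag
        str(pop_size),                    # constant population size, 0 if unused
    ]
    return " ".join(args)


constant_size = build_command("1.5e-8", 225, 0.1, 0.9, 1000000, False, 100000)
with_demography = build_command("1.5e-8", 225, 0.1, 0.9, 1000000, True)
print(constant_size)
print(with_demography)
```

The resulting strings can be passed to a shell or split and handed to `subprocess.run`.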
This panel shows the probability of observing a PTV count of 3 (generated from a single simulation with s=0.1,h=0.9 and the same mutational and demographic parameters as 1B) for a grid of h and s values. To generate a single "observed" PTV count:
PTV_count_simulations 1.5e-8 225 0.1 0.9 1 1 0
Then, 1000000 simulations were run for each s and h parameter combination in the grid s=[0.01,0.02...0.99,1] and h=[0,0.01...0.99,1]. The output of each simulation was stored in a 3D NumPy array, which was then used to estimate the likelihood of observing a PTV count of 3 across the parameter space using the script ptv_count_lkhd.py.
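One simple way to estimate such a likelihood from simulation output is the fraction of replicates whose PTV count equals the observed value. The sketch below illustrates that idea with the standard library only; it is not the contents of ptv_count_lkhd.py, and the function name `estimate_likelihood` and the toy counts are made up for illustration.

```python
from collections import Counter


def estimate_likelihood(sim_counts, observed):
    """Empirical likelihood: share of replicates with exactly `observed` PTVs."""
    return Counter(sim_counts)[observed] / len(sim_counts)


# Grid values built from integers to avoid accumulating floating-point error.
s_grid = [i / 100 for i in range(1, 101)]   # 0.01, 0.02, ..., 1
h_grid = [i / 100 for i in range(0, 101)]   # 0, 0.01, ..., 1

# Toy stand-in for simulation output (hypothetical numbers, 8 replicates each).
toy_output = {
    (0.1, 0.9): [3, 2, 3, 5, 0, 3, 1, 4],
    (0.9, 0.1): [0, 1, 3, 0, 2, 1, 0, 0],
}

observed_count = 3
likelihoods = {
    params: estimate_likelihood(counts, observed_count)
    for params, counts in toy_output.items()
}
for (s, h), lk in likelihoods.items():
    print(f"s={s}, h={h}: P(count={observed_count}) ~ {lk}")
```

With 1000000 replicates per grid point, the same ratio would be computed for every (s, h) pair to fill the likelihood surface.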
This panel shows the behavior of pLI as a function of hs. The three lines correspond to three different gene lengths, measured in PTV mutational opportunities (cyan=112, purple=225, yellow=550). The calculation of pLI requires an "expected" number of PTVs. To obtain a neutral expected number of PTVs for each gene length, we ran 1000000 simulation replicates with s=0,h=0:
PTV_count_simulations 1.5e-8 112 0 0 1000000 1 0
PTV_count_simulations 1.5e-8 225 0 0 1000000 1 0
PTV_count_simulations 1.5e-8 550 0 0 1000000 1 0
We took the mean number of PTVs from the set of neutral simulation replicates as the expected number of PTVs for each gene length. Then, similar to 1C, 5000000 simulations were run for each s and h parameter combination in the grid of values s=[0.01,0.02...0.99,1] and h=[0.01,0.02...0.99,1]. The simulation outputs for parameter combinations with identical values of hs (e.g. s=0.1,h=0.5 and s=0.5,h=0.1) were concatenated to bring the total number of simulations to 1000000. For each value of hs, we then calculated pLI for each "observed" replicate count of PTVs contained in the output using the R function calc_pli().
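The pooling step described above, in which outputs from different (s, h) pairs sharing the same product hs are concatenated, can be sketched as follows. This is an illustrative assumption about how the grouping might be done, not the authors' code. The function `pool_by_hs` and the toy counts are hypothetical, and s and h are stored as integer hundredths so that equal products compare exactly.

```python
from collections import defaultdict


def pool_by_hs(outputs):
    """Concatenate PTV counts across (s, h) pairs with the same product hs.

    `outputs` maps (s_hundredths, h_hundredths) -> list of PTV counts.
    Keys of the result are integer products, i.e. hs multiplied by 10^4.
    """
    pooled = defaultdict(list)
    for (s_i, h_i), counts in sorted(outputs.items()):
        pooled[s_i * h_i].extend(counts)
    return dict(pooled)


# Toy stand-in for simulation output (hypothetical numbers).
toy = {
    (10, 50): [4, 6],   # s=0.10, h=0.50 -> hs=0.05
    (50, 10): [5, 3],   # s=0.50, h=0.10 -> hs=0.05
    (10, 5): [12],      # s=0.10, h=0.05 -> hs=0.005
}

pooled = pool_by_hs(toy)
for key, counts in sorted(pooled.items()):
    print(f"hs={key / 10000}: {counts}")
```

Using integer keys avoids treating two products as different because of floating-point rounding (for example, 0.2*0.25 versus 0.1*0.5).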
This panel depicts the behavior of pLI for a single value of hs, in this case s=0.1, h=.05 (hs=0.005). The blue histogram shows the distribution of PTV counts for these selection parameters for a gene with 225 PTV mutational opportunities, u=1.5e-8, and the demographic model of Schiffels & Durbin. To generate this distribution:
PTV_count_simulations 1.5e-8 225 0.1 0.5 1000000 1 0
To generate the expected number of PTVs under neutrality required by the calculation of pLI, a similar set of simulations was run, but with h=0 and s=0:
PTV_count_simulations 1.5e-8 225 0 0 1000000 1 0
The mean of this distribution (M=18) was then used as the expected number of PTVs in the calculation of pLI. The red line shows the value of pLI for each possible observed PTV count given this expected value. To generate these values, the following R code was used:
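# pLI for each possible observed PTV count (0 to 20), given an expected count of 18
pli_scores <- numeric(21)
for (i in seq(0, 20)) {
  # R vectors are 1-indexed, so the score for count i is stored at position i + 1
  pli_scores[i + 1] <- calc_pli(i, 18)
}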
The inset in the plot shows the density of pLI scores calculated for each replicate in the simulations.
The final inset shows the distribution of PTV counts for three different selection parameter combinations of a gene with 225 PTV mutational opportunities, u=1.5e-8, and the Schiffels-Durbin demographic model. The selection parameters correspond to a neutral gene (s=0,h=0), a completely recessive gene (s=.1,h=0), and a weakly selected dominant gene (s=.001,h=1). To generate the distributions for the three parameter combinations:
PTV_count_simulations 1.5e-8 225 0 0 1000000 1 0
PTV_count_simulations 1.5e-8 225 0.1 0 1000000 1 0
PTV_count_simulations 1.5e-8 225 0.001 1 1000000 1 0