`modiff` looks for differentially methylated loci.
By default it performs a two-group test using Bayes factors with a binomial model for the counts. Other models can be specified.
Usage is as follows:
modiff [--model][--groups] FILE
where `FILE` can be either:
- a TAB-separated file containing modified-base counts. The columns are the contig and the genomic coordinate, followed by the depth and the number of modified bases for each sample. For example, with 2 samples an input line looks like the following:
chr1\t10471\t20\t7\t23\t19
In this case `modiff` can read directly from stdin if the file name is `-`.
or
- a bgzip-compressed, tabix-indexed TAB-separated file, organized as above. The tabix format is automatically recognized by `modiff`.
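The input layout described above can be illustrated with a small parser. This is only a sketch of how one might read such a line in Python, not how `modiff` itself does it; the names `CountRecord` and `parse_count_line` are hypothetical, and the check that the modified count does not exceed the depth is an added assumption.

```python
from typing import List, NamedTuple, Tuple


class CountRecord(NamedTuple):
    contig: str
    position: int
    samples: List[Tuple[int, int]]  # one (depth, modified) pair per sample


def parse_count_line(line: str) -> CountRecord:
    """Parse one TAB-separated line: contig, coordinate, then depth/modified pairs."""
    fields = line.rstrip("\n").split("\t")
    # Two leading columns plus two columns per sample.
    if len(fields) < 4 or (len(fields) - 2) % 2 != 0:
        raise ValueError(f"unexpected number of columns: {len(fields)}")
    contig, position = fields[0], int(fields[1])
    counts = [int(x) for x in fields[2:]]
    samples = list(zip(counts[0::2], counts[1::2]))
    for depth, modified in samples:
        # Assumption: a site cannot have more modified bases than reads.
        if modified > depth:
            raise ValueError("modified count exceeds depth")
    return CountRecord(contig, position, samples)


# The example line from the text, with real TAB characters.
record = parse_count_line("chr1\t10471\t20\t7\t23\t19")
```

Here `record.samples` holds one `(depth, modified)` pair per sample, in column order.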
`--model` chooses the model used to compare the two groups (see below).
`--groups` specifies which sample belongs to which group.
Unless the `--groups` option is set, the first half of the count columns is assumed to belong to group 0 and the remaining half to group 1; otherwise `--groups` can be used to assign samples to groups, e.g. with 4 samples one can call
modiff --groups 0:1:1:0 samples.txt
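The group assignment rule can be sketched as follows. This is an illustration of the behaviour described above, not `modiff`'s actual code; the function name `assign_groups` is hypothetical, and the handling of an odd number of samples (the extra sample goes to group 1) is an assumption, since the text does not cover that case.

```python
from typing import List, Optional


def assign_groups(n_samples: int, groups: Optional[str] = None) -> List[int]:
    """Return one group label (0 or 1) per sample, in column order."""
    if groups is None:
        # Default: first half of the samples in group 0, the rest in group 1.
        half = n_samples // 2
        return [0] * half + [1] * (n_samples - half)
    labels = [int(g) for g in groups.split(":")]
    if len(labels) != n_samples:
        raise ValueError("one label per sample is required")
    if any(label not in (0, 1) for label in labels):
        raise ValueError("labels must be 0 or 1")
    return labels


default_labels = assign_groups(4)             # no --groups given
custom_labels = assign_groups(4, "0:1:1:0")   # as in the call above
```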
The following models are available:
- `logreg` corresponds to running a binomial logistic regression with an intercept term and a group term, and testing the significance of the group term.
- `lr` corresponds to computing the log likelihood ratio between a model with two parameters (one per group) and a model with a single parameter (shared across groups). Higher values of `lr` lend credibility to the two-parameter model.
- `bf` computes the log of the Bayes factor between a two-parameter model and a one-parameter model (integrating over the parameters).
If no model is specified through `--model`, the default is `lr`.
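To make the `lr` and `bf` quantities concrete, here is a minimal Python sketch of both for binomial counts. It is not taken from the `modiff` source: it assumes that the counts of all samples in a group are pooled into a single (modified, depth) pair, and for the Bayes factor it assumes a Beta(a, b) prior on each methylation probability, uniform by default. The function names are hypothetical.

```python
import math


def _xlogy(x: float, y: float) -> float:
    # Convention 0 * log(0) = 0, needed when a group is fully (un)methylated.
    return 0.0 if x == 0 else x * math.log(y)


def max_loglik(k: int, n: int) -> float:
    """Binomial log-likelihood at the MLE p = k / n (binomial coefficient omitted)."""
    if n == 0:
        return 0.0
    p = k / n
    return _xlogy(k, p) + _xlogy(n - k, 1.0 - p)


def log_likelihood_ratio(k0: int, n0: int, k1: int, n1: int) -> float:
    """Two-parameter model (one p per group) versus one shared p."""
    # The binomial coefficients are the same in both models and cancel.
    return max_loglik(k0, n0) + max_loglik(k1, n1) - max_loglik(k0 + k1, n0 + n1)


def _log_beta(a: float, b: float) -> float:
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_bayes_factor(k0: int, n0: int, k1: int, n1: int,
                     a: float = 1.0, b: float = 1.0) -> float:
    """Log Bayes factor, integrating each p over an assumed Beta(a, b) prior."""
    two_params = (_log_beta(k0 + a, n0 - k0 + b) + _log_beta(k1 + a, n1 - k1 + b)
                  - 2.0 * _log_beta(a, b))
    one_param = _log_beta(k0 + k1 + a, n0 + n1 - k0 - k1 + b) - _log_beta(a, b)
    return two_params - one_param


# Counts from the example input line: 7/20 modified in group 0, 19/23 in group 1.
lr = log_likelihood_ratio(7, 20, 19, 23)
bf = log_bayes_factor(7, 20, 19, 23)
```

In both cases a value of zero means the data do not favour either model, and larger positive values favour separate methylation levels for the two groups.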
`modiff` produces an output file compliant with the BEDn specifications https://samtools.github.io/hts-specs/BEDv1.pdf (0-based, half-open coordinate system).
The output format changes depending on the model used, but the first few columns are always the same.
column | quantity | comment |
---|---|---|
1 | contig/chromosome | reference sequence |
2 | start position | (inclusive, 0-based) |
3 | end position | (exclusive, 0-based) |
4 | mean methylation by group | comma separated |
5 | depth by group | comma separated |
The remaining columns depend on the model and are yet to be documented, but the last column always contains an indicator of the strength of the effect or of its significance (p-value or Bayes factor).
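A downstream script could read the documented columns along these lines. This is a sketch under assumptions: the output is taken to be TAB-separated, the per-group depths are taken to be integers, the last column is taken to be a single number, and the sample line is made up for illustration rather than real `modiff` output. The names `DiffRecord` and `parse_output_line` are hypothetical.

```python
from typing import List, NamedTuple


class DiffRecord(NamedTuple):
    contig: str
    start: int                     # 0-based, inclusive
    end: int                       # 0-based, exclusive
    mean_methylation: List[float]  # one value per group
    depth: List[int]               # one value per group
    score: float                   # last column: p-value or Bayes factor


def parse_output_line(line: str) -> DiffRecord:
    fields = line.rstrip("\n").split("\t")
    # Five fixed columns plus at least the final score column.
    if len(fields) < 6:
        raise ValueError("expected at least 6 columns")
    return DiffRecord(
        contig=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        mean_methylation=[float(x) for x in fields[3].split(",")],
        depth=[int(x) for x in fields[4].split(",")],
        score=float(fields[-1]),   # model-specific columns in between are skipped
    )


# Hypothetical output line, for illustration only.
hit = parse_output_line("chr1\t10470\t10471\t0.35,0.83\t20,23\t3.9")
```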
Contact: GitHub issues and/or pull requests, or email to emanuele dot raineri at cnag dot eu