BeviMed, which stands for Bayesian Evaluation of Variant Involvement in Mendelian Disease, is an association test which estimates both the probability of an association between a given set of variants and a case/control disease label, and, given the association, the probability that each individual variant is pathogenic with respect to the disease.
The inference is carried out based on the inputs:
y, a length N (number of samples) logical vector,
G, an N by k integer matrix of allele counts for N individuals at k rare variant sites,
min_ac, representing a mode of inheritance hypothesis (i.e. minimum number of pathogenic variants required to be considered to have a pathogenic configuration of variants).
Then, depending on the quantity of interest, the inference procedure can be invoked simply by passing the above arguments to the functions:
prob_association - returning the probability of association between configurations of variants represented in G and the case-control label y (optionally broken down by mode of inheritance). By default, the prior probability of association is 0.01, and the prior probabilities of dominant and recessive inheritance given that there is an association are each 0.5.
log_BF - the log Bayes factor between the association model and no-association model.
prob_pathogenic - the probabilities of pathogenicity for the individual variants.
The inference is performed by the function bevimed, an MCMC sampling procedure with many parameters, including those listed above and others determining the sampling management and prior distributions of the model parameters.
It returns a list of traces for the sampled parameters in an object of class BeviMed. This object can take up a lot of memory, so it may be preferable to store only a summarised version, obtained by passing the object to summary.
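To make the role of min_ac concrete, the following minimal sketch uses base R only to show which samples would have a pathogenic configuration under two values of min_ac. The toy matrix G_toy and the indicator vector z_toy (marking which variants are assumed to be pathogenic) are hypothetical names introduced purely for illustration; the sketch illustrates the definition given above and is not the package's internal implementation.

```r
# Toy allele-count matrix: 4 samples (rows) by 3 variant sites (columns).
G_toy <- matrix(
  as.integer(c(1, 0, 0,
               0, 1, 1,
               2, 0, 0,
               0, 0, 1)),
  nrow = 4, byrow = TRUE
)

# Hypothetical assumption: variants 1 and 2 are pathogenic, variant 3 is not.
z_toy <- c(TRUE, TRUE, FALSE)

# Number of alleles each sample carries at the assumed-pathogenic variants.
pathogenic_alleles <- rowSums(G_toy[, z_toy, drop = FALSE])

# A sample has a pathogenic configuration if it carries at least min_ac
# such alleles.
config_min_ac_1 <- pathogenic_alleles >= 1
config_min_ac_2 <- pathogenic_alleles >= 2

print(config_min_ac_1)
print(config_min_ac_2)
```

Under min_ac = 1 the first three samples qualify, whereas under min_ac = 2 only the third sample, which carries two alleles at variant 1, does.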
Here we demonstrate a simple application of BeviMed for some simulated data.
library(BeviMed)
set.seed(0)
Firstly, we’ll generate some random data consisting of an allele-count matrix G for 100 samples at 20 variant sites (each with an allele frequency of 0.02) and an independently generated case-control label, y_random.
G <- matrix(rbinom(size=2, prob=0.02, n=100*20), nrow=100, ncol=20)
y_random <- runif(n=nrow(G)) < 0.1
prob_association(G=G, y=y_random)
## [1] 0.002853161
The results indicate that there is a low probability of association: the estimate falls below the prior probability of 0.01. We now generate a new case-control label y_dependent which depends on G: specifically, we treat variants 1 to 3 as ‘pathogenic’, and label any samples harbouring alleles for any of these variants as cases.
y_dependent <- apply(G, 1, function(variants) sum(variants[1:3]) > 0)
prob_association(G=G, y=y_dependent)
## [1] 0.9997962
Notice that the estimated probability of association is now much higher, close to 1.
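The other two functions listed in the introduction can be called in the same way. The sketch below regenerates the simulated data so that it runs on its own, then requests the log Bayes factor and the per-variant probabilities of pathogenicity. It assumes that log_BF and prob_pathogenic accept the same G and y arguments as prob_association, as the introduction suggests; the variable names lbf and pp are ours. Because the inference is based on MCMC sampling, exact values can vary between runs and package versions, so no output is shown.

```r
library(BeviMed)
set.seed(0)

# Same simulated data as above: 100 samples, 20 variant sites.
G <- matrix(rbinom(size=2, prob=0.02, n=100*20), nrow=100, ncol=20)

# Cases are the samples carrying at least one allele at variants 1 to 3.
y_dependent <- apply(G, 1, function(variants) sum(variants[1:3]) > 0)

# Log Bayes factor between the association and no-association models.
lbf <- log_BF(G=G, y=y_dependent)

# Probability of pathogenicity for each of the 20 variants.
pp <- prob_pathogenic(G=G, y=y_dependent)

print(lbf)
print(round(pp, 2))
```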
By default, prob_association integrates over mode of inheritance (e.g. are at least 1 or 2 pathogenic variants required for a pathogenic configuration?). The probabilities of association with each mode of inheritance can be shown by passing the option by_MOI=TRUE (for more details, including how to set the ploidy of the samples within the region, see ?prob_pathogenic).
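The integration over modes of inheritance can be pictured as standard Bayesian model averaging. The sketch below, in base R, assumes that each mode of inheritance has its own Bayes factor against the no-association model and combines them using the default priors quoted earlier (0.01 for association, 0.5 for each mode given association). The log Bayes factors are made-up values for illustration, and this is our reading of the calculation rather than a transcription of the package's code.

```r
# Default priors quoted in the introduction.
prior_assoc <- 0.01
prior_moi <- c(dominant = 0.5, recessive = 0.5)

# Hypothetical log Bayes factors of each mode of inheritance versus the
# no-association model (illustrative values, not package output).
log_bf <- c(dominant = 8, recessive = 2)

# Unnormalised posterior weight of each association model.
weights <- prior_assoc * prior_moi * exp(log_bf)

# The no-association model has prior 1 - prior_assoc and Bayes factor 1.
denominator <- sum(weights) + (1 - prior_assoc)

# Posterior probability of association under each mode of inheritance,
# and in total (integrating over the mode of inheritance).
post_by_moi <- weights / denominator
post_total <- sum(post_by_moi)

print(post_by_moi)
print(post_total)
```

In this picture the overall probability of association is simply the sum of the per-mode probabilities, which is why a breakdown by mode of inheritance carries strictly more information than the single default value.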
For a more detailed output, the bevimed function can be used, and its returned values can be summarised and stored/printed.
output <- summary(bevimed(G=G, y=y_dependent))
output
## ---------------------------------------------------------------------------
## The probability of association is 1 [prior: 0.01]
##
## The expected number of variants involved in explained cases is: 2.96
##
## Log Bayes factor between gamma 1 model and gamma 0 model is 13.91
## A confidence interval for the log Bayes factor is:
## 2.5% 97.5%
## 12.22 14.74
## ---------------------------------------------------------------------------
## Estimated probabilities of pathogenicity of individual variants
## (conditional on gamma = 1)
##
## Variant Controls Cases P(z_j=1|y,gamma=1) Bar Chart
##       1        0     2               1.00 [=================== ]
##       2        0     4               1.00 [=================== ]
##       3        0     2               0.94 [==================  ]
##       4        9     0               0.00 [                    ]
##       5        2     1               0.00 [                    ]
##       6        4     0               0.00 [                    ]
##       7        3     0               0.00 [                    ]
##       8        2     1               0.00 [                    ]
##       9        4     1               0.00 [                    ]
##      10        7     0               0.00 [                    ]
##      11        3     0               0.00 [                    ]
##      12        1     0               0.01 [                    ]
##      13        4     0               0.00 [                    ]
##      14        1     2               0.01 [                    ]
##      15        3     0               0.00 [                    ]
##      16        5     0               0.00 [                    ]
##      17        7     0               0.00 [                    ]
##      18        4     0               0.00 [                    ]
##      19        7     2               0.00 [                    ]
##      20        6     0               0.00 [                    ]
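The reported probability of association can be related to the log Bayes factor through Bayes' rule for two competing models: the posterior odds equal the prior odds multiplied by the Bayes factor. The sketch below applies this in base R to the point estimate and interval bounds printed in the summary, with the prior of 0.01. It assumes that the summary's probability refers to the single model fitted by this bevimed call; the helper name posterior_from_log_bf is hypothetical.

```r
# Hypothetical helper: convert a log Bayes factor and a prior probability
# of association into a posterior probability of association.
posterior_from_log_bf <- function(log_bf, prior) {
  # Posterior log odds = prior log odds + log Bayes factor.
  plogis(qlogis(prior) + log_bf)
}

# Values copied from the summary output above.
log_bf <- c(lower = 12.22, estimate = 13.91, upper = 14.74)

posterior <- posterior_from_log_bf(log_bf, prior = 0.01)
print(round(posterior, 2))
```

All three values round to 1 at two decimal places, in line with the probability of association printed in the summary.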