
Bayesian hypothesis testing for motifs in biological sequences based on the maximum-likelihood criterion. (Chinese. English summary) Zbl 1230.62137

Summary: Bayesian hypothesis testing based on the maximum-likelihood criterion is presented for significance testing of motifs in biological sequences. Significance testing of multiple motifs is converted into a goodness-of-fit test for the multinomial distribution. The prior on the multinomial parameters is taken to be Dirichlet, and the hyperparameters of this Dirichlet prior are estimated with the Newton-Raphson algorithm by maximizing the predictive distribution of the data. Based on Bayes' theorem, a Bayes factor is obtained for model selection, which serves as a statistical measure of significance. The method avoids the difficulty of constructing test statistics and deriving their exact distributions under the null hypothesis. The experimental data consist of 107 alignments of transcription factor binding sites from the JASPAR database and 100 randomly generated alignments, with Pearson's product-moment correlation coefficient taken as an objective criterion for the quality of estimation. The experimental results indicate that Bayesian testing performed better on average than the classical methods.
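
To make the idea of a Bayes factor for a multinomial goodness-of-fit test concrete, the following is a minimal sketch for a single alignment column. It is not the authors' implementation: the summary does not give their formulas, so the sketch assumes the standard Dirichlet-multinomial marginal likelihood for the motif model and a fixed background distribution for the null model. The function names, the uniform background, and the fixed hyperparameters `alpha` are illustrative assumptions; in particular, the Newton-Raphson estimation of the hyperparameters described above is not reproduced here.

```python
from math import exp, lgamma, log


def log_marginal_dirichlet(counts, alpha):
    """Log marginal likelihood of an ordered column of symbols whose
    multinomial parameters have a Dirichlet(alpha) prior (hypothetical helper)."""
    n_total = sum(counts)
    a_total = sum(alpha)
    value = lgamma(a_total) - lgamma(a_total + n_total)
    for n_i, a_i in zip(counts, alpha):
        value += lgamma(a_i + n_i) - lgamma(a_i)
    return value


def log_likelihood_background(counts, background):
    """Log likelihood of the same column under a fixed background
    distribution, used here as the null model (an assumption)."""
    return sum(n_i * log(p_i) for n_i, p_i in zip(counts, background) if n_i > 0)


def log_bayes_factor(counts, alpha, background):
    """Log Bayes factor of the Dirichlet-multinomial model against the
    background model; positive values favor the motif model."""
    return (log_marginal_dirichlet(counts, alpha)
            - log_likelihood_background(counts, background))


if __name__ == "__main__":
    alpha = [1.0, 1.0, 1.0, 1.0]            # assumed fixed hyperparameters
    background = [0.25, 0.25, 0.25, 0.25]   # assumed uniform A, C, G, T background
    conserved = [10, 0, 0, 0]               # strongly conserved column
    mixed = [3, 3, 2, 2]                    # column resembling the background
    print("conserved:", exp(log_bayes_factor(conserved, alpha, background)))
    print("mixed:    ", exp(log_bayes_factor(mixed, alpha, background)))
```

Under these assumptions, a strongly conserved column yields a Bayes factor well above 1, while a column close to the background composition yields a value below 1. Working in log space with `lgamma` avoids overflow for deep alignments.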

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference
92B05 General biology and biomathematics
92C40 Biochemistry, molecular biology
65C60 Computational problems in statistics (MSC2010)