Issue April 2011

Volume 28
No. 5 (p 675-843)
April 2011
ISSN 0739-1102

An Efficient Binomial Model-Based Measure for Sequence Comparison and its Application

Sequence comparison is one of the major tasks in bioinformatics and can provide evidence of structural and functional conservation, as well as of evolutionary relationships. Several similarity/dissimilarity measures for sequence comparison exist, but challenges remain. This paper presents a binomial model-based measure for analyzing biological sequences. With the help of a random indicator, the occurrence of a word at any position of a sequence can be regarded as a Bernoulli random variable, and the sum of these occurrences, the word count, follows a binomial distribution. Using a recursive formula, we computed the binomial probability of the word count and proposed a binomial model-based measure based on the relative entropy. The proposed measure was tested in extensive experiments, including classification of HEV genotypes and phylogenetic analysis, and was further compared with alignment-based and alignment-free measures. The results demonstrate that the proposed binomial model-based measure is more efficient.
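The abstract does not give the formulas, so the following Python sketch only illustrates the general idea under stated assumptions and is not the authors' implementation. It assumes a DNA alphabet, estimates each word's per-position probability as the product of the sequence's own base frequencies, evaluates the binomial probability of the observed word count with the standard recursion P(j) = P(j-1) · (n-j+1)/j · p/(1-p), normalizes these probabilities into a profile, and compares two profiles with a symmetrized relative entropy. All function names and the normalization step are hypothetical choices made for illustration.

```python
from itertools import product
from math import log

ALPHABET = "ACGT"  # assumption: DNA sequences


def word_counts(seq, k):
    """Count overlapping occurrences of every k-word over ALPHABET."""
    counts = {"".join(w): 0 for w in product(ALPHABET, repeat=k)}
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing other symbols
            counts[word] += 1
    return counts


def binomial_pmf(n, p, x):
    """P(X = x) for X ~ Binomial(n, p), computed recursively from P(X = 0)."""
    if p <= 0.0:
        return 1.0 if x == 0 else 0.0
    if p >= 1.0:
        return 1.0 if x == n else 0.0
    prob = (1.0 - p) ** n  # P(X = 0); may underflow for very long sequences
    ratio = p / (1.0 - p)
    for j in range(1, x + 1):
        prob *= (n - j + 1) / j * ratio
    return prob


def binomial_profile(seq, k):
    """Normalized binomial probabilities of the observed k-word counts.

    Assumption: a word's per-position probability is the product of the
    base frequencies of the sequence itself (independent-base model).
    """
    n = len(seq) - k + 1
    base_freq = {b: seq.count(b) / len(seq) for b in ALPHABET}
    profile = {}
    for word, count in word_counts(seq, k).items():
        p = 1.0
        for base in word:
            p *= base_freq[base]
        profile[word] = binomial_pmf(n, p, count)
    total = sum(profile.values())
    return {w: v / total for w, v in profile.items()}


def symmetric_relative_entropy(P, Q, eps=1e-12):
    """Symmetrized relative entropy between two profiles (eps avoids log 0)."""
    d = 0.0
    for w in P:
        p, q = P[w] + eps, Q[w] + eps
        d += p * log(p / q) + q * log(q / p)
    return d


def binomial_distance(seq_a, seq_b, k=2):
    """Hypothetical dissimilarity between two sequences for word length k."""
    return symmetric_relative_entropy(
        binomial_profile(seq_a, k), binomial_profile(seq_b, k)
    )
```

For example, `binomial_distance("ACGTACGTAC", "AAAACCCCGG")` returns a non-negative dissimilarity, and comparing a sequence with itself returns 0. The word length `k` is a free parameter here; the abstract does not say which value the authors used.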

Key words: Word count; Binomial model; Sequence comparison; Classification; Phylogenetic analysis.

This article can be cited as:
X. Liu, Q. Dai, L. Li, Z. He, An Efficient Binomial Model-Based Measure for Sequence Comparison and its Application, J. Biomol Struct Dyn 28(5) 833-843 (2011)

Xiaoqing Liu1
Qi Dai2*
Lihua Li2
Zerong He1

1School of Science, Hangzhou Dianzi University, Hangzhou 310018, People’s Republic of China
2Institute of Biomedical Engineering and Instrumentation, Hangzhou Dianzi University, Hangzhou 310018, People’s Republic of China

daiailiu2004@yahoo.com.cn
