Issue No. 5 (p 675-843), April 2011, ISSN 0739-1102

Predicting Sumoylation Site by Feature Selection Method

Small ubiquitin-like modifier (SUMO) proteins are a family of proteins that can be attached to many other proteins. Sumoylation is an important posttranslational modification, so predicting the sumoylation sites of a given protein is of considerable value. Here we employed a combined method to perform this task, predicting the sumoylation sites of a protein with a two-stage procedure. The first stage predicts whether a protein is sumoylated at all; the second stage predicts the sumoylation sites of the protein, and is applied only if the first stage determined that the protein is modified by SUMO. At the first stage, we encoded each protein with protein families (PFAM) and trained the predictor with the nearest neighbor algorithm (NNA). At the second stage, we encoded the nonapeptides (peptides of nine residues) of the protein that contain lysine residues with the Amino Acid Index, and again trained the predictor with NNA. The predictor was tested by k-fold cross-validation. The highest accuracy of the second-stage predictor was 99.55%, obtained when 12 features were incorporated in the predictor; the corresponding Matthews Correlation Coefficient was 0.7952. These results indicate that the method is a promising tool for predicting the sumoylation sites of a protein. Finally, the features used in the predictor are discussed. The software is available upon request.

Key words: Sumoylation site; Minimum Redundancy Maximum Relevance; Feature Selection; Nearest Neighbor Algorithm; K-fold cross-validation.

This article can be cited as: Y. Cai, J. He, L. Lu, Predicting Sumoylation Site by Feature Selection Method, J. Biomol Struct Dyn 28(5), 797-804 (2011).

YuDong Cai¹*
¹Institute of System Biology, Shanghai University, 99 Shangda Road, Shanghai, 200244, China
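
The second stage described in the abstract (lysine-containing nonapeptides, a numeric amino acid encoding, a nearest neighbor predictor, and the Matthews Correlation Coefficient) can be illustrated with a minimal sketch. This is not the authors' software. It assumes that each nonapeptide is centered on the lysine, that windows running past the sequence ends are padded with a placeholder residue "X", that a single index (Kyte-Doolittle hydropathy) stands in for the Amino Acid Index features, and that Euclidean distance is used to find the nearest neighbor. None of these choices is stated in the abstract, and all function names are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy, used here as ONE illustrative amino acid index.
# Assumption: the abstract does not say which index entries were selected.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def lysine_nonapeptides(seq, pad="X"):
    """Return (1-based position, nonapeptide) for every lysine in seq.

    Assumption: the lysine sits at the center of the window and the
    sequence ends are padded with a placeholder residue.
    """
    padded = pad * 4 + seq + pad * 4
    return [(i + 1, padded[i:i + 9]) for i, aa in enumerate(seq) if aa == "K"]


def encode(peptide):
    """Map each residue to its index value; unknown residues become 0.0."""
    return [HYDROPATHY.get(aa, 0.0) for aa in peptide]


def nearest_neighbor_predict(query, training):
    """Return the label of the training vector closest to query.

    training is a list of (vector, label) pairs. Euclidean distance is
    an assumption made for this sketch.
    """
    _, label = min(training, key=lambda pair: math.dist(query, pair[0]))
    return label


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denom


# Toy usage with made-up peptides and labels (1 = site, 0 = non-site).
training = [(encode("AAAAKAAAA"), 1), (encode("DDDDKDDDD"), 0)]
for position, peptide in lysine_nonapeptides("MKAAAAAAK"):
    label = nearest_neighbor_predict(encode(peptide), training)
    print(position, peptide, label)
```

The Matthews Correlation Coefficient uses all four cells of the confusion matrix, so it stays informative when non-sites greatly outnumber sites, a situation in which accuracy alone can look high.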