Protein phosphorylation is an important reversible mechanism of post-translational modification,
and it affects many essential cellular processes. Because of the importance of protein phosphorylation
in cellular control, many schemes and models have been proposed to predict kinase-specific phosphorylation
sites. Most methods are based on consensus sequences and position probabilities, as was our previous
version, KinasePhos 1.0, which is also a consensus-based web server. Known phosphorylation sites from
public-domain data sources are categorized by their annotated protein kinases. In the previous version, features
were based on the profile hidden Markov model, and computational models were learned from the kinase-specific groups
of phosphorylation sites. After the learned models were evaluated, the model with the highest accuracy was selected
from each kinase-specific group for use in a web-based prediction tool for identifying protein phosphorylation
sites. The result is a kinase-specific phosphorylation site prediction tool with both high sensitivity and high specificity.
Moreover, the current release of KinasePhos, version 2.0, adopts sequence-based amino acid coupling-pattern
analysis and solvent accessibility as new features for a support vector machine (SVM) to characterize phosphorylation sites.
The coupling-pattern feature [XdZ] denotes a pair of amino acid types X and Z that are separated
by d amino acids. We use the coupling strength CXdZ defined by coupling-pattern analysis and compute its difference
between the positive and negative sets of phosphorylation proteins. We select as features the 250 coupling patterns
with the largest differences in CXdZ. We then build the SVM models and perform cross-validation.
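
The coupling-pattern feature selection described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the KinasePhos 2.0 implementation: the formula for CXdZ is not given here, so the sketch substitutes a simple stand-in (the fraction of sequence windows that contain the pattern) and ranks patterns by the absolute difference between the two sets. The function names, the `max_gap` parameter, and the toy peptide windows are all hypothetical.

```python
from collections import Counter


def coupling_patterns(seq, max_gap=2):
    """Return the set of (X, d, Z) patterns in seq: residue X followed by
    residue Z with exactly d residues in between (0 <= d <= max_gap)."""
    patterns = set()
    for i, x in enumerate(seq):
        for d in range(max_gap + 1):
            j = i + d + 1
            if j < len(seq):
                patterns.add((x, d, seq[j]))
    return patterns


def coupling_strength(windows, max_gap=2):
    """ASSUMED stand-in for C_XdZ: the fraction of windows containing [XdZ].
    The actual coupling-strength definition may differ."""
    counts = Counter()
    for w in windows:
        counts.update(coupling_patterns(w, max_gap))
    n = len(windows)
    return {p: c / n for p, c in counts.items()}


def select_top_patterns(positive, negative, k=250, max_gap=2):
    """Rank patterns by |strength(positive) - strength(negative)|, keep top k."""
    pos = coupling_strength(positive, max_gap)
    neg = coupling_strength(negative, max_gap)
    diff = {p: abs(pos.get(p, 0.0) - neg.get(p, 0.0))
            for p in set(pos) | set(neg)}
    # Descending difference; ties are broken by the pattern itself so the
    # ordering is deterministic.
    return sorted(diff, key=lambda p: (-diff[p], p))[:k]


def encode(window, selected, max_gap=2):
    """Binary feature vector: 1 if the selected pattern occurs in the window."""
    present = coupling_patterns(window, max_gap)
    return [1 if p in present else 0 for p in selected]


# Toy, made-up windows centred on a candidate serine (illustration only).
positive = ["RRASV", "RKASL", "RRSSF"]
negative = ["GLSPE", "ADSGE", "PLSDQ"]
selected = select_top_patterns(positive, negative, k=5)
vector = encode("RRASL", selected)
```

Vectors produced this way, optionally extended with a solvent-accessibility value per site, would then be passed to whichever SVM implementation is in use.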
This prediction model achieves a prediction accuracy of about 95%, an improvement of 7% over the previous version.
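
A cross-validated accuracy of this kind is typically obtained by holding out each fold in turn, training on the rest, and counting correct predictions on the held-out fold. The sketch below illustrates that procedure only; it is not the evaluation code behind the figures above. The helper names are hypothetical, the folds are contiguous (real use would normally shuffle or stratify first), and `fit_majority` is a placeholder learner standing in for an SVM trainer.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validated_accuracy(X, y, fit, k=5):
    """Average accuracy over k folds.
    fit(train_X, train_y) must return a predict(x) callable."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_X = [x for i, x in enumerate(X) if i not in held_out]
        train_y = [t for i, t in enumerate(y) if i not in held_out]
        predict = fit(train_X, train_y)
        correct += sum(predict(X[i]) == y[i] for i in test_idx)
    return correct / len(X)


def fit_majority(train_X, train_y):
    """Placeholder learner (NOT an SVM): always predicts the majority label."""
    majority = max(set(train_y), key=train_y.count)
    return lambda x: majority
```

Any trainer that returns a prediction function, including a wrapper around an SVM library, can be passed as `fit` in place of the placeholder.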
In comparison with other tools, the special features chosen for SVM model building produce the best prediction so far.