The statistics of phosphorylation sites obtained from Phospho.ELM and Swiss-Prot.

| Data source | Number of phosphorylated proteins | Serine (S) sites | Threonine (T) sites | Tyrosine (Y) sites | Histidine (H) sites | Total sites |
|---|---|---|---|---|---|---|
| Phospho.ELM | 3,674 | 9,917 | 1,890 | 1,804 | 1 | 13,612 |
| Swiss-Prot * | 3,148 | 4,846 | 1,035 | 901 | 42 | 6,832 |
| Combined (non-redundant) | 5,842 | 11,888 | 2,433 | 2,179 | 43 | 16,551 |


 
The comparison among KinasePhos 2.0, DISPHOS, PredPhospho, GPS, PPSP and KinasePhos 1.0.

| Tools | DISPHOS | PredPhospho | GPS | PPSP | KinasePhos 1.0 | KinasePhos 2.0 |
|---|---|---|---|---|---|---|
| Method | Logistic regression | SVM | MCL+GPS | BDT | MDD+HMM | CP+SVM |
| No. of kinases | - | 4 groups | 71 groups | 68 groups | 18 | 58 |
| Kinase PKA | - | Sn = 0.88, Sp = 0.91 | Sn = 0.89, Sp = 0.91 | Sn = 0.90, Sp = 0.92 | Sn = 0.91, Sp = 0.86 | Sn = 0.92, Sp = 0.89 |
| Kinase PKC | - | Sn = 0.79, Sp = 0.86 | Sn = 0.82, Sp = 0.83 | Sn = 0.82, Sp = 0.86 | Sn = 0.80, Sp = 0.87 | Sn = 0.84, Sp = 0.86 |
| Kinase CK2 | - | Sn = 0.84, Sp = 0.96 | Sn = 0.83, Sp = 0.88 | Sn = 0.83, Sp = 0.90 | Sn = 0.87, Sp = 0.85 | Sn = 0.87, Sp = 0.86 |
| Serine | Acc = 0.76 | Acc = 0.81 | - | - | Acc = 0.86 | Acc = 0.90 |
| Threonine | Acc = 0.81 | Acc = 0.77 | - | - | Acc = 0.91 | Acc = 0.93 |
| Tyrosine | Acc = 0.83 | - | - | - | Acc = 0.84 | Acc = 0.88 |
| Histidine | - | - | - | - | - | Acc = 0.93 |
| Overall performance | - | Acc = 0.76~0.91 | - | - | Acc = 0.87 | Acc = 0.91 |
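The comparison reports sensitivity (Sn), specificity (Sp) and accuracy (Acc) without defining them. Assuming the standard confusion-matrix definitions apply, the sketch below shows how these values, plus precision (Prec), follow from counts of true/false positives and negatives. The function name and the example counts are hypothetical and are not taken from any of the tools compared here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics (assumed definitions, not confirmed by the source).

    tp/fn: true sites predicted as phosphorylated / not phosphorylated
    tn/fp: non-sites predicted as not phosphorylated / phosphorylated
    All denominators are assumed to be non-zero.
    """
    return {
        "Prec": tp / (tp + fp),                   # precision
        "Sn": tp / (tp + fn),                     # sensitivity (recall)
        "Sp": tn / (tn + fp),                     # specificity
        "Acc": (tp + tn) / (tp + fp + tn + fn),   # accuracy
    }


# Made-up counts for illustration only: 100 positive and 100 negative sites.
metrics = classification_metrics(tp=92, fp=11, tn=89, fn=8)
for name, value in metrics.items():
    print(f"{name} = {value:.2f}")
```

With these invented counts, 92 of 100 positives and 89 of 100 negatives are classified correctly, so Sn is 0.92 and Sp is 0.89.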


 

The statistics of non-redundant kinase-specific phosphorylation sites collected from Swiss-Prot and Phospho.ELM.


Non-redundant set of Swiss-Prot + Phospho.ELM

| Catalytic protein kinases | Number of substrate sites | Serine | Threonine | Tyrosine |
|---|---|---|---|---|
| protein kinase A (PKA) | 441 | 391 | 49 | 1 |
| protein kinase C (PKC) | 411 | 338 | 71 | 2 |
| casein kinase 2 (CK2) | 333 | 282 | 49 | 2 |
| Cyclin-dependent Kinase (CDK) | 325 | 204 | 121 | 0 |
| Mitogen-activated protein kinases (MAPK) | 288 | 200 | 88 | 0 |
| CaM-kinase (CaM) | 106 | 85 | 21 | 0 |
| protein Kinase B (PKB) | 86 | 69 | 17 | 0 |
| G protein-coupled receptor kinase (GRK) | 84 | 58 | 26 | 0 |
| Casein Kinase 1 (CK1) | 77 | 56 | 20 | 1 |
| Cdc2 Protein Kinase (CDC2) | 73 | 46 | 27 | 0 |
| glycogen synthase kinase-3 (GSK-3) | 65 | 42 | 23 | 0 |
| ataxia telangiectasia mutated (ATM) | 59 | 54 | 5 | 0 |
| I kappa B kinase (IKK) | 45 | 45 | 0 | 0 |
| Polo-like kinase 1 (PLK1) | 45 | 33 | 12 | 0 |
| Aurora-related kinase (Aurora) | 45 | 37 | 8 | 0 |
| Ribosomal S6 Kinase (RSK) | 41 | 39 | 2 | 0 |
| phosphorylase kinase (PHK) | 39 | 38 | 1 | 0 |
| AMP-activated protein kinase (AMPK) | 38 | 35 | 3 | 0 |
| phosphoinositide-dependent protein kinase (PDK) | 33 | 14 | 19 | 0 |
| Rho-associated protein kinase (ROCK) | 31 | 9 | 21 | 1 |
| MAP kinase-activated protein kinase 2 (MAPKAPK2) | 30 | 27 | 3 | 0 |
| serine/threonine kinase 4 (STK4) | 28 | 28 | 0 | 0 |
| AKT1 kinase (AKT1) | 28 | 19 | 9 | 0 |
| p21-activated kinase 1 (PAK1) | 26 | 21 | 5 | 0 |
| cGMP-dependent protein kinase (PKG) | 22 | 18 | 4 | 0 |
| DNA dependent protein kinase (DNA-PK) | 20 | 13 | 7 | 0 |
| Death-associated protein kinase (DAPK) | 20 | 9 | 11 | 0 |
| Serine/threonine-protein kinase IPL1 (IPL1) | 19 | 16 | 3 | 0 |
| Serine/threonine-protein kinase Chk2 (CHK2) | 17 | 12 | 5 | 0 |
| Serine/threonine-protein kinase Chk1 (CHK1) | 16 | 13 | 3 | 0 |
| p21-activated kinase 2 (PAK2) | 15 | 13 | 2 | 0 |
| LKB1 kinase (LKB1) | 15 | 1 | 14 | 0 |
| Proto-oncogene tyrosine-protein kinase Src (Src) | 145 | 0 | 0 | 145 |
| lymphocyte specific protein tyrosine kinase (LCK) | 60 | 6 | 5 | 49 |
| Abl Protein Tyrosine Kinase (Abl) | 55 | 0 | 0 | 55 |
| epidermal growth factor receptor (EGFR) | 52 | 0 | 0 | 52 |
| Proto-oncogene tyrosine-protein kinase FYN (Fyn) | 51 | 0 | 0 | 51 |
| spleen tyrosine kinase (SYK) | 49 | 0 | 0 | 49 |
| Insulin receptor (InsR) | 42 | 0 | 0 | 42 |
| Tyrosine-protein kinase LYN (Lyn) | 41 | 0 | 0 | 41 |
| Janus kinase 2 (JAK2) | 35 | 0 | 0 | 35 |
| platelet derived growth factor receptor (PDGF) | 25 | 0 | 0 | 25 |
| Insulin-like growth factor I receptor (IGF1R) | 24 | 0 | 0 | 24 |
| anaplastic lymphoma kinase (ALK) | 22 | 0 | 0 | 22 |
| Ephrin receptor (EPH) | 20 | 0 | 0 | 20 |
| fibroblast growth factor receptor 1 (FGFR1) | 20 | 0 | 0 | 20 |
| Tyrosine-protein kinase ZAP-70 (ZAP70) | 20 | 0 | 0 | 20 |
| Insulin receptor (IR) | 18 | 2 | 2 | 14 |
| Met proto-oncogene tyrosine kinase (MET) | 17 | 0 | 0 | 17 |
| Bruton's tyrosine kinase (BTK) | 16 | 0 | 0 | 16 |
| TRK transforming tyrosine kinase protein (TRK) | 15 | 0 | 0 | 15 |
| Hemopoietic cell kinase (Hck) | 14 | 2 | 3 | 9 |
| Focal adhesion kinase (FAK) | 13 | 0 | 0 | 13 |
| Proto-oncogene tyrosine-protein kinase receptor ret (Ret) | 13 | 0 | 0 | 13 |
| C-SRC kinase (CSK) | 9 | 0 | 0 | 9 |
| Proto-oncogene tyrosine-protein kinase Fes/Fps (Fes) | 9 | 0 | 0 | 9 |
| Proto-oncogene tyrosine-protein kinase FGR (Fgr) | 8 | 0 | 0 | 8 |
| Non-receptor tyrosine-protein kinase TYK2 (TYK2) | 8 | 0 | 0 | 8 |
| Total | 3,751 | 2,285 | 678 | 788 |


The coding methods of sequence profiles.


Coding Method of Protein Sequence

| Coding method | Description |
|---|---|
| BLOSUM62 profile encoding | Each amino acid is encoded by its corresponding row of the BLOSUM62 matrix |
| Reduced alphabet, 3 classes | Polar: RKEDQN; Neutral: GASTPHY; Hydrophobic: CVLIMFW |
| Reduced alphabet, 7 classes | Aliphatic: AILVGP; Acid: DE; Base: HKR; Aromatic: FWY; Amide: NQ; Small hydroxy: ST; Sulfur: CM |
| Reduced alphabet, 8 classes | Aliphatic 1: AGP; Aliphatic 2: ILV; Acid: DE; Base: HKR; Aromatic: FWY; Amide: NQ; Small hydroxy: ST; Sulfur: CM |
| 20-dimensional vector | Each amino acid is mapped to a 20-dimensional vector, e.g. A 10000000000000000000, C 01000000000000000000, D 00100000000000000000, ..., Y 00000000000000000001 |
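As a rough illustration of two of the coding methods listed above, the sketch below encodes a peptide with the 7-class reduced alphabet and with the 20-dimensional vector. The class memberships are copied from the table. Representing each class by an integer index, the function names, and the example peptide are assumptions made for this sketch, since the source does not say how classes are turned into numeric features. The alphabetical residue order is inferred from the A, C, D, ..., Y example.

```python
# 7-class reduced alphabet, with members copied from the coding-method table.
SEVEN_CLASSES = {
    "aliphatic": "AILVGP",
    "acid": "DE",
    "base": "HKR",
    "aromatic": "FWY",
    "amide": "NQ",
    "small hydroxy": "ST",
    "sulfur": "CM",
}

# One-letter codes in alphabetical order (A is position 0, Y is position 19).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Map each residue to the index of its class (the index is an illustrative choice).
RESIDUE_TO_CLASS = {
    residue: class_index
    for class_index, members in enumerate(SEVEN_CLASSES.values())
    for residue in members
}


def reduce_7class(peptide):
    """Replace each residue by the index of its 7-class group."""
    return [RESIDUE_TO_CLASS[residue] for residue in peptide]


def one_hot(peptide):
    """Map each residue to a 20-dimensional vector with a single 1."""
    vectors = []
    for residue in peptide:
        vector = [0] * len(AMINO_ACIDS)
        vector[AMINO_ACIDS.index(residue)] = 1
        vectors.append(vector)
    return vectors


peptide = "RRASV"  # hypothetical example window, not taken from the data sets
print(reduce_7class(peptide))
print("".join(str(bit) for bit in one_hot("A")[0]))
```

The BLOSUM62 profile encoding is not sketched here because it needs the full substitution matrix.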


The best models in each kinase-specific group with at least 20 experimental phosphorylated sites.

| Kinase | Trained feature | Cost value | Gamma value | Prec | Sn | Sp | Acc |
|---|---|---|---|---|---|---|---|
| S_PKA (391) | CP ratio (2.2) + Seq (7 class) | 0.5 | 0.125 | 0.89 | 0.91 | 0.89 | 0.90 |
| S_PKC (338) | CP difference (1.9) + Seq (7 class) | 0.5 | 0.125 | 0.86 | 0.81 | 0.87 | 0.84 |
| S_CK2 (282) | Seq (7 class) | 2 | 0.125 | 0.88 | 0.84 | 0.88 | 0.86 |
| S_CDK (204) | CP ratio (2.1) + Seq (7 class) | 0.5 | 0.125 | 0.85 | 0.95 | 0.83 | 0.89 |
| S_MAPK (200) | CP ratio (2) + Seq (7 class) | 2 | 0.125 | 0.82 | 0.88 | 0.81 | 0.85 |
| S_CaM (85) | CP ratio (1.9) + Seq (7 class) | 2 | 0.125 | 0.93 | 0.94 | 0.93 | 0.94 |
| S_PKB (69) | CP difference (1.2) + Seq (7 class) | 0.5 | 0.125 | 0.95 | 0.99 | 0.94 | 0.96 |
| S_GRK (58) | CP difference (0.7) | 0.5 | 0.125 | 0.94 | 0.98 | 0.94 | 0.96 |
| S_CK1 (56) | CP ratio (1.7) | 0.5 | 0.03125 | 0.96 | 0.96 | 0.96 | 0.96 |
| S_ATM (54) | CP ratio (2) + Seq (7 class) | 8 | 0.03125 | 0.96 | 1.00 | 0.96 | 0.98 |
| S_CDC2 (46) | CP ratio (2.2) | 0.03125 | 0.03125 | 0.95 | 0.98 | 0.95 | 0.96 |
| S_IKK (45) | CP ratio (1.8) + Seq (7 class) | 0.5 | 0.03125 | 0.90 | 1.00 | 0.88 | 0.94 |
| S_GSK-3 (42) | CP ratio (1.4) + Seq (7 class) | 2 | 0.03125 | 0.95 | 0.98 | 0.95 | 0.96 |
| S_RSK (39) | CP ratio (1.7) | 0.03125 | 0.03125 | 0.97 | 1.00 | 0.97 | 0.99 |
| S_PHK (38) | CP ratio (1.1) | 0.5 | 0.03125 | 0.90 | 0.97 | 0.89 | 0.93 |
| S_Aurora (37) | CP ratio (1.7) + Seq (7 class) | 2 | 0.125 | 0.97 | 1.00 | 0.97 | 0.98 |
| S_AMPK (35) | CP ratio (1.1) | 0.03125 | 0.03125 | 0.97 | 1.00 | 0.97 | 0.98 |
| S_PLK1 (33) | CP ratio (1.3) + Seq (7 class) | 2 | 0.125 | 0.97 | 1.00 | 0.97 | 0.98 |
| S_STK4 (28) | CP ratio (1.7) + Seq (7 class) | 0.03125 | 0.0078125 | 0.87 | 0.96 | 0.86 | 0.91 |
| S_MAPKAPK2 (26) | CP ratio (1.1) + Seq (7 class) | 2 | 0.03125 | 0.96 | 1.00 | 0.96 | 0.98 |
| S_PAK1 (21) | CP ratio (1.1) + Seq (7 class) | 2 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| Average | | | | 0.89 | 0.91 | 0.89 | 0.90 |
| T_CDK (121) | CP ratio (2.1) + Seq (7 class) | 0.5 | 0.125 | 0.86 | 0.98 | 0.83 | 0.91 |
| T_MAPK (88) | Seq (20-dimensional vector) | 0.03125 | 0.0078125 | 0.90 | 0.93 | 0.90 | 0.91 |
| T_PKC (71) | CP ratio (2.2) + Seq (7 class) | 0.5 | 0.03125 | 0.92 | 0.86 | 0.92 | 0.89 |
| T_CK2 (49) | CP ratio (1.4) + Seq (7 class) | 2 | 0.125 | 0.93 | 0.98 | 0.93 | 0.95 |
| T_PKA (49) | CP ratio (2) | 0.5 | 0.03125 | 0.93 | 1.00 | 0.92 | 0.96 |
| T_CDC2 (27) | CP ratio (1.1) | 2 | 0.03125 | 0.96 | 1.00 | 0.96 | 0.98 |
| T_GRK (26) | CP ratio (1.1) + Seq (7 class) | 0.03125 | 0.03125 | 0.86 | 0.96 | 0.85 | 0.90 |
| T_GSK-3 (23) | CP ratio (1.3) + Seq (7 class) | 0.5 | 0.03125 | 0.88 | 1.00 | 0.87 | 0.93 |
| T_CaM (21) | CP ratio (1.2) + Seq (7 class) | 2 | 0.03125 | 0.86 | 0.86 | 0.86 | 0.86 |
| T_ROCK (21) | CP ratio (1.2) + Seq (7 class) | 0.5 | 0.03125 | 0.95 | 0.95 | 0.95 | 0.95 |
| T_CK1 (20) | CP ratio (1.2) + Seq (7 class) | 2 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| Average | | | | 0.92 | 0.94 | 0.91 | 0.93 |
| Y_Src (145) | CP difference (1.1) | 0.03125 | 0.125 | 0.82 | 0.86 | 0.81 | 0.84 |
| Y_Abl (55) | CP difference (1.1) | 0.5 | 0.125 | 0.88 | 0.95 | 0.87 | 0.91 |
| Y_EGFR (52) | CP difference (0.9) | 0.5 | 0.125 | 0.87 | 1.00 | 0.84 | 0.92 |
| Y_Fyn (51) | CP ratio (2) | 0.5 | 0.125 | 0.80 | 0.98 | 0.76 | 0.87 |
| Y_Lck (49) | CP difference (1) | 0.5 | 0.125 | 0.85 | 0.98 | 0.83 | 0.91 |
| Y_Syk (49) | CP ratio (1.6) | 0.5 | 0.03125 | 0.84 | 0.96 | 0.82 | 0.89 |
| Y_INSR (42) | CP ratio (500) | 0.03125 | 0.125 | 0.90 | 1.00 | 0.89 | 0.95 |
| Y_Lyn (41) | CP ratio (1.8) | 0.5 | 0.125 | 0.87 | 0.98 | 0.85 | 0.91 |
| Y_JAK2 (35) | CP ratio (1.1) | 0.03125 | 0.03125 | 0.86 | 0.97 | 0.85 | 0.91 |
| Y_PDGF (25) | CP ratio (1.6) + Seq (7 class) | 0.03125 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| Y_IGF1R (24) | CP ratio (1.1) + Seq (7 class) | 0.03125 | 0.03125 | 0.94 | 0.94 | 0.94 | 0.94 |
| Y_ALK (22) | CP ratio (2.2) | 0.03125 | 0.125 | 0.69 | 0.82 | 0.64 | 0.73 |
| Y_EPH (20) | CP ratio (1.2) | 0.03125 | 0.125 | 1.00 | 1.00 | 1.00 | 1.00 |
| Y_FGFR1 (20) | CP ratio (1.1) + Seq (7 class) | 0.03125 | 0.03125 | 0.86 | 0.95 | 0.85 | 0.90 |
| Y_ZAP70 (20) | CP ratio (1.4) | 0.03125 | 0.03125 | 0.95 | 1.00 | 0.95 | 0.98 |
| Average | | | | 0.87 | 0.90 | 0.86 | 0.88 |
| H_phosphohistidine (43) | CP ratio (1.6) | 2 | 0.03125 | 0.91 | 0.93 | 0.91 | 0.93 |
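The Prec and Acc columns above can be related to Sn and Sp by a short calculation. If a model is evaluated on equal numbers of positive and negative sites (an assumption of this sketch, not something the table states), then Acc = (Sn + Sp) / 2 and Prec = Sn / (Sn + 1 - Sp). The snippet checks this against four rows copied from the table. The function name and the 0.01 rounding tolerance are illustrative choices.

```python
def balanced_prec_acc(sn, sp):
    """Precision and accuracy implied by Sn and Sp when the positive and
    negative sets are the same size (assumed, not confirmed by the source).
    Assumes sn + (1 - sp) is non-zero."""
    acc = (sn + sp) / 2
    prec = sn / (sn + (1 - sp))
    return prec, acc


# (Prec, Sn, Sp, Acc) copied from the table above.
ROWS = {
    "S_PKA": (0.89, 0.91, 0.89, 0.90),
    "S_CDK": (0.85, 0.95, 0.83, 0.89),
    "Y_Fyn": (0.80, 0.98, 0.76, 0.87),
    "Y_ALK": (0.69, 0.82, 0.64, 0.73),
}

for kinase, (prec, sn, sp, acc) in ROWS.items():
    implied_prec, implied_acc = balanced_prec_acc(sn, sp)
    # Table values are rounded to two decimals, so allow a small tolerance.
    consistent = abs(implied_prec - prec) <= 0.01 and abs(implied_acc - acc) <= 0.01
    print(f"{kinase}: Prec {implied_prec:.3f}, Acc {implied_acc:.3f}, consistent={consistent}")
```

These four rows agree with the balanced-set relation to within rounding. That agreement is suggestive only and does not establish how the evaluation sets were actually built.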


The best models in each kinase-specific group with fewer than 20 experimental phosphorylated sites.

| Kinase | Trained feature | Cost value | Gamma value | Prec | Sn | Sp | Acc |
|---|---|---|---|---|---|---|---|
| S_AKT1 (19) | CP ratio (3.9) + Seq (7 class) | 0.03125 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| S_PKG (18) | CP ratio (500) + Seq (7 class) | 0.03125 | 0.03125 | 0.95 | 1.00 | 0.94 | 0.97 |
| S_IPL1 (16) | CP ratio (1.1) + Seq (7 class) | 0.5 | 0.03125 | 0.95 | 1.00 | 0.95 | 0.98 |
| S_PDK (14) | CP ratio (1.1) | 2 | 0.5 | 0.93 | 1.00 | 0.93 | 0.96 |
| S_CHK1 (13) | CP ratio (1.2) + Seq (7 class) | 0.03125 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| S_DNA_PK (13) | CP ratio (1.1) + Seq (7 class) | 0.03125 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| S_PAK2 (13) | CP difference (1.1) | 0.03125 | 0.125 | 0.86 | 1.00 | 0.83 | 0.92 |
| S_CHK2 (12) | CP ratio (1.2) + Seq (7 class) | 0.03125 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| Average | | | | 0.97 | 1.00 | 0.97 | 0.98 |
| T_PDK-1 (19) | CP ratio (1.4) | 0.5 | 0.03125 | 0.95 | 1.00 | 0.94 | 0.97 |
| T_PKB (17) | CP ratio (1.5) + Seq (7 class) | 0.03125 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| T_LKB1 (14) | CP ratio (1.1) + Seq (7 class) | 2 | 0.125 | 1.00 | 1.00 | 1.00 | 1.00 |
| T_PLK1 (12) | CP ratio (1.3) + Seq (7 class) | 2 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| T_DAPK (11) | Seq (Blosum62) | 0.03125 | 0.0078125 | 0.86 | 0.55 | 0.91 | 0.73 |
| Average | | | | 0.96 | 0.93 | 0.97 | 0.95 |
| Y_MET (17) | CP ratio (1.1) | 0.5 | 0.03125 | 0.76 | 0.94 | 0.71 | 0.82 |
| Y_BTK (16) | CP ratio (1.2) + Seq (7 class) | 0.03125 | 0.03125 | 0.94 | 0.94 | 0.94 | 0.94 |
| Y_TRK (15) | CP ratio (1.2) + Seq (7 class) | 2 | 0.03125 | 0.88 | 1.00 | 0.87 | 0.93 |
| Y_IR (14) | CP ratio (1.5) | 0.03125 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| Y_FAK (13) | CP ratio (1.1) | 0.03125 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| Y_Ret (13) | CP ratio (1.1) + Seq (7 class) | 2 | 0.125 | 0.92 | 0.85 | 0.92 | 0.88 |
| Y_CSK (9) | CP ratio (1.5) + Seq (7 class) | 2 | 0.03125 | 1.00 | 1.00 | 1.00 | 1.00 |
| Y_Fes (9) | CP ratio (1.4) + Seq (7 class) | 0.03125 | 0.03125 | 0.89 | 0.89 | 0.89 | 0.89 |
| Y_Hck (9) | CP ratio (1.3) + Seq (7 class) | 2 | 0.03125 | 0.89 | 0.89 | 0.89 | 0.89 |
| Y_Fgr (8) | CP ratio (1.3) + Seq (7 class) | 2 | 0.125 | 0.90 | 0.87 | 0.93 | 0.90 |
| Y_TYK2 (8) | CP ratio (1.4) + Seq (7 class) | 0.03125 | 0.03125 | 0.78 | 0.88 | 0.75 | 0.81 |
| Average | | | | 0.90 | 0.90 | 0.92 | 0.91 |
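The Cost and Gamma values in the two model tables above take only a handful of distinct values, and each is a power of two. The sketch below makes that explicit by recovering the base-2 exponents. This pattern is what a logarithmically spaced parameter search would produce, but that the models were tuned this way is an assumption, since the source does not describe the search. The function name is hypothetical.

```python
import math

# Distinct Cost and Gamma values copied from the two model tables above.
COST_VALUES = [0.03125, 0.5, 2, 8]
GAMMA_VALUES = [0.0078125, 0.03125, 0.125, 0.5]


def power_of_two_exponents(values):
    """Return the base-2 exponent of each value; raise if a value is not a power of two."""
    exponents = []
    for value in values:
        exponent = math.log2(value)
        if not exponent.is_integer():
            raise ValueError(f"{value} is not a power of two")
        exponents.append(int(exponent))
    return exponents


cost_exponents = power_of_two_exponents(COST_VALUES)
gamma_exponents = power_of_two_exponents(GAMMA_VALUES)
print("Cost  = 2^k for k in", cost_exponents)
print("Gamma = 2^k for k in", gamma_exponents)
```

Every exponent that occurs is odd, which would be consistent with a search grid that moves in steps of 2 in the exponent.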



Bid Lab, Institute of Bioinformatics, National Chiao Tung University, Taiwan.
Contact us: bryan@mail.nctu.edu.tw with questions or comments.
Website: http://KinasePhos2.mbc.nctu.edu.tw/