Prediction of protein Post-Translational Modification sites: An overview

Md. Mehedi Hasan* and Mst. Shamima Khatun

Published: 02 March, 2018 | Volume 2 - Issue 1 | Pages: 049-057

Post-translational modification (PTM) refers to the covalent and generally enzymatic modification of proteins during or after protein biosynthesis. During protein biosynthesis, ribosomes translate mRNA into polypeptide chains, which may then undergo PTM to form the mature protein product [1]. PTM is a common biological mechanism in both eukaryotic and prokaryotic organisms; it regulates protein function, the proteolytic cleavage of regulatory subunits, and the degradation of entire proteins, and it affects all aspects of cellular life. The PTM of a protein can also determine its cell signaling state, turnover, localization, and interactions with other proteins [2]. Therefore, the analysis of proteins and their PTMs is particularly important for the study of heart disease, cancer, neurodegenerative diseases, and diabetes [3,4]. Although the characterization of PTMs provides invaluable insight into cellular functions in etiological processes, challenges remain. Technically, the major challenges in studying PTMs are the development of specific detection and purification methods.

DOI: 10.29328/journal.apb.1001005


  1. Knorre DG, Kudryashova NV, Godovikova TS. Chemical and functional aspects of posttranslational modification of proteins. Acta Naturae. 2009; 1: 29-51. Ref.: https://goo.gl/bHviVJ
  2. Xie L, Liu W, Li Q, Chen S, Xu M, et al. First succinyl-proteome profiling of extensively drug-resistant Mycobacterium tuberculosis revealed involvement of succinylation in cellular physiology. J Proteome Res. 2015; 14: 107-119. Ref.: https://goo.gl/7JwQLd
  3. Yang M, Yang J, Zhang Y, Zhang W. Influence of succinylation on physicochemical property of yak casein micelles. Food Chem. 2016; 190: 836-842. Ref.: https://goo.gl/eqErGv
  4. Rohira AD, Chen CY, Allen JR, Johnson DL. Covalent small ubiquitin-like modifier (SUMO) modification of Maf1 protein controls RNA polymerase III-dependent transcription repression. J Biol Chem. 2013; 288: 19288-19295. Ref.: https://goo.gl/WG8vq3
  5. Medzihradszky KF. Peptide sequence analysis. Methods Enzymol. 2005; 402: 209-244. Ref.: https://goo.gl/9Kfp94
  6. Agarwal KL, Kenner GW, Sheppard RC. Feline gastrin. An example of peptide sequence analysis by mass spectrometry. J Am Chem Soc. 1969; 91: 3096-3097. Ref.: https://goo.gl/tck65Z
  7. Welsch DJ, Nelsestuen GL. Amino-terminal alanine functions in a calcium-specific process essential for membrane binding by prothrombin fragment 1. Biochemistry. 1988; 27: 4939-4945. Ref.: https://goo.gl/FwgX1a
  8. Slade DJ, Subramanian V, Fuhrmann J, Thompson PR. Chemical and biological methods to detect post-translational modifications of arginine. Biopolymers. 2014; 101: 133-143. Ref.: https://goo.gl/qBW8uZ
  9. Umlauf D, Goto Y, Feil R. Site-specific analysis of histone methylation and acetylation. Methods Mol Biol. 2004; 287: 99-120. Ref.: https://goo.gl/zjNS6r
  10. Jaffrey SR, Erdjument-Bromage H, Ferris CD, Tempst P, Snyder SH. Protein S-nitrosylation: a physiological signal for neuronal nitric oxide. Nat Cell Biol. 2001; 3: 193-197. Ref.: https://goo.gl/q2hteS
  11. Doll S, Burlingame AL. Mass spectrometry-based detection and assignment of protein posttranslational modifications. ACS Chem Biol. 2015; 10: 63-71. Ref.: https://goo.gl/fZ5uQy
  12. Richards AL, Hebert AS, Ulbrich A, Bailey DJ, Coughlin EE, et al. One-hour proteome analysis in yeast. Nat Protoc. 2015; 10: 701-714. Ref.: https://goo.gl/NjFpTb
  13. Hebert AS, Richards AL, Bailey DJ, Ulbrich A, Coughlin EE, et al. The one hour yeast proteome. Mol Cell Proteomics. 2014; 13: 339-347. Ref.: https://goo.gl/WsZKTg
  14. Imamura H, Sugiyama N, Wakabayashi M, Ishihama Y. Large-scale identification of phosphorylation sites for profiling protein kinase selectivity. J Proteome Res. 2014; 13: 3410-3419. Ref.: https://goo.gl/1uM654
  15. Masuda T, Sugiyama N, Tomita M, Ishihama Y. Microscale phosphoproteome analysis of 10,000 cells from human cancer cell lines. Anal Chem. 2011; 83: 7698-7703. Ref.: https://goo.gl/3dc9dM
  16. Trinidad JC, Barkan DT, Gulledge BF, Thalhammer A, Sali A, et al. Global identification and characterization of both O-GlcNAcylation and phosphorylation at the murine synapse. Mol Cell Proteomics. 2012; 11: 215-229. Ref.: https://goo.gl/ceuTj1
  17. Olsen JV, Vermeulen M, Santamaria A, Kumar C, Miller ML, et al. Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis. Sci Signal. 2010; 3: ra3. Ref.: https://goo.gl/L9ss6F
  18. Choudhary C, Kumar C, Gnad F, Nielsen ML, Rehman M, et al. Lysine acetylation targets protein complexes and co-regulates major cellular functions. Science. 2009; 325: 834-840. Ref.: https://goo.gl/Aju8io
  19. Kim W, Bennett EJ, Huttlin EL, Guo A, Li J, et al. Systematic and quantitative assessment of the ubiquitin-modified proteome. Mol Cell. 2011; 44: 325-340. Ref.: https://goo.gl/a4ADaR
  20. Hendriks IA, D'Souza RC, Yang B, Verlaan-de Vries M, Mann M, et al. Uncovering global SUMOylation signaling networks in a site-specific manner. Nat Struct Mol Biol. 2014; 21: 927-936. Ref.: https://goo.gl/HZn2sq
  21. Syka JE, Coon JJ, Schroeder MJ, Shabanowitz J, Hunt DF. Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc Natl Acad Sci U S A. 2004; 101: 9528-9533. Ref.: https://goo.gl/wSMjGt
  22. Myers SA, Daou S, Affar el B, Burlingame A. Electron transfer dissociation (ETD): the mass spectrometric breakthrough essential for O-GlcNAc protein site assignments-a study of the O-GlcNAcylated protein host cell factor C1. Proteomics. 2013; 13: 982-991. Ref.: https://goo.gl/nm45xC
  23. Ramstrom M, Sandberg H. Characterization of gamma-carboxylated tryptic peptides by collision-induced dissociation and electron transfer dissociation mass spectrometry. Eur J Mass Spectrom (Chichester, Eng). 2011; 17: 497-506. Ref.: https://goo.gl/XouSno
  24. Moremen KW, Tiemeyer M, Nairn AV. Vertebrate protein glycosylation: diversity, synthesis and function. Nat Rev Mol Cell Biol. 2012; 13: 448-462. Ref.: https://goo.gl/qxaWhh
  25. Han X, Yang K, Gross RW. Multi-dimensional mass spectrometry-based shotgun lipidomics and novel strategies for lipidomic analyses. Mass Spectrom Rev. 2012; 31: 134-178. Ref.: https://goo.gl/fkeRkS
  26. Tan M, Peng C, Anderson KA, Chhoy P, Xie Z, et al. Lysine glutarylation is a protein posttranslational modification regulated by SIRT5. Cell Metab. 2014; 19: 605-617. Ref.: https://goo.gl/jYHNdT
  27. Basu A, Rose KL, Zhang J, Beavis RC, Ueberheide B, et al. Proteome-wide prediction of acetylation substrates. Proc Natl Acad Sci U S A. 2009; 106: 13785-13790. Ref.: https://goo.gl/iRi8D7
  28. Striebel F, Imkamp F, Sutter M, Steiner M, Mamedov A, et al. Bacterial ubiquitin-like modifier Pup is deamidated and conjugated to substrates by distinct but homologous enzymes. Nat Struct Mol Biol. 2009; 16: 647-651. Ref.: https://goo.gl/YD2Y8P
  29. DeMartino GN. PUPylation: something old, something new, something borrowed, something Glu. Trends Biochem Sci. 2009; 34: 155-158. Ref.: https://goo.gl/XGN8T3
  30. Passerini A, Punta M, Ceroni A, Rost B, Frasconi P. Identifying cysteines and histidines in transition-metal-binding sites using support vector machines and neural networks. Proteins. 2006; 65: 305-316. Ref.: https://goo.gl/BnZ38n
  31. Youn E, Peters B, Radivojac P, Mooney SD. Evaluation of features for catalytic residue prediction in novel folds. Protein Sci. 2007; 16: 216-226. Ref.: https://goo.gl/Xrxuto
  32. Sharma A, Rastogi T, Bhartiya M, Shasany AK, Khanuja SP. Type 2 diabetes mellitus: phylogenetic motifs for predicting protein functional sites. J Biosci. 2007; 32: 999-1004. Ref.: https://goo.gl/KhffLS
  33. Vandermarliere E, Martens L. Protein structure as a means to triage proposed PTM sites. Proteomics. 2013; 13: 1028-1035. Ref.: https://goo.gl/npNYGF
  34. Ren J, Wen L, Gao X, Jin C, Xue Y, et al. CSS-Palm 2.0: an updated software for palmitoylation sites prediction. Protein Eng Des Sel. 2008; 21: 639-644. Ref.: https://goo.gl/8qJhj2
  35. Liu Z, Cao J, Ma Q, Gao X, Ren J, et al. GPS-YNO2: computational prediction of tyrosine nitration sites in proteins. Mol Biosyst. 2011; 7: 1197-1204. Ref.: https://goo.gl/h1nSr8
  36. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997; 25: 3389-3402. Ref.: https://goo.gl/QDHQR3
  37. Hasan MM, Khatun MS. Recent progress and challenges for protein pupylation sites prediction. EC Proteomics and Bioinformatics. 2017; 2.1: 36-45.
  38. Hasan MM, Zhou Y, Lu X, Li J, Song J, et al. Computational Identification of Protein Pupylation Sites by Using Profile-Based Composition of k-Spaced Amino Acid Pairs. PLoS One. 2015; 10: e0129635. Ref.: https://goo.gl/nENNxR
  39. Gobel U, Sander C, Schneider R, Valencia A. Correlated mutations and residue contacts in proteins. Proteins. 1994; 18: 309-317. Ref.: https://goo.gl/7nnsc4
  40. Lockless SW, Ranganathan R. Evolutionarily conserved pathways of energetic connectivity in protein families. Science. 1999; 286: 295-299. Ref.: https://goo.gl/gkajNd
  41. Dekker JP, Fodor A, Aldrich RW, Yellen G. A perturbation-based method for calculating explicit likelihood of evolutionary co-variance in multiple sequence alignments. Bioinformatics. 2004; 20: 1565-1572. Ref.: https://goo.gl/vpaeS8
  42. Hasan MM, Khatun MS, Mollah MNH, Yong C, Guo D. A systematic identification of species-specific protein succinylation sites using joint element features information. Int J Nanomedicine. 2017; 12: 6303-6315. Ref.: https://goo.gl/KP5B9P
  43. Halperin I, Glazer DS, Wu S, Altman RB. The FEATURE framework for protein function annotation: modeling new functions, improving performance, and extending to novel applications. BMC Genomics. 2008; 9 Suppl 2: S2. Ref.: https://goo.gl/QJMzEc
  44. Mooney SD, Liang MH, DeConde R, Altman RB. Structural characterization of proteins using residue environments. Proteins. 2005; 61: 741-747. Ref.: https://goo.gl/okAL7j
  45. Amitai G, Shemesh A, Sitbon E, Shklar M, Netanely D, et al. Network analysis of protein structures identifies functional residues. J Mol Biol. 2004; 344: 1135-1146. Ref.: https://goo.gl/sTTkh1
  46. Rani P, Pudi V. RBNBC: Repeat Based Naive Bayes Classifier for Biological Sequences. ICDM 2008: Eighth IEEE International Conference on Data Mining. 2008; Proceedings: 989-994.
  47. Hand DJ, Yu K. Idiot's Bayes: Not So Stupid after All? International Statistical Review / Revue Internationale de Statistique. 2001; 69: 385-398.
  48. Shao J, Xu D, Tsai SN, Wang Y, Ngai SM. Computational identification of protein methylation sites through bi-profile Bayes feature extraction. PLoS One. 2009; 4: e4920. Ref.: https://goo.gl/KPoSNi
  49. Zhang SW, Pan Q, Zhang HC, Shao ZC, Shi JY. Prediction of protein homo-oligomer types by pseudo amino acid composition: Approached with an improved feature extraction and Naive Bayes Feature Fusion. Amino Acids. 2006; 30: 461-468. Ref.: https://goo.gl/o9AG12
  50. Sheppard S, Lawson ND, Zhu LJ. Accurate identification of polyadenylation sites from 3' end deep sequencing using a naive Bayes classifier. Bioinformatics. 2013; 29: 2564-2571. Ref.: https://goo.gl/tNVeZn
  51. Yang P, Humphrey SJ, Fazakerley DJ, Prior MJ, Yang G, et al. Re-fraction: a machine learning approach for deterministic identification of protein homologues and splice variants in large-scale MS-based proteomics. J Proteome Res. 2012; 11: 3035-3045. Ref.: https://goo.gl/MyCAHJ
  52. Simon P. Too Big to Ignore: The Business Case for Big Data. Wiley, 2013; 89.
  53. Breiman L. Random Forests. Machine Learning. 2001; 45: 5-32. Ref.: https://goo.gl/9rqw7o
  54. Maclin R, Opitz D. Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research. 1999; 11: 169-198. Ref.: https://goo.gl/ugm7T4
  55. Polikar R. Ensemble based systems in decision making. IEEE Circuits and Systems Magazine. 2006; 6: 21-45. Ref.: https://goo.gl/GAnEij
  56. Rokach L. Ensemble-based classifiers. Artificial Intelligence Review. 2010; 33: 1-39. Ref.: https://goo.gl/naMCA5
  57. Brown G, Wyatt J, Harris R, Yao X. Diversity creation methods: a survey and categorisation. Information Fusion. 2005; 6: 5-20. Ref.: https://goo.gl/ABKNwa
  58. Adeva JJG, Beresi U, Calvo R. Accuracy and diversity in ensembles of text categorisers. CLEI Electronic Journal. 2005; 9: 1-12. Ref.: https://goo.gl/c3vzuR
  59. Liu ZP, Wu LY, Wang Y, Zhang XS, Chen L. Prediction of protein-RNA binding sites by a random forest method with combined features. Bioinformatics. 2010; 26: 1616-1622. Ref.: https://goo.gl/TQHQRE
  60. Kumar KK, Pugalenthi G, Suganthan PN. DNA-Prot: identification of DNA binding proteins from protein sequence information using random forest. J Biomol Struct Dyn. 2009; 26: 679-686. Ref.: https://goo.gl/gXLBHT
  61. Qi Y, Klein-Seetharaman J, Bar-Joseph Z. Random forest similarity for protein-protein interaction prediction from multiple sources. Pac Symp Biocomput. 2005; 531-542. Ref.: https://goo.gl/kU7VD1
  62. Hasan MM, Guo D, Kurata H. Computational identification of protein S-sulfenylation sites by incorporating the multiple sequence features information. Mol Biosyst. 2017; 13: 2545-2550. Ref.: https://goo.gl/JhMKEE
  63. Hasan MM, Yang S, Zhou Y, Mollah MN. SuccinSite: a computational tool for the prediction of protein succinylation sites by exploiting the amino acid patterns and properties. Mol Biosyst. 2016; 12: 786-795. Ref.: https://goo.gl/Zezfm1
  64. Cortes C, Vapnik V. Support-vector networks. Machine Learning. 1995; 20: 273-297. Ref.: https://goo.gl/RE4bJo
  65. Chang CC, Lin CJ. LIBSVM: A Library for Support Vector Machines. ACM Transactions on Intelligent Systems and Technology. 2011; 2. Ref.: https://goo.gl/Jx29pP
  66. Pavlidis P, Wapinski I, Noble WS. Support vector machine classification on the web. Bioinformatics. 2004; 20: 586-587. Ref.: https://goo.gl/guqAUu
  67. Frank E, Hall M, Trigg L, Holmes G, Witten IH. Data mining in bioinformatics using Weka. Bioinformatics. 2004; 20: 2479-2481. Ref.: https://goo.gl/QQdQtq
  68. Chen X, Qiu JD, Shi SP, Suo SB, Liang RP. Systematic analysis and prediction of pupylation sites in prokaryotic proteins. PLoS One. 2013; 8: e74002. Ref.: https://goo.gl/h8t9mH
  69. Tung CW. Prediction of pupylation sites using the composition of k-spaced amino acid pairs. J Theor Biol. 2013; 336: 11-17. Ref.: https://goo.gl/AhZmz8
  70. Wu S, Zhang Y. A comprehensive assessment of sequence-based and template-based methods for protein contact prediction. Bioinformatics. 2008; 24: 924-931. Ref.: https://goo.gl/BsZmRP
  71. Yan RX, Si JN, Wang C, Zhang Z. DescFold: a web server for protein fold recognition. BMC Bioinformatics. 2009; 10: 416. Ref.: https://goo.gl/NaWMFM
  72. Guo J, Chen H, Sun Z, Lin Y. A novel method for protein secondary structure prediction using dual-layer SVM and profiles. Proteins. 2004; 54: 738-743. Ref.: https://goo.gl/hNVe7r
  73. Minsky M, Papert S. Perceptrons: An Introduction to Computational Geometry. 1969; ISBN 0-262-63022-2.
  74. Fukushima K. Cognitron: a self-organizing multilayered neural network. Biol Cybern. 1975; 20: 121-136. Ref.: https://goo.gl/hzsy1e
  75. Tang YR, Chen YZ, Canchaya CA, Zhang Z. GANNPhos: a new phosphorylation site predictor based on a genetic algorithm integrated neural network. Protein Eng Des Sel. 2007; 20: 405-412. Ref.: https://goo.gl/GJH3G8
  76. Blom N, Sicheritz-Ponten T, Gupta R, Gammeltoft S, Brunak S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics. 2004; 4: 1633-1649. Ref.: https://goo.gl/dGmYaQ
  77. Dehouck Y, Grosfils A, Folch B, Gilis D, Bogaerts P, et al. Fast and accurate predictions of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC-2.0. Bioinformatics. 2009; 25: 2537-2543. Ref.: https://goo.gl/BhKBfr
  78. Jones DT. Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol. 1999; 292: 195-202. Ref.: https://goo.gl/nUkouC
  79. McGuffin LJ, Bryson K, Jones DT. The PSIPRED protein structure prediction server. Bioinformatics. 2000; 16: 404-405. Ref.: https://goo.gl/UW6fu4
  80. Bienkowska JR, Dalgin GS, Batliwalla F, Allaire N, Roubenoff R, et al. Convergent Random Forest predictor: methodology for predicting drug response from genome-scale data applied to anti-TNF response. Genomics. 2009; 94: 423-432. Ref.: https://goo.gl/55hyK


