
Prediction of protein-protein interaction by metasample-based sparse representation. (English) Zbl 1394.92038

Summary: Protein-protein interactions (PPIs) play key roles in many cellular processes such as transcription regulation, cell metabolism, and endocrine function. Understanding these interactions greatly advances the study of the pathogenesis and treatment of various diseases. A large amount of data has been generated by experimental techniques; however, most of these data are incomplete or noisy, and current biological experimental techniques are time-consuming and expensive. In this paper, we propose a novel method (metasample-based sparse representation classification, MSRC) for PPI prediction. A group of metasamples is extracted from the original training samples, and the \(l_1\)-regularized least squares method is then used to express a new testing sample as a linear combination of these metasamples. PPI prediction is achieved by using a discrimination function defined on the representation coefficients. The MSRC is applied to a PPI dataset; it achieves 84.9% sensitivity and 94.55% specificity, which is slightly lower than support vector machine (SVM) and much higher than naive Bayes (NB), neural networks (NN), and \(k\)-nearest neighbor (KNN). The result shows that the MSRC is efficient for PPI prediction.
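The pipeline described in the summary (extract metasamples, solve an \(l_1\)-regularized least squares problem, classify from the coefficients) can be illustrated with a minimal sketch. Several choices below are assumptions, since the summary does not specify them: metasamples are taken as the leading left singular vectors of each class's training matrix, the \(l_1\) problem is solved with plain iterative soft-thresholding (ISTA) rather than the solver used in the paper, and the discrimination function is the class-wise reconstruction residual. All function names are hypothetical, and the data are synthetic stand-ins for encoded protein pairs.

```python
import numpy as np


def extract_metasamples(X, k):
    """Return k metasamples for one class (assumption: leading left singular vectors).

    X has shape (n_features, n_samples); the result has shape (n_features, k).
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def l1_least_squares(A, y, lam, n_iter=500):
    """Minimise 0.5 * ||A x - y||_2^2 + lam * ||x||_1 with ISTA (swapped-in solver)."""
    lipschitz = np.linalg.norm(A, 2) ** 2  # largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / lipschitz, lam / lipschitz)
    return x


def msrc_predict(class_metasamples, y, lam=0.01):
    """Classify y by the smallest class-wise residual (assumed discrimination function).

    class_metasamples is a list with one (n_features, k_c) array per class.
    Returns the predicted class index and the list of residuals.
    """
    A = np.hstack(class_metasamples)
    x = l1_least_squares(A, y, lam)
    residuals = []
    start = 0
    for W in class_metasamples:
        stop = start + W.shape[1]
        # Reconstruct y using only the coefficients that belong to this class.
        residuals.append(float(np.linalg.norm(y - W @ x[start:stop])))
        start = stop
    return int(np.argmin(residuals)), residuals


# Synthetic demo: two classes, each lying in its own 2-dimensional subspace.
rng = np.random.default_rng(0)
bases = [rng.standard_normal((20, 2)) for _ in range(2)]
train = [B @ rng.standard_normal((2, 30)) for B in bases]
metasamples = [extract_metasamples(X, k=2) for X in train]

test_sample = bases[1] @ np.array([1.0, -2.0])  # drawn from class 1
predicted, residuals = msrc_predict(metasamples, test_sample)
```

In a PPI setting, class 0 and class 1 would correspond to non-interacting and interacting protein pairs, and each column would be a feature vector encoding one pair; that encoding is not described in the summary and is left out here.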

MSC:

92C40 Biochemistry, molecular biology
62P10 Applications of statistics to biology and medical sciences; meta analysis
68T10 Pattern recognition, speech recognition

Software:

PDCO
