An efficient genomic signature ranking method for genomic island prediction from a single genome. (English) Zbl 1409.92170

Summary: Genomic islands are associated with microbial adaptations and carry genomic signatures different from those of the host. Many methods have therefore been proposed to select informative genomic signatures from a range of organisms and to discriminate genomic islands from the rest of the genome by these signature biases. However, they are of limited use when closely related genomes are unavailable. In the present work, we propose a kurtosis-based ranking method to select the informative genomic signatures from a single genome. In simulations with alien fragments from artificial and real genomes, the proposed kurtosis-based ranking method efficiently selected the informative genomic signatures from a single genome, without annotated information of genomes or prior knowledge from other datasets. This understanding can be useful for designing more powerful methods for genomic island detection.
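
The summary does not spell out how the kurtosis-based ranking is computed, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes that a "genomic signature" is a k-mer, that its relative frequency is measured in sliding windows along a single genome, and that signatures are ranked by the excess kurtosis of those per-window frequencies. The function names and the parameters `k`, `window`, and `step` are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product


def window_kmer_frequencies(genome, k, window, step):
    """Relative frequency of every k-mer in each sliding window (assumed setup)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    rows = []
    for start in range(0, len(genome) - window + 1, step):
        chunk = genome[start:start + window]
        counts = Counter(chunk[i:i + k] for i in range(len(chunk) - k + 1))
        # Only k-mers over A/C/G/T are counted; guard against an empty total.
        total = max(sum(counts[m] for m in kmers), 1)
        rows.append([counts[m] / total for m in kmers])
    return kmers, rows


def excess_kurtosis(values):
    """Population excess kurtosis m4 / m2**2 - 3; 0.0 when the variance is ~0."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 < 1e-18:  # effectively constant signature: no tail information
        return 0.0
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


def rank_signatures(genome, k=2, window=200, step=100):
    """Rank k-mers by the excess kurtosis of their per-window frequencies.

    Assumption: a heavy-tailed frequency distribution across windows hints
    that a few windows (candidate islands) use the k-mer atypically.
    """
    kmers, rows = window_kmer_frequencies(genome, k, window, step)
    scores = {m: excess_kurtosis([row[j] for row in rows])
              for j, m in enumerate(kmers)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_signatures(sequence)` on a genome string returns `(k-mer, score)` pairs in descending order of score. Because kurtosis is unchanged by shifting and rescaling, a k-mer that is depleted in a few windows scores much like one that is enriched there, and k-mers that occur only at fragment boundaries can score very highly, so a practical method would likely need additional filtering.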

MSC:

92D10 Genetics and epigenetics
62P10 Applications of statistics to biology and medical sciences; meta analysis
62F07 Statistical ranking and selection procedures
