Intelligent Classification and Analysis of Essential Genes Using Quantitative Methods

Published: 17 April 2020

Abstract

Essential genes are the genes an organism requires to sustain life. They encode proteins that maintain central metabolism, DNA replication, translation, and basic cellular structure, and that mediate transport within and out of the cell. Identifying essential genes is one of the fundamental problems in computational genomics. In the present study, to discriminate essential genes from other genes from a non-biologist's perspective, the purine and pyrimidine distribution over the essential genes of four exemplary species, namely Homo sapiens, Arabidopsis thaliana, Drosophila melanogaster, and Danio rerio, is examined thoroughly using quantitative methods. Moreover, an intelligent classification method has been deployed to classify the essential genes of these species. Based on Shannon entropy, fractal dimension, Hurst exponent, and the distribution of purine and pyrimidine bases, 10 different clusters have been generated for the essential genes of the four species. Some proximity results are also reported for the clusters of essential genes.
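
As a rough illustration of two of the quantities named in the abstract, the sketch below maps a nucleotide sequence to a purine/pyrimidine string and computes the Shannon entropy of that string. The abstract does not give the authors' exact definitions, so the function names, the single-symbol entropy, and the decision to skip ambiguous bases are all assumptions made here for illustration, not the paper's method.

```python
import math
from collections import Counter

# Standard biochemical grouping: A and G are purines; C, T and U are pyrimidines.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def to_purine_pyrimidine(seq):
    """Map a nucleotide sequence to 'R' (purine) / 'Y' (pyrimidine) symbols.

    Assumption: symbols outside A, C, G, T, U (e.g. 'N') are skipped.
    """
    out = []
    for base in seq.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
    return "".join(out)


def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the symbol frequencies in `symbols`."""
    n = len(symbols)
    if n == 0:
        return 0.0
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def purine_fraction(seq):
    """Fraction of recognised bases in `seq` that are purines."""
    ry = to_purine_pyrimidine(seq)
    return ry.count("R") / len(ry) if ry else 0.0


if __name__ == "__main__":
    gene = "ATGGCGTACGTTAGC"  # made-up toy sequence, not data from the study
    ry = to_purine_pyrimidine(gene)
    print(ry, shannon_entropy(ry), purine_fraction(gene))
```

Per-gene values such as these could then serve as features for a clustering step; the fractal dimension and Hurst exponent mentioned in the abstract would need their own estimators and are not sketched here.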

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 16, Issue 1s
Special Issue on Multimodal Machine Learning for Human Behavior Analysis and Special Issue on Computational Intelligence for Biomedical Data and Imaging
January 2020
376 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3388236

Copyright © 2020 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 17 April 2020
• Accepted: 1 July 2019
• Revised: 1 June 2019
• Received: 1 May 2019

      Qualifiers

      • research-article
      • Research
      • Refereed
