Title Application of computational approaches to analyze metagenomic data
Author Ho-Jin Gwak1, Seung Jae Lee1, and Mina Rho1,2*
Address 1Department of Computer Science and Engineering, Hanyang University, Seoul 04763, Republic of Korea, 2Department of Biomedical Informatics, Hanyang University, Seoul 04763, Republic of Korea
Bibliography Journal of Microbiology, 59(3), 233–241, 2021
DOI 10.1007/s12275-021-0632-8
Key Words microbiome, metagenome, metatranscriptome, assembly, contig binning, classification, functional potential
Abstract Microorganisms play a vital role in living systems in numerous ways. In soil and ocean environments, microbes are involved in diverse processes, such as the carbon and nitrogen cycles, nutrient recycling, and energy acquisition. The relation between microbial dysbiosis and disease development has been extensively studied. In particular, microbial communities in the human gut are associated with the pathophysiology of several chronic diseases such as inflammatory bowel disease and diabetes. Therefore, analyzing the distribution of microorganisms and their associations with the environment is a key step in understanding nature. With the advent of next-generation sequencing technology, a vast amount of metagenomic data on unculturable microbes, in addition to culturable microbes, has been produced. To reconstruct microbial genomes, several assembly algorithms have been developed that account for metagenomic features, such as uneven sequencing depth. Since it is difficult to reconstruct complete microbial genomes from metagenomic reads, contig binning approaches have been proposed to group contigs that originate from the same genome. To estimate the microbial composition in the environment, various methods have been developed to classify individual reads or contigs and profile bacterial proportions. Since microbial communities affect their hosts and environments through metabolites, metabolic profiles have been estimated from metagenomic or metatranscriptomic data. Here, we provide a comprehensive review of computational methods that can be applied to investigate microbiomes using metagenomic and metatranscriptomic sequencing data. The limitations of metagenomic studies and the key approaches to overcome such problems are discussed.