Title Machine learning methods for microbiome studies
Author Junghyun Namkung*
Address Data Analytics CoE, Data R&D Center, SK Telecom, Seoul 04539, Republic of Korea
Bibliography Journal of Microbiology, 58(3), 206-216, 2020
DOI 10.1007/s12275-020-0066-8
Key Words machine learning, microbiome, supervised, unsupervised, deep learning, semi-supervised
Abstract Research on the microbiome has been actively conducted worldwide, and the results have shown that the human gut bacterial environment significantly affects the immune system, psychological conditions, cancers, obesity, and metabolic diseases. Thanks to advances in sequencing technology, microbiome studies with large numbers of samples are now feasible at an acceptable cost. Large sample sizes allow more sophisticated modeling with machine learning approaches to study relationships between the microbiome and various traits. This article provides an overview of machine learning methods for non-data scientists interested in the association analysis of microbiomes and host phenotypes. Once the genomic features of the microbiome are determined, various analysis methods can be used to explore the relationship between the microbiome and host phenotypes, including penalized regression, support vector machine (SVM), random forest, and artificial neural network (ANN). Deep neural network methods are also touched upon. The analysis procedure, from environment setup to extraction of analysis results, is presented using the Python programming language.
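As a rough illustration of the workflow the abstract describes, the following minimal sketch fits the four named model families to a microbiome-like feature table and compares them by cross-validation. It is not taken from the article: the use of scikit-learn, the synthetic count table, the log relative-abundance transform, and all hyperparameters are assumptions chosen only to keep the example self-contained.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Assumption: a synthetic stand-in for a taxon count table
# (120 samples x 30 taxa) with a binary host phenotype.
n_samples, n_taxa = 120, 30
y = rng.integers(0, 2, size=n_samples)
counts = rng.poisson(lam=20, size=(n_samples, n_taxa)).astype(float)
counts[y == 1, :3] *= 3  # three taxa are enriched in phenotype-positive hosts

# Convert counts to relative abundances, then log-transform with a
# small pseudocount (one common choice, assumed here for illustration).
rel_abundance = counts / counts.sum(axis=1, keepdims=True)
X = np.log(rel_abundance + 1e-6)

# One representative model per method family named in the abstract.
models = {
    # L2-penalized (ridge-type) logistic regression
    "penalized_regression": make_pipeline(
        StandardScaler(), LogisticRegression(C=1.0, max_iter=1000)
    ),
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "ann": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

With real data, `X` would come from the study's own feature table and `y` from the recorded host phenotype. Scaling is placed inside each pipeline so that it is refit within every cross-validation fold and does not leak information from the held-out samples.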