Title |
Setup of a scientific computing environment for computational biology: Simulation of a genome-scale metabolic model of Escherichia coli as an example |
Author |
Junhyeok Jeon1 and Hyun Uk Kim1,2,3* |
Address |
1Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea, 2KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, Republic of Korea, 3BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon 34141, Republic of Korea |
Bibliography |
Journal of Microbiology, 58(3), 227-234, 2020 |
DOI |
10.1007/s12275-020-9516-6 |
Key Words |
computational biology, scientific computing environment, Python, Jupyter Notebook, Anaconda, Google Colaboratory, genome-scale metabolic model |
Abstract |
Computational analysis of biological data is becoming increasingly
important, especially in this era of big data. It allows
biological insights to be derived efficiently from given data,
sometimes including counterintuitive ones that challenge
existing knowledge.
Among experimental researchers without any prior exposure
to computer programming, such analysis has often been
considered a task reserved for computational biologists.
However, thanks to the increasing availability of user-friendly
computational resources, experimental researchers can now
easily access a scientific computing environment
and the packages necessary for data analysis. In this regard,
we here describe the process of accessing Jupyter Notebook,
the most popular Python coding environment, to conduct
computational biology. Python is currently a mainstream programming
language for biology and biotechnology. In particular,
Anaconda and Google Colaboratory are introduced as
two representative options to easily launch Jupyter Notebook.
Finally, the Python package COBRApy is demonstrated as an
example to simulate 1) the specific growth rate of Escherichia coli
as well as the compounds consumed or generated in a minimal
medium with glucose as the sole carbon source, and 2)
the theoretical production yield of succinic acid, an industrially
important chemical, using E. coli. This protocol should serve
as a guide to further computational analyses of biological
data for experimental researchers without a computational
background. |
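The two simulations named in the abstract can be sketched in a Jupyter Notebook cell roughly as follows. This is a minimal illustration under stated assumptions, not the protocol from the paper: it uses the small E. coli core model bundled with COBRApy ("textbook") instead of a genome-scale model, it assumes the BiGG-style exchange reaction IDs `EX_glc__D_e` and `EX_succ_e`, and the function names `max_growth_rate`, `max_succinate_yield`, and `demo` are hypothetical helpers introduced here.

```python
def max_growth_rate(model):
    """Maximize the model's default (biomass) objective and return its value (1/h)."""
    with model:  # changes made inside the block are reverted on exit
        return model.optimize().objective_value


def max_succinate_yield(model, glucose_id="EX_glc__D_e", succinate_id="EX_succ_e"):
    """Return the maximum succinate secretion per unit of glucose taken up (mol/mol).

    The reaction IDs are assumptions that follow BiGG naming conventions.
    """
    with model:
        # Switch the objective from biomass to succinate secretion.
        model.objective = succinate_id
        solution = model.optimize()
        # Uptake through an exchange reaction is reported as a negative flux.
        glucose_uptake = -solution.fluxes[glucose_id]
        return solution.objective_value / glucose_uptake


def demo():
    # Imported here so the helper functions above can be defined
    # even where COBRApy is not installed.
    from cobra.io import load_model

    # Assumption: the bundled E. coli core model stands in for a genome-scale model.
    model = load_model("textbook")

    print("Specific growth rate (1/h):", max_growth_rate(model))
    # The summary lists the compounds consumed and generated at the optimum.
    print(model.summary())
    print("Succinate yield (mol/mol glucose):", max_succinate_yield(model))


if __name__ == "__main__":
    demo()
```

Both helpers accept any loaded COBRApy model, so the same calls should apply to a genome-scale E. coli model as long as its exchange reaction IDs match the ones passed in.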