Title Introducing EzAAI: a pipeline for high throughput calculations of prokaryotic average amino acid identity
Author Dongwook Kim, Sein Park, and Jongsik Chun
Address Interdisciplinary Program in Bioinformatics, Institute of Molecular Biology & Genetics, School of Biological Sciences, Seoul National University, Seoul 08826, Republic of Korea
Bibliography Journal of Microbiology, 59(5), 476–480, 2021
DOI 10.1007/s12275-021-1154-0
Key Words average amino acid identity, comparative genomics, phylogeny, software suite
Abstract The average amino acid identity (AAI) is an index of pairwise genomic relatedness, and multiple studies have proposed its application in prokaryotic taxonomy and related disciplines. AAI resolves taxonomic structure above the species rank better than average nucleotide identity (ANI), which is a standard criterion for species delineation. However, an efficient and easy-to-use computational tool for AAI calculation in large-scale taxonomic studies is not yet available. Here, we introduce a bioinformatic pipeline, named EzAAI, which allows rapid and accurate AAI calculation for prokaryotic sequences. EzAAI is based on the MMseqs2 program and computes AAI values almost identical to those generated by the standard BLAST algorithm, while running significantly faster. The pipeline also provides a hierarchical clustering function for creating dendrograms, which are an essential part of any taxonomic study. EzAAI is available for download as a standalone Java program at http://leb.snu.ac.kr/ezaai.
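The abstract does not spell out how AAI is computed. As a rough illustration only, the sketch below assumes the commonly used definition of AAI as the mean percent identity over reciprocal best protein hits between two genomes. It is not a description of EzAAI's internals. The function names, the input layout (a map from query protein ID to its best target and percent identity), and the toy values are all hypothetical. The final line assumes that 100 minus AAI is used as the distance passed to hierarchical clustering, which is one plausible choice rather than the documented one.

```python
from statistics import mean


def reciprocal_best_hits(a_to_b, b_to_a):
    """Return identities for protein pairs that are each other's best hit.

    a_to_b and b_to_a map a query protein ID to a tuple of
    (best_target_id, percent_identity). This layout is hypothetical.
    """
    identities = []
    for a_id, (b_id, ident_ab) in a_to_b.items():
        back = b_to_a.get(b_id)
        if back is not None and back[0] == a_id:
            # Average the two directional identities for the pair.
            identities.append((ident_ab + back[1]) / 2)
    return identities


def aai(a_to_b, b_to_a):
    """Mean identity over reciprocal best hits (assumed AAI definition)."""
    identities = reciprocal_best_hits(a_to_b, b_to_a)
    if not identities:
        raise ValueError("no reciprocal best hits between the two genomes")
    return mean(identities)


# Toy best-hit tables for genomes A and B (made-up values).
a_to_b = {"a1": ("b1", 90.0), "a2": ("b2", 80.0), "a3": ("b1", 60.0)}
b_to_a = {"b1": ("a1", 92.0), "b2": ("a2", 78.0), "b3": ("a3", 55.0)}

aai_value = aai(a_to_b, b_to_a)
# Assumed conversion of AAI to a distance for clustering.
distance = 100.0 - aai_value
```

In this toy input, `a3` is excluded because its best hit `b1` points back to `a1`, so only two pairs contribute to the mean.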