Title Discovery of novel glycoside hydrolases from C-glycoside-degrading bacteria using sequence similarity network analysis
Author Bin Wei1,2,3†, Ya-Kun Wang1†, Jin-Biao Yu1, Si-Jia Wang1,4, Yan-Lei Yu1, Xue-Wei Xu3*, and Hong Wang1,2*
Address 1College of Pharmaceutical Science & Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals, Zhejiang University of Technology, Hangzhou 310014, P. R. China, 2Key Laboratory of Marine Fishery Resources Exploitment & Utilization of Zhejiang Province, Hangzhou 310014, P. R. China, 3Key Laboratory of Marine Ecosystem and Biogeochemistry, State Oceanic Administration & Second Institute of Oceanography, Ministry of Natural Resources, Hangzhou 310012, P. R. China, 4Center for Human Nutrition, David Geffen School of Medicine, University of California, Los Angeles, California 90024, USA
Bibliography Journal of Microbiology, 59(10), 931-940, 2021
DOI 10.1007/s12275-021-1292-4
Key Words C-glycosides, puerarin, daidzin, sequence similarity network, molecular docking, inhibition
Abstract C-Glycosides are an important class of natural products with significant bioactivities, and their C-glycosidic bonds can be cleaved by several intestinal bacteria, as exemplified by the human faeces-derived puerarin-degrading bacterium Dorea strain PUE. However, the glycoside hydrolases in these bacteria, which may be involved in C-glycosidic bond cleavage, remain largely unknown. In this study, the genomes of the closest phylogenetic neighbours of five puerarin-degrading intestinal bacteria (including Dorea strain PUE) were retrieved, and the protein-coding genes in these genomes were subjected to sequence similarity network (SSN) analysis. Only four gene clusters were both annotated as glycoside hydrolases and observed in the genome of D. longicatena DSM 13814T (the closest phylogenetic neighbour of Dorea strain PUE); therefore, the genes from D. longicatena DSM 13814T belonging to these clusters were selected for overexpression as recombinant proteins (CG1, CG2, CG3, and CG4) in Escherichia coli BL21(DE3). In vitro assays indicated that CG4 efficiently cleaved the O-glycosidic bond of daidzin and showed moderate β-D-glucosidase and β-D-xylosidase activity. CG2 showed weak activity in hydrolyzing daidzin and pNP-β-D-fucopyranoside, while CG3 was identified as a highly selective and efficient α-glycosidase. Interestingly, CG3 and CG4 could be selectively inhibited by daidzein, explaining their different performance in kinetic studies. Molecular docking studies predicted the molecular determinants of substrate selectivity and inhibition propensity in CG2, CG3, and CG4. The present study identified three novel and distinctive glycoside hydrolases, highlighting the potential of SSN analysis for the discovery of novel enzymes from genomic data.
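The clustering idea behind an SSN can be illustrated with a minimal sketch. It is not the authors' pipeline, which the abstract does not describe. It assumes that pairwise similarity scores between protein sequences are already available (for example, from an all-vs-all alignment), and it treats each cluster as a connected component of the graph whose edges are the pairs scoring at or above a cutoff. The function name `ssn_clusters`, the gene identifiers, the scores, and the threshold are all hypothetical.

```python
from collections import defaultdict


def ssn_clusters(edges, threshold):
    """Group sequence IDs into clusters.

    A cluster is a connected component of the graph whose edges are the
    (a, b, score) pairs with score >= threshold. Union-find keeps this simple.
    """
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk to the root
        # with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in edges:
        root_a, root_b = find(a), find(b)  # registers both nodes
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)

    # Largest clusters first; ties broken alphabetically for stable output.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Hypothetical toy data: (sequence_a, sequence_b, similarity score).
edges = [
    ("geneA", "geneB", 80.0),
    ("geneB", "geneC", 65.0),
    ("geneC", "geneD", 20.0),  # below the cutoff, so no edge is drawn
    ("geneD", "geneE", 90.0),
]
clusters = ssn_clusters(edges, threshold=40.0)
print(clusters)
```

Raising the threshold splits the network into smaller, more stringent clusters, while lowering it merges them. In practice the cutoff is tuned so that each cluster groups proteins of plausibly similar function.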