Comparative genomics of toxinotypes identifies module-based toxin gene evolution.


is a common cause of nosocomial diarrhoea. Toxins TcdA and TcdB are considered to be the main virulence factors and are encoded by the PaLoc region, while the binary toxin encoded in the CdtLoc region also contributes to pathogenicity. Variant toxinotypes reflect the genetic diversity of a key toxin-encoding 19 kb genetic element (the PaLoc). Here, we present analysis of a comprehensive collection of all known major toxinotypes to address the evolutionary relationships of the toxin gene variants, the mechanisms underlying the origin and development of variability in toxin genes and the PaLoc, and the relationship between structure and function in TcdB variants. The structure of both toxin genes is modular, composed of interspersed blocks of sequences corresponding to functional domains and having different evolutionary histories, as shown by the distribution of mutations along the toxin genes and by incongruences of domain phylogenies compared to overall cluster organization. In TcdB protein, four mutation patterns could be differentiated, which correlated very well with the type of TcdB cytopathic effect (CPE) on cultured cells. Mapping these mutations to the three-dimensional structure of the TcdB showed that the majority of the variation occurs in surface residues and that point mutation at residue 449 in alpha helix 16 differentiated strains with different types of CPE. In contrast to the PaLoc, phylogenetic trees of the CdtLoc were more consistent with the core genome phylogenies, but there were clues that CdtLoc can also be exchanged between strains.