Laboratory of Structural Biology: Bochtler Laboratory

   Our group currently works on sequence- and modification-specific protein-nucleic acid interactions. With an FNP Team grant to support our work, the focus is now almost exclusively on DNA methylation and hydroxymethylation. These modifications are present in both prokaryotes and eukaryotes, but they have very different roles in these organisms. Our group seeks to exploit the prokaryotic biology of DNA methylation and hydroxymethylation to develop tools for the study of these modifications in eukaryotes, particularly zebrafish and mice. In 2012, our efforts were concentrated on one story that emphasizes the deep evolutionary roots of DNA methylation.

    DNA cytosine methylation in many eukaryotic species is predominantly found in the context of the CpG dinucleotide. It is thought to be essential in mammals and many other model animals, with the notable exception of the fruit fly Drosophila melanogaster. However, cytosine methylation comes at a price. Genome-wide studies have consistently shown that CpG methylation in eukaryotes is associated with CpG depletion. As a result, the CpG dinucleotide is found several-fold less frequently in nuclear DNA of higher mammals, including humans, than one might expect based on the GC content of the DNA. The reasons for the link between CpG methylation and depletion are both chemical and biological. At the chemical level, cytosine methylation promotes deamination, which converts methylcytosine to thymine. More importantly, at the biological level, a thymine that arises from methylcytosine is a regular DNA base, so it is difficult to identify as a damage product and difficult to repair back to cytosine through DNA repair pathways. Both the chemical and biological arguments for the link between CpG methylation and depletion are fairly fundamental and should apply to all kingdoms of life.
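    The comparison between the observed CpG frequency and the frequency "one might expect based on the GC content" can be made concrete with a small calculation. The sketch below uses a widely used observed/expected normalization (CpG count times sequence length, divided by the product of the C and G counts). It is an illustration only, not necessarily the statistic used in the studies referred to above, and the function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one DNA strand.

    Expected CpG count assumes C and G occur independently:
        expected = (#C * #G) / N
    so the ratio is (#CpG * N) / (#C * #G).
    Values well below 1 indicate CpG depletion.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    cpg = seq.count("CG")
    if c == 0 or g == 0:
        # Ratio is undefined without both bases; report 0.0 by convention here.
        return 0.0
    return cpg * n / (c * g)


if __name__ == "__main__":
    # Toy sequences, not real genomic data.
    print(cpg_obs_exp("CCGG"))  # one CpG where one is expected
    print(cpg_obs_exp("CATG"))  # no CpG at all
```

    A ratio near 1 means CpG occurs about as often as the base composition predicts; a several-fold depleted genome would give a ratio of roughly 0.2 to 0.3.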

   Hence, one can ask the question, “Is it possible to discover novel prokaryotic CpG methyltransferases (like the previously found CpG-specific M.SssI) by searching bacterial genomes for CpG depletion?” We did exactly this and scanned all fully sequenced bacterial genomes in the NCBI sequence collection for CpG underrepresentation. We found several drastically (i.e., approximately 10-fold) CpG-depleted bacterial species. Although the statistical signatures of CpG depletion differed from those in eukaryotes (e.g., in the comparison of coding and non-coding regions), we followed up on this observation by testing the drastically CpG-depleted bacteria for CpG methylation. In the case of Mycoplasma penetrans, a genome-wide study of cytosine methylation using bisulfite sequencing identified global CpG methylation and several other universally methylated sequences. To identify the M. penetrans CpG methyltransferase, we picked a candidate protein on the basis of remote amino acid sequence similarity to M.SssI. Using bisulfite sequencing and other CpG methylation assays (i.e., HpaII/MspI digestion), we demonstrated in vitro that our candidate enzyme was indeed a CpG-specific DNA methyltransferase and hence named it M.MpeI in accordance with nomenclature guidelines.
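   The logic of a bisulfite sequencing readout can be sketched in a few lines. Bisulfite treatment converts unmethylated cytosines so that they are read as thymines, whereas methylated cytosines are protected and are still read as cytosines. The example below is a toy model of that principle on a single strand with perfect conversion; it is not the analysis pipeline used for the M. penetrans data, and the function names are hypothetical.

```python
def bisulfite_convert(seq: str, methylated: set[int]) -> str:
    """Simulate the read obtained after bisulfite treatment.

    Unmethylated C is read as T; C at a position listed in
    `methylated` (0-based indices) is protected and stays C.
    Assumes complete conversion and no sequencing errors.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference: str, read: str) -> dict[int, bool]:
    """For every C in the reference, report whether the read still shows C.

    True means the cytosine was protected, i.e. called methylated.
    """
    return {
        i: (read_base == "C")
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C"
    }


if __name__ == "__main__":
    reference = "ACGTCCGA"   # toy sequence with two CpG sites
    methylated = {1, 5}      # the two cytosines that sit in CpG context
    read = bisulfite_convert(reference, methylated)
    print(read)
    print(call_methylation(reference, read))
```

   In this toy case only the cytosines followed by guanine survive conversion, which is the pattern expected from a CpG-specific methyltransferase.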


Fig. 1. (A) Initial screen for CpG depletion in bacterial genomes. The GpC/CpG dinucleotide ratio was used as a “proxy” for CpG depletion in order to normalize for the GC content. Every dot in the diagram represents one bacterial genome. (B-D) The same for related dinucleotide ratios as a control.
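A screen of the kind shown in panel A can be sketched as follows. GpC has the same base composition as CpG, so the GpC/CpG ratio stays close to 1 unless CpG is specifically depleted, whatever the GC content of the genome. Both dinucleotides are their own reverse complement, so counting on one strand is sufficient. The function names, the toy sequences, and the threshold value are assumptions made for illustration and are not taken from the original analysis.

```python
import math


def dinucleotide_count(seq: str, dinuc: str) -> int:
    """Count occurrences of a dinucleotide in one strand."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def gpc_cpg_ratio(seq: str) -> float:
    """GpC/CpG ratio; large values suggest CpG depletion."""
    cpg = dinucleotide_count(seq, "CG")
    gpc = dinucleotide_count(seq, "GC")
    if cpg == 0:
        # No CpG at all: infinite depletion if GpC exists, undefined otherwise.
        return math.inf if gpc else math.nan
    return gpc / cpg


def screen(genomes: dict[str, str], threshold: float = 5.0) -> list[str]:
    """Return the names of genomes whose ratio reaches the threshold.

    The default threshold is a hypothetical cutoff, not a published one.
    """
    return sorted(
        name for name, seq in genomes.items()
        if gpc_cpg_ratio(seq) >= threshold
    )


if __name__ == "__main__":
    # Toy sequences standing in for whole genomes.
    genomes = {
        "balanced": "GCGCACG",
        "depleted": "GCAGCTGCAGCCG",
    }
    for name, seq in genomes.items():
        print(name, gpc_cpg_ratio(seq))
    print(screen(genomes, threshold=3.0))
```

With real data, each sequence would be a complete genome read from a FASTA file, and each computed ratio would correspond to one dot in the diagram.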


 How does M.MpeI recognize its CpG target sequence with extraordinary specificity? To answer this question, we crystallized M.MpeI with target DNA and solved the structure. Unsurprisingly, we found typical features of CpG methyltransferase-DNA complexes, such as flipping of the substrate cytosine and the proximity of the co-factor to the substrate base for direct transfer of a methyl group. More interestingly, the DNA structure was perturbed not only in the substrate strand but also in the complementary strand. In this strand, we detected intercalation of a phenylalanine residue between the C and G nucleotides of the CpG site. The 5'-pyrimidine-purine-3' steps are thought to be easier to unstack than other dinucleotide steps. Hence, intercalation might contribute to CpG readout. This concept is supported by the recent structure of the eukaryotic DNA maintenance methyltransferase Dnmt1 in complex with target DNA, which also shows unstacking of the CpG step.



Fig. 2. Structures of C5 methyltransferases in productive complexes with target DNA. Only the structure of M.MpeI with DNA is our work; the other two structures were drawn according to coordinates from other laboratories.


    If CpG methylation damages genomes, then what benefit do bacteria gain from retaining a CpG-specific DNA methyltransferase? We are presently unable to answer this question, but several possibilities exist. The methyltransferase might be part of a CpG-specific restriction-modification system. Alternatively, and somewhat improbably in light of our genome-wide methylation data, it might play a role as an epigenetic regulator. Finally, bacterial CpG methylation might be involved in host-pathogen interactions. Although the claim is still debated, most authors now agree that CpG-unmethylated DNA is far more immunogenic than CpG-methylated DNA.

    Hence, CpG methylation might help bacteria dodge the host immune system. If so, then our findings could also have medical applications because at least some of the CpG-specific DNA methyltransferases are found in human pathogens.