Structural Mass Spectrometry of Protein Assemblies

Top-down Proteomics

 

In Theory

Bottom-up analysis of a mixture of protein isoforms (proteoforms) carrying different combinations of PTMs is usually unsuccessful, because most of the information on how the PTMs combine on each molecule is lost upon trypsin digestion (Figure 1). In contrast, in a top-down analysis each proteoform is isolated and fragmented individually, which provides additional information on the primary structure of the isoforms. Although early top-down analyses used collision-induced dissociation (CID), alternative dissociation techniques based on electron transfer (ETD) or electron capture (ECD) now provide complementary fragmentation profiles and significantly increase protein sequence coverage. Furthermore, ETD and ECD, unlike CID, do not cause the loss of certain labile PTMs such as phosphorylation, which explains their increasingly common use in proteomics.

Figure 1. The combination of post-translational modifications (green symbols), mutations (colored squares) and alternative splicing (dotted squares), can sometimes produce numerous protein isoforms or proteoforms. After digestion (bottom-up, left), it is impossible to know the origin of the tryptic peptides. In top-down (right), each proteoform is selected and fragmented individually, allowing their discrimination and identification.
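
The loss of information illustrated in Figure 1 can be reproduced with a toy model. The sketch below is only an illustration under stated assumptions: a hypothetical protein made of two tryptic peptides, each either modified or not, and two invented mixtures. The function names are not taken from any software.

```python
from collections import Counter

def bottom_up_view(mixture):
    """Count peptide states after digestion; which peptides shared a molecule is lost."""
    peptides = Counter()
    for proteoform in mixture:
        for peptide_index, is_modified in enumerate(proteoform):
            peptides[(peptide_index, is_modified)] += 1
    return peptides

def top_down_view(mixture):
    """Count intact proteoforms; each one keeps its full combination of PTMs."""
    return Counter(mixture)

# Hypothetical toy protein: two tryptic peptides, each modified (True) or not (False).
mixture_a = [(True, False), (False, True)]  # two singly modified proteoforms
mixture_b = [(True, True), (False, False)]  # one doubly modified, one unmodified

same_peptides = bottom_up_view(mixture_a) == bottom_up_view(mixture_b)
same_proteoforms = top_down_view(mixture_a) == top_down_view(mixture_b)
print(same_peptides, same_proteoforms)  # True False
```

Both mixtures yield the same set of peptides after digestion, so bottom-up cannot tell them apart, whereas the intact proteoforms differ.
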
In Practice

Protein mixtures (~2 pmol / 50 ng) are desalted on C4 cartridges and separated on C4 nano-columns.

The raw spectra of multiply charged ions (Figure 2A) can be deconvoluted to obtain very accurate molecular weights with isotopic distributions (Figure 2B).

 

Figure 2: MS spectra corresponding to a ~15 kDa protein.
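
The principle behind charge-state deconvolution can be sketched in a few lines. This is a minimal illustration, not the software used here: it assumes positive-mode electrospray in which every charge is a proton, and that two adjacent peaks of the same protein have already been picked. All function names are hypothetical, and real deconvolution tools fit the whole charge-state envelope rather than a single pair of peaks.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da

def mz_from_mass(mass, charge):
    """m/z of a protein of neutral mass `mass` carrying `charge` protons."""
    return (mass + charge * PROTON_MASS) / charge

def neutral_mass(mz, charge):
    """Neutral mass recovered from one peak of known charge."""
    return charge * (mz - PROTON_MASS)

def deconvolute_pair(mz_low, mz_high):
    """Infer charge and mass from two adjacent charge states.

    mz_low carries z + 1 charges and mz_high carries z charges.
    Returns (z, neutral mass averaged over both peaks).
    """
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    mass = (neutral_mass(mz_high, z) + neutral_mass(mz_low, z + 1)) / 2
    return z, mass

# Simulated peaks for a hypothetical 15 kDa protein at charge states 11+ and 10+.
charge, mass = deconvolute_pair(mz_from_mass(15000.0, 11), mz_from_mass(15000.0, 10))
```

With these simulated peaks the function recovers the 10+ charge state and the 15 kDa mass used to generate them.
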

For more complex mixtures, raw data can be automatically deconvoluted and 3D plots generated, showing the pattern of the most abundant proteins (Figure 3).

Figure 3: Protein footprint of an immunopurified sample.

Targeted top-down analysis (ETD fragmentation) can provide information on post-translational modifications (PTMs) and protein identification (see examples below).

Proteoform repertoire of M. tuberculosis LpqH lipoglycoprotein

collaboration with Jerome Nigou’s team (Michel Rivière, ANR TbGlycoPrOmics)

In this study, we used a combination of top-down and bottom-up LC-MSMS to fully describe the proteoforms of the virulence-associated LpqH antigen from Mycobacterium tuberculosis. Although this protein had been described as both acylated and glycosylated, there was no evidence that the two modifications occur together on the same molecule.

Starting from His-tagged protein recombinantly expressed in Mycobacterium smegmatis, we identified more than 130 different proteoforms by LC-MS (Figure 4A). The mass accuracy (<1 Da error) allowed the identification of 13 cleavage sites, 0 to 9 hexose units, and 0, 2 or 3 acylations. After tryptic digestion, peptide 27-51 was found with 0 to 9 glycosylations. Bottom-up ETD highlighted a sequential glycosylation of residues Thr41/35/34 and 36 (Figure 5A), while top-down ETD identified a phosphorylation on the C-terminal part of the pre-prolipoprotein (Figure 5B). Altogether, these results enabled a better description of the proteoforms of LpqH, opening the door to further experiments that should reveal their functions.

Figure 4. (A) LC-MS analysis of His-tagged LpqH from M. tuberculosis allowed the separation of more than 130 proteoforms. (B) Deconvolution of the first chromatographic peak (I) identifies unacylated and truncated forms of the protein with or without mannose decorations. The number of mannose units (2-8) is indicated for each primary sequence (color-coded). Adapted from (Parra et al, 2017).

Figure 5. (A) ETD fragmentation of peptide 27-51 allowed the identification of 4 glycosylation sites. (B) Top-down ETD localized a phosphorylation site on stretch 146-154. Adapted from (Parra et al, 2017).
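
Assigning the number of hexose units from an intact mass comes down to matching the observed mass shift against a ladder of hexose increments. The sketch below illustrates that idea only. It assumes average masses (a hexose residue, C6H10O5, adds about 162.14 Da), reuses the 1 Da tolerance and the 0 to 9 range quoted above, and takes the mass of the unglycosylated sequence as a hypothetical input. The function name and the example masses are invented for illustration.

```python
HEXOSE_AVG = 162.141  # average mass added per hexose residue (C6H10O5), in Da

def assign_hexose_count(observed_mass, unmodified_mass, max_hexose=9, tolerance=1.0):
    """Return (hexose count, mass error in Da) for the best match, or None.

    Tries 0..max_hexose units and keeps the count whose theoretical mass
    lies closest to the observed mass, provided it is within `tolerance`.
    """
    best = None
    for count in range(max_hexose + 1):
        error = observed_mass - (unmodified_mass + count * HEXOSE_AVG)
        if abs(error) <= tolerance and (best is None or abs(error) < abs(best[1])):
            best = (count, error)
    return best

# Hypothetical masses: a 15 kDa sequence observed with a shift of about 3 hexoses.
match = assign_hexose_count(15000.0 + 3 * HEXOSE_AVG + 0.4, 15000.0)
no_match = assign_hexose_count(15000.0 + 80.0, 15000.0)
```

A shift that does not fit the hexose ladder, such as the +80 Da in the second call, returns no match and would have to be explained by another modification or cleavage site.
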

 

Identification of combinations of PTMs of porins from C. glutamicum

collaboration with Mamadou Daffé’s and Alain Milon’s teams (Marie Renault, ANR CoryNMR)

The main purpose of this project was to understand how porins from Corynebacterium glutamicum are targeted to the mAGP complex of the mycomembrane. Interestingly, fractionation of the cell lysates showed that these porins were not only targeted to the mAGP complex, but also found in the secretomes (Figure 6A). We set out to analyse these different fractions by top-down LC-MS and found that the extracellular porins were unmodified, whereas the porins embedded in the mAGP were either mono- or di-mycoloylated (Figure 6B-C).

Figure 6. (A) Expression of the porins A/H/B/C was found in both the mAGP complex and the extra-membranous (EM) fractions. LC-MS analysis could separate (B) and identify unmodified porins in the EM and mono/di-mycoloylated porins in the mAGP based on their accurate MWs (C). Adapted from (Carel et al, 2017).

Further top-down LC-MSMS experiments (Figure 7A) enabled a detailed investigation of these fractions, revealing specific and well-conserved PTMs, including O-mycoloylation, pyroglutamylation and N-formylation (Figure 7B). Sequence analysis of the PTM sites of C. glutamicum outer membrane proteins (OMPs) and of other O-acylated proteins in bacteria and eukaryotes revealed unique patterns (Figure 7C). Furthermore, we found that such modifications were essential for targeting to the mycomembrane and sufficient for OMP assembly into mycolic acid-containing lipid bilayers. Collectively, it appears that these PTMs have evolved in the Corynebacteriales order and beyond, potentially to guide membrane proteins towards a specific cell compartment.

Figure 7. (A) MSMS spectrum of the di-mycoloylated porin B showing characteristic y fragments obtained by Collision Induced Dissociation (CID) and enabling the localization of PTMs. (B) Summary of all the PTMs identified on the four porins. (C) Moderate consensus of O-mycoloylation highlighted in this study. Adapted from (Carel et al, 2017).

 

References

2017

Parra J*, Marcoux J*, Poncin I, Canaan S, Herrmann JL, Nigou J, Burlet-Schiltz O, Rivière M (2017) “Scrutiny of Mycobacterium tuberculosis 19kDa antigen proteoforms provides new insights in the lipoglycoprotein biogenesis paradigm” Scientific Reports 7:43682.

Carel C*, Marcoux J*, Parra J, Réat V, Latgé G, Laval F, Burlet-Schiltz O, Demange P, Milon A, Daffé M, Tropis M, Renault M (2017) “Post-translational O-mycoloylation mediates protein targeting to the mycomembrane in C. glutamicum” Proceedings of the National Academy of Sciences 2017 Apr 3. pii: 201617888.