Secretory and membrane N‐linked glycoproteins undergo folding and oligomeric assembly in the endoplasmic reticulum with the aid of a folding mechanism known as the calnexin cycle. UDP–glucose glycoprotein:glucosyltransferase (UGGT) is the sensor component of the calnexin cycle: it recognizes these glycoproteins when they are incompletely folded, and transfers a glucose residue from UDP–glucose to their N‐linked Man9‐GlcNAc2 glycans. To determine how UGGT recognizes incompletely folded glycoproteins, we used purified enzyme to glucosylate a set of Man9‐GlcNAc2 glycopeptide substrates in vitro, and quantified the glucose incorporation into each glycan by mass spectrometry. A ranked order of glycopeptide specificity was found. The preference for amino‐acid residues close to N‐linked glycans provides criteria for the recognition of glycopeptide substrates by UGGT.
UDP–glucose glycoprotein:glucosyltransferase (UGGT) is an enzyme that resides in the endoplasmic reticulum (ER). UGGT recognizes incompletely folded glycoproteins with N‐linked Man9‐GlcNAc2 glycans, and transfers a glucose residue from UDP–glucose to produce the Glc1‐Man9‐GlcNAc2 glycan (for a review, see Parodi, 2000; Tessier et al., 2000). This monoglucosylated glycan is recognized specifically by the lectin‐like chaperones calnexin and calreticulin, which retain incompletely folded glycoproteins in the ER and recruit ERp57 to accelerate disulphide bond interchange for further folding (Zapun et al., 1997, 1998; Schrag et al., 2001).
Several in vitro studies have shown that denatured glycoproteins are the preferred substrates for UGGT (Sousa & Parodi, 1995; Tessier et al., 2000). It has been shown that denatured glycoproteins with a Man9‐GlcNAc2 glycan are glucosylated preferentially over the Man8 or Man7 glycoforms (Sousa et al., 1992), and the innermost GlcNAc residue is also required for recognition by UGGT (Sousa & Parodi, 1995). UGGT has also been shown to monoglucosylate the glycan in the unfolded monomer of an RNase B heterodimer specifically, whereas it leaves the glycan in the folded monomer unmodified (Ritter & Helenius, 2000). This targeted monoglucosylation may direct the chaperones and folding enzymes of the calnexin cycle to the regions of incompletely folded glycoproteins that require further folding.
In contrast to previous reports, we show that, in a system using purified components, UGGT has distinct peptide‐sequence preferences. We used defined Man9‐GlcNAc2 glycopeptides and purified, recombinant rat liver UGGT, and carried out quantitative analysis by mass spectrometry of both the substrates and the products. Across a set of 24 glycopeptide substrates with diverse sequences, the activity of UGGT varied 20‐fold.
Generation of glycopeptide substrates
The DT111 yeast strain (Tessier et al., 2000) was used to express glycoproteins that carry Man9‐GlcNAc2 glycans. The glycan composition was confirmed for acid phosphatase (AcP), exo‐1,3‐β‐D‐glucanase (β‐Glc) and α‐galactosidase (α‐Gal) by matrix‐assisted laser desorption/ionization–time‐of‐flight (MALDI–TOF) mass spectrometry (Fig. 1A–C). However, in the case of AcP and α‐Gal, some of the N‐glycosylation sites are only partially occupied (glycoprotein peaks separated by 1,865 Da, see Fig. 1A,B). Endoglycosidase H (Endo H) digestion confirmed the results obtained by mass spectrometry (Fig. 1D).
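The 1,865 Da spacing between glycoprotein peaks can be checked against the composition of a Man9‐GlcNAc2 glycan. The following is a small arithmetic sketch using standard average residue masses for hexose and N‐acetylhexosamine; the function name is hypothetical and is not part of the original analysis.

```python
# Average residue masses in Da (monosaccharide minus one water),
# as incorporated into a glycan chain.
HEXOSE = 162.1406   # mannose or glucose residue, C6H10O5
HEXNAC = 203.1925   # N-acetylglucosamine residue, C8H13NO5


def glycan_residue_mass(n_hexose, n_hexnac):
    """Mass added to a protein by an N-linked glycan of the given composition."""
    return n_hexose * HEXOSE + n_hexnac * HEXNAC


man9 = glycan_residue_mass(9, 2)        # Man9-GlcNAc2
glc1_man9 = glycan_residue_mass(10, 2)  # Glc1-Man9-GlcNAc2

print(f"Man9-GlcNAc2 adds      {man9:.2f} Da")
print(f"Glc1-Man9-GlcNAc2 adds {glc1_man9:.2f} Da")
print(f"One glucose adds       {glc1_man9 - man9:.2f} Da")
```

Each occupied site therefore adds about 1,865.65 Da, in line with the observed peak spacing, and monoglucosylation adds a further 162.14 Da.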
Purified denatured Man9‐GlcNAc2 glycoproteins from the yeast strain DT111 can be monoglucosylated using UDP–[3H]glucose and UGGT (Fig. 2A, lane 1). Subsequent proteolytic digestion of these glycoproteins with trypsin resulted in the production of monoglucosylated fragments (Fig. 2A, lane 2). Conversely, when the proteins were denatured and digested with trypsin first, the resulting glycopeptides were effective substrates for recombinant UGGT (Fig. 2A, lane 3). These results provided the first evidence for the efficient recognition of glycopeptide substrates by UGGT, and allowed an analysis of the peptide features that are recognized by this enzyme. Mass spectrometry was used to detect and measure UGGT‐catalysed glucose incorporation into glycopeptides (Figs. 2B and 3). The procedure is based on the ratio of the relative ion intensities of the Man9‐GlcNAc2 to Glc1‐Man9‐GlcNAc2 forms of each glycopeptide. Hence, the unglucosylated Man9‐GlcNAc2 form of each glycopeptide provides an internal control, from which the molar incorporation of glucose was calculated.
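The ratio‐based quantitation can be illustrated with a minimal sketch. It assumes that the Man9‐GlcNAc2 and Glc1‐Man9‐GlcNAc2 forms of the same peptide ionize with equal efficiency, so that their ion intensities are proportional to their molar amounts. The function name and the intensities are hypothetical; the exact calculation used for the figures is not given in the text.

```python
def fraction_glucosylated(intensity_man9, intensity_glc1_man9):
    """Molar fraction of a glycopeptide that carries the added glucose.

    Assumes equal ionization efficiency of the two glycoforms, so that the
    unglucosylated form acts as an internal control.
    """
    total = intensity_man9 + intensity_glc1_man9
    if total <= 0:
        raise ValueError("no signal for this glycopeptide")
    return intensity_glc1_man9 / total


# Hypothetical ion intensities (arbitrary units) for one glycopeptide.
print(fraction_glucosylated(75.0, 25.0))  # 0.25
```

Because both forms are measured in the same spectrum, the result does not depend on how much of the glycopeptide was loaded.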
A ranked order of monoglucosylated glycopeptides
A total of 24 glycopeptide substrates from proteolytic digests of AcP, α‐Gal and β‐Glc were identified as substrates for UGGT. These monoglucosylated peptides were ranked on the basis of glucose incorporation (Fig. 4). The glycopeptides were then grouped as good (group 1), intermediate (group 2) and poor (group 3) substrates for UGGT. However, even the most poorly recognized glycopeptide (number 24) showed a detectable level of monoglucosylation. The reproducibility of these data was confirmed by comparing the levels of monoglucosylation between identical peptides from different digestion conditions (compare closed symbols in Fig. 4). Glycopeptides with two glycans were treated as special cases, and are shown in Fig. 5.
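The ranking step can be sketched as a sort on glucose incorporation, with each value also expressed as a fold difference over the poorest substrate. The peptide names and values below are hypothetical placeholders rather than data from Fig. 4, and the group boundaries are not reproduced because the text does not state them.

```python
def rank_substrates(incorporation):
    """Sort glycopeptides by glucose incorporation, best substrate first.

    Returns (name, incorporation, fold_over_poorest) tuples.
    """
    poorest = min(incorporation.values())
    if poorest <= 0:
        raise ValueError("fold differences need a detectable poorest substrate")
    ranked = sorted(incorporation.items(), key=lambda item: item[1], reverse=True)
    return [(name, value, value / poorest) for name, value in ranked]


# Hypothetical percentage incorporation for three glycopeptides.
example = {"peptide_b": 10.0, "peptide_a": 40.0, "peptide_c": 2.0}
for rank, (name, value, fold) in enumerate(rank_substrates(example), start=1):
    print(rank, name, value, fold)
```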
Monoglucosylation of purified glycopeptides
The data shown for each experiment in Fig. 4 correspond to the monoglucosylation of mixtures of proteolytic digests of each glycoprotein with recombinant UGGT. To ensure that the identified ranked order of glucosylation was not a consequence of competition between the different glycopeptides in the mixture, the tryptic glycopeptides T248–270 (from group 1 in Fig. 4) and T386–395 (from group 3 of Fig. 4) from AcP were purified from the AcP digest by high‐performance liquid chromatography (HPLC; Fig. 6B) and used as substrates. A sevenfold difference between the two glycopeptides in the incorporation of glucose was calculated, which was consistent with the ninefold difference observed between the corresponding peptides in the tryptic mixture (Fig. 4; compare peptides 4 and 21).
Previous studies have been unable to show that glycopeptides are efficiently recognized substrates for glucosylation by UGGT (Helenius, 2001). The engineered DT111 yeast strain (Tessier et al., 2000) was used to produce homogeneous Man9‐GlcNAc2‐containing peptides with different amino‐acid sequences, which were recognized by UGGT with a wide range of efficiencies (Fig. 4). Close inspection of the amino‐acid sequences of the glycopeptides revealed two criteria for efficient recognition by UGGT. The first is the presence of a polypeptide extension of ∼12 amino acids from the N‐linked glycan (Fig. 4). The second is the presence of hydrophobic patches, within both 1–3 and 6–14 amino acids of the N‐linked glycan on its carboxy‐ or amino‐terminal side (Fig. 4; shaded regions), as assessed using a number of hydrophobicity plots of the individual glycopeptides (Fig. 6A).
Although there are clear peptide composition and sequence preferences for efficient UGGT substrates, no common structural elements that distinguish good and poor substrates were found using a variety of algorithms to predict secondary structures (Appel et al., 1994).
Our data show that the amino‐acid sequence of glycopeptides profoundly affects their recognition as substrates for UGGT. Residues close to the N‐linked glycan seem to be the main determinants for recognition, and the presence of several proximal glycans does not seem to enhance recognition by UGGT (see Fig. 5, glycopeptide 3).
Although our data predict that the recognition of incompletely folded glycoproteins by UGGT is affected by the sequence surrounding the glycan, we cannot exclude the possibility that these peptide‐recognition elements could be provided by amino‐acid residues brought into proximity of the glycans in folding intermediates. Therefore, for glycoproteins with multiple N‐linked glycans, monoglucosylation of each glycan by UGGT would depend on the proximal peptide elements exposed during the folding process. This differential monoglucosylation could influence the behaviour of these glycoproteins in the calnexin cycle and in associated quality‐control pathways in the ER.
Mutant yeast glycoprotein substrates for recombinant rat UDP–glucose glycoprotein‐glucosyltransferase.
Uniform Man9‐GlcNAc2 glycoprotein substrates were produced using the Saccharomyces cerevisiae mutant strain DT111, in which the mnn1, mns1 and och1 genes were deleted (Tessier et al., 2000).
The medium used for the production of AcP has been described previously (Tessier et al., 2000). The medium used for the production of β‐Glc was prepared as described in Ramirez et al. (1989) with the following modifications: minimal medium SD‐Leu was supplemented with 2% glucose and 0.15 M KCl, and was buffered with 0.16 M citrate/0.08 M phosphate, pH 5.8. α‐Gal was expressed from the DT111 strain transformed with the mel1 plasmid pMP550 (Post‐Beittenmiller et al., 1984). Production of α‐Gal was carried out in minimal medium SD‐LEU‐URA supplemented with 2% galactose, 1% raffinose and 0.15 M KCl.
Purification of glycoproteins.
Secreted glycoproteins expressed in the DT111 strain were prepared and purified on POROS 20HQ, as described previously (Tessier et al., 2000). AcP activity was monitored as described in Tessier et al. (2000). α‐Gal activity was quantified from the hydrolysis of p‐nitrophenyl α‐D‐galactopyranoside (Sigma; used at 1 mg ml−1), which was carried out at 37 °C for 2–5 min in 25 mM sodium acetate, pH 5.0, and followed by measuring the optical density at a wavelength of 405 nm. The reaction was terminated by adding saturated Na2CO3 to one‐half of the volume of the original reaction mixture. β‐Glc was purified further using a 1‐ml Phenyl Superose column (Pharmacia) equilibrated with 1 M ammonium sulphate, buffered with 20 mM sodium acetate, pH 5.0, on a BioCAD FPLC system (PerSeptive Biosystems). β‐Glc was eluted with ∼0.6 M ammonium sulphate. Measurement of β‐Glc activity and termination of the reaction were carried out as described above for α‐Gal. Protein concentrations were determined from the absorbance at a wavelength of 260 nm and the corresponding extinction coefficient.
UDP–glucose glycoprotein‐glucosyltransferase in vitro glucosylation assays.
Recombinant rat liver UGGT was expressed, purified and characterized, as described previously (Tessier et al., 2000). For all monoglucosylation reactions, AcP, β‐Glc and α‐Gal were subjected to chemical denaturation, reduction and alkylation of any free sulphhydryls with iodoacetamide, as described previously (Zapun et al., 1997). Glucosylation reactions were buffered at pH 8 in 50 mM Tris‐HCl and 5 mM CaCl2. Monoglucosylation reactions and proteolytic digestion conditions were as described below and in the figure legends.
Proteolytic and Endo H digestion of glycoprotein substrates.
Sequencing‐grade endoproteinase Glu‐C from Staphylococcus aureus (Roche) was used (mass ratio of 1:40, Glu‐C:protein) in 50 mM ammonium carbonate, pH 7.8, for 18 h at 25 °C. Digestion with Glu‐C was followed by evaporation to dryness and addition of 50 mM Tris‐HCl, pH 8.0, 5 mM CaCl2, with vortexing to redissolve the digested protein. Modified sequencing‐grade trypsin from bovine pancreas (Roche) was used (mass ratio of 1:40, trypsin:protein) in 50 mM Tris‐HCl, pH 8.0, 5 mM CaCl2 for 18 h at 37 °C. Digestion with trypsin was followed by boiling the reaction mixture for 15 min to inactivate the trypsin, and monoglucosylation reactions were then performed in this mixture.
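The expected glycopeptides from such digests can be predicted in silico. The following sketch applies the commonly used simplified rules: trypsin cleaves C‐terminal to Lys or Arg, Glu‐C cleaves C‐terminal to Glu under these buffer conditions, and N‐glycosylation sequons have the form Asn‐X‐Ser/Thr with X not Pro. Blocking cleavage before Pro for both enzymes is a simplifying assumption, and the sequence and function names are hypothetical.

```python
import re

# N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"N(?=[^P][ST])")


def digest(sequence, residues, block_before_proline=True):
    """Cleave C-terminal to each residue in `residues` (complete digestion)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa not in residues:
            continue
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if block_before_proline and following == "P":
            continue
        peptides.append(sequence[start:i + 1])
        start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def sequon_positions(sequence):
    """Zero-based positions of the Asn in each sequon of the intact protein."""
    return [m.start() for m in SEQUON.finditer(sequence)]


def glycopeptides(sequence, residues):
    """Peptides that contain the Asn of at least one sequon."""
    sites = sequon_positions(sequence)
    found, start = [], 0
    for peptide in digest(sequence, residues):
        end = start + len(peptide)
        if any(start <= site < end for site in sites):
            found.append(peptide)
        start = end
    return found


toy = "MKANATWRPLENGSK"          # hypothetical sequence with two sequons
print(glycopeptides(toy, "KR"))  # trypsin-like rule
print(glycopeptides(toy, "E"))   # Glu-C-like rule
```

Sequons are located on the intact sequence before digestion, so a site is still assigned to the right peptide when the cleavage point falls inside the Asn‐X‐Ser/Thr motif.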
Cleavage of the glycan from 2 μg of substrate glycoprotein was carried out using 5 μl of Endo H (Boehringer Mannheim) in 40 μl of 100 mM sodium acetate, pH 5.0, at 37 °C for 4 h.
MALDI–TOF mass spectra were collected on a PerSeptive Biosystems Elite‐STR spectrometer using the linear mode. The matrix, sinapinic acid (Aldrich), was prepared as a 10 mg ml−1 solution in acetonitrile/methanol/water (1:1:1, v/v/v) and used for all MALDI–TOF analyses. The dimeric, singly and doubly protonated ions of bovine serum albumin were used to obtain an external calibration of the mass spectrometer. All electrospray mass spectrometry (ESMS) experiments were conducted using a Q‐TOF II (Micromass) hybrid quadrupole/time‐of‐flight instrument (Morris et al., 1996). A modular capillary liquid chromatography system (Micromass) was used for online micro‐liquid‐chromatography (μLC)‐ESMS experiments. Chromatographic separations were carried out using a 15 cm × 0.32 mm Pepmap C18 capillary column (LC Packings) using a linear gradient elution of 5–95% acetonitrile (0.2% HCOOH) for 40 min with a flow rate of 3.5 μl min−1. Conventional mass spectra were obtained by operating the quadrupole in a radio‐frequency‐only mode, while a pusher electrode was pulsed (16‐kHz frequency) to transfer all ions to the time‐of‐flight analyser. Mass spectra were acquired using stepped orifice voltage scanning similar to that described previously (Carr et al., 1993; Bateman et al., 1998). Tandem mass spectra of glycopeptide peaks identified in preliminary μLC‐ESMS analyses were obtained in a subsequent sample injection for sequence analysis. Glycopeptide precursor ions were selected by the first quadrupole, while a pusher electrode was pulsed to transfer fragment ions formed in the RF‐only hexapole cell to the time‐of‐flight analyser. Mass spectral resolution (50% FWHM definition) was typically 4,000–5,000. Scan durations of 1 s and 2 s were set for conventional and tandem MS acquisition modes, respectively.
Collision activation was carried out using argon as a collision gas, with a 25‐V offset between the direct current voltage of the entrance quadrupole and the RF‐only hexapole cell. Data were acquired and processed in the MassLynx Windows NT‐based data system (Micromass).
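In electrospray spectra, a glycopeptide appears as multiply protonated ions, and the added glucose shifts each charge state by the hexose residue mass divided by the charge. The following sketch uses the standard relation m/z = (M + z × proton mass)/z. The neutral mass is a hypothetical example and the function names are not taken from the instrument software.

```python
PROTON = 1.007276       # mass of a proton, Da
HEXOSE_MONO = 162.05282  # monoisotopic mass of a hexose residue (C6H10O5), Da


def mz(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion of a species with the given neutral mass."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON) / charge


def glucosylation_shift(charge):
    """m/z difference between the Glc1-Man9 and Man9 forms at one charge state."""
    return HEXOSE_MONO / charge


neutral = 4200.0  # hypothetical monoisotopic mass of a Man9-GlcNAc2 glycopeptide
for z in (2, 3, 4):
    print(z, round(mz(neutral, z), 3), round(mz(neutral + HEXOSE_MONO, z), 3))
```

Pairs of peaks separated by this charge‐dependent spacing identify the unglucosylated and monoglucosylated forms of the same glycopeptide.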
Primary sequence analysis of glycopeptides.
The primary sequences of the 14 most highly ranked glycopeptides from Fig. 4 were analysed for a wide variety of characteristics using 42 of the available algorithms found in the ProtScale primary structure analysis tool on the Expasy website (http://ca.expasy.org/cgi‐bin/protscale.pl; Appel et al., 1994). For each algorithm, a window size of five was used with a relative weight of the window edges compared with the window centre of 10%, an exponential weight‐variation model and no normalization of the scale. The numeric output of each algorithm was compared between the top 14 glycopeptides in Fig. 4. The hydropathy plots from four algorithms (Hopp & Woods, 1981; Kyte & Doolittle, 1982; Mohana Rao & Argos, 1986; Cowan & Whittaker, 1990) showed clear differences between the group 1 and group 2 glycopeptides, as shown by the Kyte–Doolittle plots in Fig. 6A. For the glycopeptides ranked 8–10 (Fig. 4), the amino‐acid sequence N‐terminal to the N‐linked glycan was also analysed.
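A sliding‐window hydropathy profile of this kind can be sketched as follows, using the published Kyte–Doolittle scale with a window of five and an edge weight of 10%. Two points are assumptions: the weights here rise linearly from the window edge to the centre, which is swapped in because the exact form of ProtScale's exponential model is not reproduced, and each window score is taken as a weighted mean. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_weights(size=5, edge=0.1):
    """Weights rising linearly from `edge` at the window ends to 1.0 at the centre."""
    if size < 3 or size % 2 == 0:
        raise ValueError("window size must be an odd number of at least 3")
    half = size // 2
    return [edge + (1.0 - edge) * (1.0 - abs(i - half) / half) for i in range(size)]


def hydropathy_profile(sequence, size=5, edge=0.1):
    """Weighted mean hydropathy for every full window along the sequence."""
    weights = window_weights(size, edge)
    total = sum(weights)
    profile = []
    for start in range(len(sequence) - size + 1):
        window = sequence[start:start + size]
        score = sum(w * KYTE_DOOLITTLE[aa] for w, aa in zip(weights, window))
        profile.append(score / total)
    return profile


# Hypothetical peptide; each value is centred on residue index + 2.
print([round(v, 2) for v in hydropathy_profile("ANATWRPLENGSK")])
```

Runs of positive values in such a profile correspond to hydrophobic patches, which can then be compared with the position of the glycosylated Asn.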
The authors thank Micromass for the loan of a Q‐TOF instrument, D. Krajcarski for technical assistance in the preliminary MALDI–TOF experiments and A. Migneault for assistance with the figures. This work was supported by grants from the Canadian Institutes of Health Research to D.Y.T. and J.J.M.B.
Copyright © 2003 Nature Publishing Group