Gaucher disease, the most common lysosomal storage disease, is caused by mutations in the gene that encodes acid‐β‐glucosidase (GlcCerase). Type 1 is characterized by hepatosplenomegaly, and types 2 and 3 by early or chronic onset of severe neurological symptoms. No clear correlation exists between the ∼200 GlcCerase mutations and disease severity, although homozygosity for the common mutations N370S and L444P is associated with non‐neuronopathic and neuronopathic disease, respectively. We report the X‐ray structure of GlcCerase at 2.0 Å resolution. The catalytic domain consists of a (β/α)8 TIM barrel, as expected for a member of the glycoside hydrolase A clan. The distance between the catalytic residues E235 and E340 is consistent with a catalytic mechanism of retention. N370 is located on the longest α‐helix (helix 7), which has several other mutations of residues that point into the TIM barrel. Helix 7 is at the interface between the TIM barrel and a separate immunoglobulin‐like domain on which L444 is located, suggesting an important regulatory or structural role for this non‐catalytic domain. The structure provides the possibility of engineering improved GlcCerase for enzyme‐replacement therapy, and for designing structure‐based drugs aimed at restoring the activity of defective GlcCerase.
Acid‐β‐glucosidase (GlcCerase; otherwise known as D‐glucosyl‐N‐acylsphingosine glucohydrolase; IUBMB enzyme nomenclature number EC 3.2.1.45) is a peripheral membrane protein that hydrolyses the β‐glucosyl linkage of glucosylceramide (GlcCer; Fig. 1) in lysosomes, and requires the coordinated action of saposin C and negatively‐charged lipids for maximal activity (Beutler & Grabowski, 2001; Grabowski et al., 1990).
On the basis of sequence similarity, GlcCerase was classified as a member of glycoside hydrolase family 30, which is a member of the glycoside hydrolase A (GH‐A) clan. Inherited defects in GlcCerase result in lysosomal GlcCer accumulation and, as a consequence, Gaucher disease, the most common lysosomal storage disease (Meikle et al., 1999), which occurs at a frequency of 1 in 40,000 to 1 in 60,000 in the general population, and 1 in 500 to 1 in 1,000 among Ashkenazi Jews (Beutler & Grabowski, 2001; Charrow et al., 2000). Enzyme‐replacement therapy using Cerezyme®, a recombinant human GlcCerase (Grabowski et al., 1995), is the main treatment for type 1 Gaucher disease. Although attempts at structural prediction have been made (Fabrega et al., 2000, 2002), the lack of an experimental three‐dimensional structure of GlcCerase has hampered attempts to establish its catalytic mechanism and analyse the relationship between the mutations, levels of residual enzyme activity and disease severity. We now report the X‐ray structure of GlcCerase at 2.0 Å resolution and discuss how the common mutations may affect enzyme activity.
Results and Discussion
The refined X‐ray structure of GlcCerase at 2.0 Å (R‐factor 19.5%; R‐free 23.0%) contains two GlcCerase molecules per asymmetric unit (Tables 1,2). Its overall fold comprises three domains (Fig. 2). Domain I (residues 1–27 and 383–414) consists of one main three‐stranded, anti‐parallel β‐sheet that is flanked by a perpendicular amino‐terminal strand and a loop. It contains two disulphide bridges (residues 4–16 and 18–23), which may be required for correct folding (Beutler & Grabowski, 2001). Glycosylation, which is essential for catalytic activity in vivo (Berg‐Fussman et al., 1993), is seen in the crystal structure at residue N19. Domain II (residues 30–75 and 431–497) consists of two closely associated β‐sheets that form an independent domain, which resembles an immunoglobulin (Ig) fold (Orengo et al., 1997; Westhead et al., 1999). Domain III (residues 76–381 and 416–430) is a (β/α)8 TIM barrel, which contains the catalytic site, consistent with homology to GH‐A clan members (Fabrega et al., 2002; Henrissat & Bairoch, 1996). It contains three free cysteines (at positions 126, 248 and 342). Domains II and III seem to be connected by a flexible hinge, whereas domain I tightly interacts with domain III.
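The domain boundaries quoted above lend themselves to a simple lookup, which can be convenient when mapping mutated residues onto the fold. The following is a minimal sketch, not part of the original analysis: the names `DOMAIN_RANGES` and `domain_of` are hypothetical, and only the residue ranges are taken from the text. Residues that fall outside the quoted ranges (for example 28 and 29) are reported as unassigned.

```python
from typing import Optional

# Inclusive residue ranges for each domain, as quoted in the text.
DOMAIN_RANGES = {
    "I": [(1, 27), (383, 414)],
    "II": [(30, 75), (431, 497)],
    "III": [(76, 381), (416, 430)],
}


def domain_of(residue: int) -> Optional[str]:
    """Return the domain label for a residue number, or None if unassigned."""
    for domain, ranges in DOMAIN_RANGES.items():
        for start, end in ranges:
            if start <= residue <= end:
                return domain
    return None


# Residues discussed in the text: catalytic glutamates and two common mutation sites.
for name, position in [("E235", 235), ("E340", 340), ("N370", 370), ("L444", 444)]:
    print(name, "-> domain", domain_of(position))
```

With these ranges, E235, E340 and N370 fall in domain III (the TIM barrel) and L444 in domain II (the Ig‐like domain), in agreement with the description in the text.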
Site‐directed mutagenesis and homology modelling of GlcCerase (Fabrega et al., 2000, 2002) suggest that E235 is the acid/base catalyst, and tandem mass spectrometry identified E340 as the nucleophile (Miao et al., 1994). These two residues (Fig. 3A) are located near the carboxyl termini of strands 4 and 7 (Fig. 2B) in domain III, with an average distance between their carboxyl oxygens of 5.2 Å for the two GlcCerase molecules in the structure, consistent with retention of the anomeric carbon upon cleavage, rather than inversion (Davies & Henrissat, 1995). Residues D443 and D445, which are located in the Ig‐like domain (Fig. 2), cannot be directly involved in catalysis, although they seem to be covalently labelled (Dinur et al., 1986) by the irreversible GlcCerase inhibitor, conduritol‐B‐epoxide (Legler, 1977). Substrate docking shows that only the glucose moiety and the adjacent glycoside bond of GlcCer fit within the active‐site pocket (Fig. 3B), suggesting that the two GlcCer hydrocarbon chains either remain embedded in the lipid bilayer during catalysis or interact with saposin C. In addition, an annulus of hydrophobic residues surrounds the entrance to the active site (Fig. 3B) and may facilitate interaction of GlcCerase with the lysosomal membrane or with saposin C (Wilkening et al., 1998).
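A carboxyl‐oxygen separation of the kind quoted for E235 and E340 can be computed directly from atomic coordinates. The sketch below is illustrative only. It assumes that the distance is averaged over all four pairs of carboxyl oxygens (OE1 and OE2 of each glutamate), which is one plausible reading of the text rather than a documented procedure, and the coordinates are placeholders, not values from the deposited structure. The function name `mean_oxygen_distance` is hypothetical.

```python
import math
from itertools import product
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def mean_oxygen_distance(oxygens_a: Sequence[Coord], oxygens_b: Sequence[Coord]) -> float:
    """Mean Euclidean distance (in the units of the input) over all oxygen pairs."""
    pairs = list(product(oxygens_a, oxygens_b))
    if not pairs:
        raise ValueError("both residues need at least one oxygen coordinate")
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


# Placeholder coordinates in angstroms (NOT taken from the crystal structure).
acid_base_oxygens = [(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)]    # OE1, OE2 of one glutamate
nucleophile_oxygens = [(1.0, 5.0, 0.0), (3.0, 5.5, 0.0)]  # OE1, OE2 of the other

distance = mean_oxygen_distance(acid_base_oxygens, nucleophile_oxygens)
print(f"mean carboxyl O-O distance: {distance:.2f} A")
```

For a crystal with two molecules per asymmetric unit, the same function could be applied to each molecule and the two results averaged.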
Of the ∼200 known GlcCerase mutations (Fig. 4), many are rare and restricted to a few individuals. Most mutations either partially or completely abolish catalytic activity (Meivar‐Levy et al., 1994) or are thought to reduce GlcCerase stability (Grace et al., 1994). The most common mutation, N370S, accounts for 70% of mutant alleles in Ashkenazi Jews and 25% in non‐Jewish patients (Table 3). N370S causes predisposition to type‐1 disease and precludes neurological involvement, suggesting that it causes relatively minor changes in GlcCerase structure and, therefore, catalytic activity. Consistent with this is the localization of N370 to the longest α‐helix (helix 7) in GlcCerase, which is located at the interface of domains II and III, but too far from the active site to participate directly in catalysis. Interestingly, several other mutations are found in this helix, all of which seem to point into the TIM barrel (Fig. 5).
Seven aromatic side chains (F128, W179, Y244, F246, Y313, W381 and F397) line one side of the active‐site pocket, and may be involved in substrate recognition, as in other β‐glycosidases (Chi et al., 1999; Henrissat & Bairoch, 1993). The common mutation V394L (Table 3) might perturb this lining, as the bulkier leucine side‐chain could cause a conformational change in two residues of the lining, Y244 and F246. Several other mutations (H311R, A341T and C342G; Fig. 4) occur near the active site and may directly affect catalytic activity. By contrast, two relatively common mutations (Table 3), R463C and R496H, which cause predisposition to mild disease (Beutler & Grabowski, 2001), are located in the Ig‐like domain, at a considerable distance from the active site (Fig. 2A). L444, which is mutated relatively frequently to proline or arginine and invariably causes predisposition to severe neuronopathic disease (Beutler & Grabowski, 2001; Erikson et al., 1997), is located in the hydrophobic core of the Ig‐like domain (Fig. 2). Either of the two L444 mutations might cause a local conformational change by disrupting the hydrophobic core, resulting in altered folding of this domain (Morel et al., 1999). This is consistent with the assumption that these mutations produce unstable proteins (Grace et al., 1994). This suggests an important regulatory or structural function for domain II, perhaps in interacting with saposin C and/or acidic phospholipids. Interestingly, β‐hexosaminidase and other family‐20 glycosidases have a similar non‐catalytic domain, the function of which is unknown (Mark et al., 2001). The structure of saposin C has recently been determined by nuclear magnetic resonance (NMR) spectroscopy (Protein Data Bank (PDB) ID code 1M12), but its coordinates have not yet been released to the public. 
However, the structure of its homologue, saposin B (Ahn et al., 2003), shows that the putative active form is a dimer in which a large hydrophobic cavity sequesters the acyl chains of cerebroside sulphate, and may serve to present it appropriately for hydrolysis by arylsulphatase A. We cannot yet determine whether such a mechanism would explain the role of saposin C as an activator of GlcCerase, as the limited sequence homology (<14%) between saposins B and C does not allow accurate modelling of the latter. However, the Ig‐like domain of GlcCerase may regulate the interaction of GlcCerase with either the lipid bilayer, saposin C or both. Finally, there are no known viable mutations in residues 14–20 of domain I and in the connecting strand (residues 1–10) and loop (residues 21–27), with the exception of the conserved mutation V15L. However, there are seven known mutations in the C‐terminal strand of this domain (residues 401–414), including the common severe mutation D409H, which results in unstable protein (Beutler & Grabowski, 2001; Table 3). This suggests that domain I also has an important regulatory or structural role.
In summary, the GlcCerase structure will allow detailed and systematic analysis of the relationship between disease severity and perturbations in enzyme structure for each of the mutations (Fig. 4). It will also allow the structure‐based design of small molecules that may interact with misfolded GlcCerase and stabilize the structure of some common mutations, such as N370S. The feasibility of the latter approach has recently been shown by use of a chemical chaperone to enhance GlcCerase activity in cultured cells and in in vitro assays (Sawkar et al., 2002). Such an approach, together with the mechanistic information that can now be deduced from the GlcCerase structure, paves the way for new and improved therapeutic approaches for treating Gaucher disease.
Deglycosylation, crystallization and data collection.
Cerezyme® (5 mg) was dialysed overnight against PBS (pH 7.0) and deglycosylated using N‐glycosidase F (150 units; for 88 h at 25 °C). Deglycosylation was monitored by determining the reduction in molecular mass by SDS–polyacrylamide gel electrophoresis, and mass spectrometry showed the removal of 7–14 sugar residues. Deglycosylated Cerezyme® was concentrated to 10 mg ml−1 in 1 mM MES, pH 6.6, 0.1 M NaCl, 0.02% NaN3, using a Centricon YM‐10 centrifugal filter device, with a relative molecular mass cut‐off of ∼10 kDa. Crystals were obtained in hanging drops at 19 °C. The drops contained 1.5 μl Cerezyme® and 1.5 μl mother liquor (1 M (NH4)2SO4, 0.17 M guanidine HCl, 0.02 M KCl, 0.1 M acetate, pH 4.6). Crystals were cryoprotected with a gradient of 5–25% glycerol. A heavy‐atom derivative was obtained by soaking for three days in KHgI2 liquid (diluted 1:125,000 in mother liquor). X‐ray data were collected at 100 K at three wavelengths around the Hg LIII absorption edge on beamline ID14‐4, and a native data set on beamline BM14 at the European Synchrotron Radiation Facility (ESRF). Cerezyme® crystallized in a C222₁ spacegroup with two molecules in the asymmetric unit. Data were processed with MOSFLM/SCALA (Leslie, 1992) and Denzo/Scalepack (Otwinowski & Minor, 1997). See Table 1 for data collection statistics. The spacegroup and cell dimensions are similar to those recently reported for crystals of intact Cerezyme®, which diffracted, however, to significantly lower resolution (Roeber et al., 2003).
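Because each drop was set up from equal volumes of protein solution and mother liquor, the starting composition of the drop follows from simple dilution. The sketch below works this through under the assumption of ideal mixing at the moment the drop is set up, before any vapour equilibration. The function name `concentration_after_mixing` is hypothetical, and only the stock concentrations and volumes are taken from the text.

```python
def concentration_after_mixing(stock_conc: float, stock_volume_ul: float,
                               total_volume_ul: float) -> float:
    """Concentration of one component after ideal mixing (same units as stock_conc)."""
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume_ul / total_volume_ul


PROTEIN_UL = 1.5        # volume of protein solution in the drop
MOTHER_LIQUOR_UL = 1.5  # volume of mother liquor in the drop
TOTAL_UL = PROTEIN_UL + MOTHER_LIQUOR_UL

# (stock concentration, volume of the solution that carries the component)
components = {
    "protein (mg/ml)": (10.0, PROTEIN_UL),
    "NaCl (M)": (0.1, PROTEIN_UL),
    "ammonium sulphate (M)": (1.0, MOTHER_LIQUOR_UL),
    "guanidine HCl (M)": (0.17, MOTHER_LIQUOR_UL),
    "KCl (M)": (0.02, MOTHER_LIQUOR_UL),
    "acetate (M)": (0.1, MOTHER_LIQUOR_UL),
}

for name, (stock, volume) in components.items():
    print(f"{name}: {concentration_after_mixing(stock, volume, TOTAL_UL):.3f}")
```

With a 1:1 drop, every component starts at half its stock concentration. The drop then concentrates as it equilibrates against the reservoir, which this sketch does not model.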
Structure determination and refinement.
Three Hg sites were located on the basis of their anomalous difference using SHELXD (Uson & Sheldrick, 1999). The Hg sites were refined and experimental phases to 2.3 Å were calculated from the multi‐wavelength anomalous diffraction (MAD) data using SHARP (Fortelle & Bricogne, 1997), resulting in an overall figure of merit (FOM) of 0.403. Phases were improved by applying solvent‐flipping density modification with SOLOMON (Abrahams & Leslie, 1996), resulting in an overall FOM of 0.851. An automated tracing procedure in ARP/wARP (Perrakis et al., 1999), using native amplitudes to 2.0 Å, coupled to the experimental phases, resulted in tracing of ∼95% of the two polypeptide chains. The SIGMAA map shows all 497 residues in both molecules. Final tracing was performed manually in the program O (Jones et al., 1991). Refinement of the two molecules was performed in REFMAC (Murshudov et al., 1999) and CNS (Brunger et al., 1998) at 2.0 Å, with an overall RMSD of 0.29 Å for Cα atoms between the two molecules. The maps show a single glycosylation site at N19, with one N‐acetylglucosamine (NAG) on one molecule and two on the other. Nine‐hundred and twenty‐eight water molecules, 15 sulphate ions and 3 NAG molecules were assigned. See Table 2 for refinement and model statistics. Coordinates and structure factors for native GlcCerase were deposited in PDB (accession code 1OGS).
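The Cα RMSD quoted for the two molecules in the asymmetric unit is a root‐mean‐square deviation over matched atom pairs. The sketch below shows that calculation in its simplest form. It assumes that the two coordinate sets have already been superposed and list equivalent Cα atoms in the same order; the superposition step itself is not shown, and the text does not state which program was used for it. The function name `rmsd` and the example coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two equally long, pre-superposed coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared_sum / len(coords_a))


# Placeholder C-alpha positions in angstroms (NOT taken from the crystal structure).
molecule_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
molecule_b = [(0.1, 0.0, 0.0), (3.8, 0.3, 0.0), (7.6, 0.0, 0.2)]

print(f"C-alpha RMSD: {rmsd(molecule_a, molecule_b):.2f} A")
```

Without prior superposition the value also reflects the rigid‐body offset between the two molecules, so it would not be comparable to the figure reported in the text.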
We thank Genzyme Israel, Ltd, for generously supplying Cerezyme®, the staff at beamlines ID14‐4 and BM14 at the ESRF, and O. Yifrach for help with data collection. This work was supported by the Yeda fund of the Weizmann Institute, the Kimmelman Center for Biomolecular Structure and Assembly, and the Benoziyo Center for Neurosciences. I.S. is the Bernstein–Mason Professor of Neurochemistry, J.L.S. is the Morton and Gladys Pickman Professor of Structural Biology and A.H.F. is the Joseph Meyerhoff Professor of Biochemistry.
Copyright © 2003 European Molecular Biology Organisation