A gene's level of expression is often equated simply with how efficiently it is transcribed. However, other factors can also have dramatic effects on gene expression levels. Gehring et al. (2001) have recently studied a mutant version of the human prothrombin gene that over‐expresses its protein product through activation of its poly(A) signal. The resulting excess of prothrombin apparently perturbs the delicate balance of the clotting cascade and may cause serious problems of thrombosis in affected individuals.
Producing a mature, translatable messenger (m)RNA from a mammalian gene requires an amazingly complex series of molecular tricks. First, the gene must be exposed from its hidden state in chromatin (Workman and Kingston, 1998). Then, RNA polymerase II (Pol II) must be seduced by multiple activation domains of transcription factors bound and lined up on the gene's promoter or enhancers to land on the transcription start site (Blackwood and Kadonaga, 1998). This of course occurs in conjunction with numerous other proteins (general transcription factors), so that the final initiation complex is in the MDa size range (Orphanides et al., 1996). The effectiveness of any one of these transcription initiation steps can directly determine not only the specificity, but also the level of gene expression. Complex rearrangements then occur to allow the polymerase to escape from the promoter and begin elongating through the gene (McKnight, 1996). However, with potential gene sizes of over a million nucleotides, the elongation process can be a marathon experience both in time (some 16 h) and in terms of RNA processing. Extensive intron splicing, both constitutive and alternative, occurs co‐transcriptionally (Lopez, 1998; Hirose and Manley, 2000), playing an essential role in generating protein diversity (Graveley, 2001) as well as assuring that the message produced is the exact sequence required to make the correct protein product (Hentze and Kulozik, 1999). Although splicing efficiency does not appear to have a major quantitative effect on gene expression, some splicing needs to take place to allow efficient nuclear export of mRNA (Zenklusen and Stutz, 2001). Transcriptional termination eventually stops the polymerase juggernaut through a process that is triggered by recognition of poly(A) signals in the nascent transcript (Proudfoot, 2000).
However, poly(A) signals have another critical role: defining the site at which the nascent RNA is cleaved prior to the addition of the poly(A) tail. Only polyadenylated mRNA is exported from the nucleus to the cytoplasm, so that failure to add the poly(A) tail means that less mRNA will escape the highly active nuclear degradation processes (Bousquet‐Antonelli et al., 2000). Therefore, the efficiency of polyadenylation can have significant quantitative effects on gene expression. Furthermore, polyadenylation also affects the efficiency of translation in the cytoplasm. The poly(A) binding protein (PABPI), which coats the poly(A) tail, helps to recruit ribosomes to the mRNA cap site (at the other end of the mRNA). This process occurs through mRNA circularization mediated by PABPI interaction with cap binding proteins (Sachs et al., 1997).
Poly(A) signals have a quite precise arrangement (Zhao et al., 1999). The sequence AAUAAA is the most obvious and essential element. It is placed ∼20 nt upstream of the actual site of cleavage, which is usually an A residue. A GU‐ or U‐rich sequence then immediately follows on from the cleavage site, and its extent is often the main determinant of how efficiently this RNA processing signal will work. As shown in Figure 1, specific proteins bind to these two elements; cleavage polyadenylation specificity factor (CPSF) interacts with AAUAAA through its largest (160 kDa) subunit, and cleavage stimulation factor (CstF) interacts with the GU/U element through its 64 kDa subunit. CPSF also interacts with poly(A) polymerase (PAP), thereby recruiting it to the complex in preparation for polyadenylation once cleavage has taken place. Interactions between CPSF and CstF enhance their binding to the RNA signals and also somehow position two further cleavage factors (CFI and CFII) over the actual site of cleavage (Colgan and Manley, 1997; Wahle and Rüegsegger, 1999). This site must be precisely placed within the now very large protein complex to allow cleavage at the poly(A) site, by an as yet unidentified endonuclease. A further twist to the mechanism of factor assembly on the poly(A) signal and subsequent cleavage at the poly(A) site is that this whole process occurs in association with the Pol II elongation complex. In particular, the C‐terminal domain (CTD) of the Pol II large subunit, a seven amino acid sequence repeated 52 times, strongly activates cleavage and polyadenylation (Hirose and Manley, 1998). This effect of the Pol II CTD effectively couples polyadenylation to transcription as discussed in recent reviews (Hirose and Manley, 2000; Proudfoot, 2000). 
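The arrangement of sequence elements described above can be illustrated with a short scanning routine. The following Python sketch is only a toy illustration and not a validated poly(A) site predictor: it looks for AAUAAA, then for an A residue a short distance downstream, and finally checks whether the sequence following that A is rich in G and U. The function name, the spacing range (10–30 nt), the downstream window length (20 nt) and the G/U threshold (0.7) are all assumptions chosen for illustration, not values taken from the literature cited here.

```python
import re
from typing import List, NamedTuple


class PolyASite(NamedTuple):
    hexamer_start: int    # 0-based index of the first A of AAUAAA
    cleavage_pos: int     # 0-based index of the candidate cleavage nucleotide
    gu_fraction: float    # fraction of G/U in the window after the cleavage site


def find_candidate_poly_a_sites(
    rna: str,
    min_gap: int = 10,             # assumed minimum spacing after the hexamer
    max_gap: int = 30,             # assumed maximum spacing after the hexamer
    window: int = 20,              # assumed length of the downstream window
    min_gu_fraction: float = 0.7,  # assumed threshold for "GU- or U-rich"
) -> List[PolyASite]:
    """Return candidate sites: AAUAAA, then an A residue, then a GU/U-rich window."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    # The lookahead lets overlapping AAUAAA occurrences be reported too.
    for match in re.finditer(r"(?=AAUAAA)", rna):
        hexamer_end = match.start() + 6
        first = hexamer_end + min_gap
        last = min(hexamer_end + max_gap, len(rna) - 1)
        for pos in range(first, last + 1):
            if rna[pos] != "A":      # cleavage usually occurs at an A residue
                continue
            downstream = rna[pos + 1 : pos + 1 + window]
            if len(downstream) < window:
                continue             # not enough sequence left to assess
            gu = sum(base in "GU" for base in downstream) / window
            if gu >= min_gu_fraction:
                sites.append(PolyASite(match.start(), pos, gu))
    return sites


# Hypothetical test sequence built to contain exactly one candidate site.
example = "CCCC" + "AAUAAA" + "C" * 19 + "A" + "GUGUUGUUUGUGUUUGUGUU" + "CCCC"
candidates = find_candidate_poly_a_sites(example)
```

In this made-up sequence the single candidate has its cleavage nucleotide 19 nt beyond the end of the hexamer. Real poly(A) signals show considerable variation in both elements, so a scan of this kind can at best flag regions worth closer inspection.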
Once cleavage has occurred, poly(A) polymerase, already recruited to the protein complex, adds the poly(A) tail and is aided by PABPII, which stimulates poly(A) synthesis of up to 200 nt by successive binding to the growing poly(A) tail (Wahle, 1991). Other sequence features around the poly(A) signal have a modest effect on how well this final cleavage–polyadenylation complex binds to the poly(A) signal and in this way also influence how much polyadenylation actually occurs. In particular, the nucleotide to which the poly(A) tail is added following endonucleolytic cleavage (normally an A) has a small but significant effect on efficiency (Chen et al., 1995).
It is this exact nucleotide to which the poly(A) tail is added that has been highlighted quite unexpectedly in a commonly occurring mutation of the prothrombin gene in a recent publication by Gehring et al. (2001). This mutation (present in 1–2% of the human population) consists of a G→A substitution that makes the normally less than perfect poly(A) signal slightly more efficient (Figure 1). Furthermore, the presence of this nucleotide substitution correlates with an increased risk of thrombosis (Lane and Grant, 2000). Patients with this mutation appear to have elevated levels of prothrombin in their plasma, and this is likely to be the cause of their increased risk of thrombosis. To prove the point, Gehring et al. (2001) describe the isolation of the prothrombin poly(A) signal and its immediate sequence environment for both the wild type (G version) and mutant (A version). These two poly(A) signals were then used to replace the 3′ untranslated sequence of the human β‐globin gene and the hybrid genes so designed were transfected into tissue culture cells to measure gene expression. Sure enough, almost twice as much message is made by the A version as by the G version construct, which in turn results in twice as much protein expression. Further work on the A and G hybrid globin/prothrombin gene constructs revealed that cleavage at the poly(A) site is twice as efficient but that once cleavage has occurred, polyadenylation is apparently unchanged. What these data graphically illustrate is that the strength of a poly(A) signal can directly influence the level of gene expression. This fact had already been documented (e.g. Gil and Proudfoot, 1987), but the occurrence of a common genetic disorder associated with poly(A) signal efficiency underlines and adds a new dimension to this observation.
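The quantitative reasoning in this result can be made explicit with a deliberately simple model. The Python sketch below assumes, purely for illustration, that mRNA output is proportional to the product of the transcription rate, the cleavage efficiency and the polyadenylation efficiency. The function name and the efficiency values (0.25 and 0.5) are hypothetical and are not measurements from Gehring et al. (2001); they are chosen only so that cleavage is twice as efficient for the A version.

```python
def relative_mrna_output(
    cleavage_efficiency: float,
    polyadenylation_efficiency: float = 1.0,
    transcription_rate: float = 1.0,
) -> float:
    """Toy model: mRNA output as the product of independent step efficiencies."""
    return transcription_rate * cleavage_efficiency * polyadenylation_efficiency


# Hypothetical efficiencies: only the cleavage step differs between the versions.
g_version = relative_mrna_output(cleavage_efficiency=0.25)  # wild type (G)
a_version = relative_mrna_output(cleavage_efficiency=0.5)   # mutant (A)

fold_change = a_version / g_version
print(f"A version / G version mRNA output: {fold_change:.1f}-fold")
```

Under these assumptions, doubling the cleavage efficiency alone doubles mRNA output, in line with the roughly two-fold difference in message reported for the hybrid constructs. The model ignores mRNA stability, export and translation, any of which could alter the relationship in a real cell.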
Copyright © 2001 European Molecular Biology Organization