We analysed the genome‐wide regulatory properties of an artificial transcription activator in which the DNA‐binding domain of the yeast transcription factor, Pdr1, was fused to the activation domain of Gal4 (Pdr1*GAD). This Pdr1*GAD chimera was put under the control of the inducible GAL1 promoter. DNA microarray analyses showed that all the target genes upregulated by the well‐studied native gain‐of‐function Pdr1‐3 mutant were similarly activated by the chimerical factor Pdr1*GAD upon galactose induction. Additionally, this kinetic approach led us not only to confirm previously published targets, but also to define a hierarchy among members of the Pdr1 regulon. Our observations prove, for the first time at the complete genome level, that the DNA‐binding domain of Pdr1 is sufficient to guide its specificity. We propose that this approach could be useful for the study of new transcription factors identified in silico from sequenced organisms. Complete data are available at www.biologie.ens.fr/yeast‐publi.html.
The ability to specifically manipulate the expression of endogenous genes or classes of genes would have wide‐ranging applications in medicine and in experimental and applied biology. Efficient control of genome expression could be achieved through the manipulation of transcription factors and through an accurate understanding of the gene regulation that governs it (Holstege and Young, 1999). In that respect, transcription factor properties combined with DNA microarray technology can be regarded as valuable tools to decipher the intricacy of gene regulation networks. The aim of this work was to assess the genome‐wide regulatory properties of a specific transcription factor DNA‐binding domain, in order to design a general strategy to reveal the target genes of new yeast transcription factors in the absence of any other functional information. For that purpose, the yeast transcription factor Pdr1 was taken as a model system. Pdr1 controls the transcription of genes whose products are mainly involved in membrane composition, such as the ABC transporter Pdr5 (Balzi, 1994), and it plays a fundamental role in the pleiotropic drug resistance phenomenon. A recent genome microarray analysis of a constitutively expressed PDR1 gain‐of‐function mutant (PDR1‐3) has shown that Pdr1 can upregulate 26 genes and downregulate 22 genes (DeRisi et al., 2000). However, the interpretation of such data can be obscured by the variety of secondary effects that might be caused by constitutive expression of a gain‐of‐function allele (DeRisi et al., 2000). Moreover, this kind of approach can only be generalized to transcription factors with known native activated alleles, whereas in many cases only structural data based on the sequence are available. With these problems in mind, we designed an experiment in which an artificially activated form of Pdr1 was progressively produced under the control of the GAL1‐10 promoter.
Pdr1 has a modular structure typical of the 56 Gal4 family members (Johnston et al., 1986; Balzi et al., 1987): a short N‐terminal Zn(II)2Cys6 binuclear cluster DNA‐binding domain (100 amino acids) is linked to a C‐terminal activation domain (113 amino acids) via a rather long internal region (668 amino acids) which contains multiple inhibitory domains (Stone and Sadowski, 1993). Recently, a Gal4 version with a large internal deletion was shown to possess all the known properties of the full‐length activated Gal4 protein (Ding and Johnston, 1997). The rationale of our approach was to construct an artificially active transcription factor containing only the DNA‐binding region from Pdr1, fused to the activation domain of Gal4. We tested whether this chimerical construct could create a Pdr1‐specific transcription activator. To this end, yeast microarray analysis was used to assess the quantitative differences induced in the yeast transcriptome by a time course controlled expression of this chimerical transcription factor. Our results are consistent with previous studies of Pdr1 targets (DeRisi et al., 2000) and they significantly improve our understanding of the Pdr1 regulatory networks. They demonstrate that the Pdr1 DNA‐binding domain alone is sufficient to confer the target gene specificity of the native transcription factor. We thus suggest that this strategy could be of general value to determine the function of unknown transcription factors identified by genome sequencing.
We reasoned that hybrid transcriptional activators, composed of a variable DNA‐binding domain fused to a well‐known activation domain like the Gal4‐activation domain (GAD), could generate a transcription activator reflecting the specificity of the DNA‐binding domain. We therefore fused the DNA‐binding domain of Pdr1 (positions 1–207) to the Gal4‐activation domain (positions 768–881) (Figure 1). This short chimerical protein, called Pdr1*GAD, also contains the SV40 nuclear targeting signal, three repeats of an HA epitope and the putative dimerization domains of both Pdr1 and Gal4, as these factors are thought to act as homodimers. To control the production of Pdr1*GAD, the chimerical gene was placed under the control of the GAL1‐10 promoter in a centromeric plasmid (Figure 1A). Control experiments were conducted with the same plasmid expressing only GAD. After galactose induction, the time course production of the Pdr1*GAD protein was followed by gel electrophoresis analysis (Figure 1B).
Physiological properties of the chimerical activator Pdr1*GAD
We first verified that the production of Pdr1*GAD conferred the same high level of drug resistance as the native gain‐of‐function mutant, Pdr1‐3 (Carvajal et al., 1997), in cells lacking both PDR1 and its functionally redundant homologue PDR3 (Figure 1C). This result suggests that the physiological activity of the chimera is similar to that of the Pdr1‐3 gain‐of‐function mutant.
Global expression profile analysis of yeast cells expressing the artificial Pdr1*GAD transcriptional activator
We used yeast whole‐genome DNA microarrays to compare the transcriptional properties of Pdr1‐3 and Pdr1*GAD. Expression profiles were generated in Saccharomyces cerevisiae lacking PDR1 and PDR3. We first carried out five independent DNA microarray experiments comparing the presence or absence of the PDR1‐3 gain‐of‐function allele. The results are in general agreement with similar previous experiments (DeRisi et al., 2000). However, repeat experiments with different sets of DNA microarrays allowed us to obtain more accurate results. A few genes (COS10, YGP1, HXK1, MET17, YGR243w, YGR212w, YNR067c and YGL028c) that were previously suggested as being upregulated in the presence of PDR1‐3 did not exhibit any significant expression changes in these new experiments. These transcripts have been shown to fluctuate more than others in 63× repeats of a wild‐type transcriptome (Hughes et al., 2000), and they are probably false‐positive candidates. On the other hand, a new set of genes was significantly upregulated in the present study. These are FSP2, PDR10, RPN4, YAL061w, YCR061w, YIL172c, YJL216c and YMR102c. PDR1‐3‐dependent upregulation of some of these genes was checked by northern blot analyses (data on web site). Moreover, the fact that all these genes have at least one pleiotropic drug responsive element [PDRE, the binding site of Pdr1 as defined in DeRisi et al. (2000)] in their promoter gives credence to these new findings. These experiments allowed us to define a new pattern of Pdr1‐3 activated targets, which was used as a reference to assess the validity of the PDR1*GAD approach.
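As an illustration of the promoter check mentioned above, the following minimal Python sketch scans a promoter sequence for a motif on both strands. It is not the procedure used in this study. The default motif `TCCGCGGA` is an assumption (a commonly cited PDRE core; the exact definition used here is the one in DeRisi et al., 2000, which is not reproduced in this text), and the function names and toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_motif(promoter, motif="TCCGCGGA"):
    """Return sorted 0-based start positions where the motif occurs on
    either strand of the promoter.

    Assumption: the default motif is an illustrative PDRE core, not the
    definition used by the authors.
    """
    promoter = promoter.upper()
    motif = motif.upper()
    # A hit on the reverse strand appears as the reverse complement
    # when reading the forward strand.
    targets = {motif, reverse_complement(motif)}
    k = len(motif)
    return [i for i in range(len(promoter) - k + 1)
            if promoter[i:i + k] in targets]


# Toy sequences (invented for illustration, not real yeast promoters).
toy_promoters = {
    "geneA": "ATATTCCGCGGAATAT",
    "geneB": "ATATATATATAT",
}
pdre_hits = {gene: find_motif(seq) for gene, seq in toy_promoters.items()}
```

Because the default motif is palindromic, each site is reported once even though both strands are scanned. A degenerate or different motif can be passed through the `motif` argument.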
Microarray analyses of the time‐controlled expression of Pdr1*GAD protein, compared with a similar time course production of GAD alone, reveal that the longest galactose induction time (18 h) actually mimics the constitutive production of the Pdr1‐3 gain‐of‐function transcription factor. A cluster analysis of the upregulated genes revealed by the time course experiments is presented in Figure 2. Simple visual comparisons of data from Pdr1‐3 (Figure 3A, left) and Pdr1*GAD (Figure 3A, right) experiments reveal the strong similarity between the transcriptional activity of the physiological gain‐of‐function mutant and the artificial transcription factor. The large majority of upregulated genes are common to both experiments. All but one of these genes exhibit a PDRE in their promoter, and are thus strong candidates for a direct activation by the products of both PDR1 alleles. Four genes contain a PDRE and seem to be specific for PDR1*GAD activation; two of them, HXT9 and HXT11 (Nourani et al., 1997), are published targets of PDR1 and the two others are their closest homologues: HXT8 and HXT12 (98% identity). We believe that this is an advantage of kinetic studies: they reveal genes that are not significantly activated in a single‐condition experiment.
A more global comparison of Pdr1*GAD versus Pdr1‐3 expression made by scatter plot did not reveal any striking differences between the two types of experiments (see Supplementary data).
Time course expression of the chimerical transcription factor Pdr1*GAD reveals waves of activated genes
Systematic studies of target genes associated with a transcription factor are often limited by our ability to distinguish between primary and secondary effects. We reasoned that a time course production of the transcription factor should allow us to distinguish stochastic fluctuations from biologically relevant variations. In particular, the early activated genes are more likely to correspond to direct targets of this transcription factor. These early Pdr1 induced genes appear in the cluster analysis of Figure 2. We analysed these data by principal components analysis (PCA) (Figure 4). This representation discriminates the specific transcriptional activations (Figure 4, right) or repressions (Figure 4, left) by Pdr1‐3 and Pdr1*GAD from the majority of genes whose expression remains unchanged. It shows a gradient of activated genes. However, we could not find a clear relationship between the number or the location of the PDRE in the promoters and the kinetics of activation. Most of the genes that are repressed by a constitutive expression of Pdr1‐3 are similarly downregulated by the chimera (see enhanced Supplementary Figure). Interestingly, the repressed genes appear only at a later stage of induction of Pdr1*GAD, suggesting that this repression is a secondary effect of Pdr1*GAD overproduction. This is in agreement with the absence of PDRE in their promoter sequences. We cannot exclude the possibility that Pdr1 has some repression effects, independent of its DNA‐binding domain, but this point cannot be assessed using the Pdr1*GAD approach.
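The idea that early activated genes are the more likely direct targets can be made concrete by ordering genes according to the first time point at which their induction crosses a threshold. The sketch below is only an illustration of that reasoning, not the analysis performed here (which relied on clustering and PCA). The two‐fold threshold, the time points, the gene labels and the ratio values are all hypothetical.

```python
def onset_time(times, ratios, threshold=2.0):
    """Return the first time at which the expression ratio reaches the
    threshold, or None if it never does.

    Assumption: a two-fold cut-off is used purely for illustration.
    """
    for t, r in zip(times, ratios):
        if r >= threshold:
            return t
    return None


def rank_by_onset(times, profiles, threshold=2.0):
    """Return the induced genes ordered from earliest to latest onset.
    Genes that never reach the threshold are left out."""
    induced = []
    for gene, ratios in profiles.items():
        t = onset_time(times, ratios, threshold)
        if t is not None:
            induced.append((t, gene))
    # Sort by onset time; ties are broken alphabetically by gene label.
    return [gene for _, gene in sorted(induced)]


# Toy time course in hours after galactose induction (invented values).
toy_times = [0, 2, 4, 18]
toy_profiles = {
    "early_gene": [1.0, 2.5, 4.0, 6.0],
    "late_gene": [1.0, 1.1, 1.3, 3.0],
    "flat_gene": [1.0, 0.9, 1.1, 1.0],
}
ranking = rank_by_onset(toy_times, toy_profiles)
```

A threshold‐based onset is sensitive to noise at single time points, which is one reason why replicate experiments and multivariate methods such as PCA are preferable for real data.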
New insights in the PDR1 regulon
Time course expression of the chimerical Pdr1*GAD protein and repetition of microarray experiments with the Pdr1‐3 gain‐of‐function mutant allowed us to apply statistical methods to analyse the data and to define a new set of Pdr1 upregulated genes. In previous array experiments (DeRisi et al., 2000) the 26 genes that were shown to have increased expression in the Pdr1‐3 gain‐of‐function mutant were divided into five functional groups: ABC transporters, major facilitator superfamily (MFS) and other permeases, lipid metabolism, cell wall metabolism and stress response. In our experiments, none of the genes of the stress response group apart from one (GRE2) were significantly upregulated. Repetitions of experiments with the PDR1‐3 mutant clearly showed that these stress group genes are not directly upregulated by Pdr1. This is supported by the fact that none of these genes have a PDRE in their promoter sequence. On the other hand, new Pdr1 target genes have been discovered. An interesting case is that of SON1/RPN4. This gene has been described as coding for a transcription regulator for genes encoding subunits of the proteasome (Mannhaupt et al., 1999). One might suspect that RPN4 was activated as a rather trivial response to Pdr1*GAD or Pdr1‐3 overproduction. This is unlikely since we never observed such a signal with other transcription factor*GAD fusions (data not shown). Moreover, two PDRE elements are located in the RPN4 promoter and the connection recently established between RPN4 and the yeast response to alkylating agents strengthens the idea that the proteasome function and the pleiotropic drug resistance phenomenon could be linked (Jelinsky et al., 2000).
A general strategy to decipher the genomic signature of a transcription factor
It is known that a given transcriptional regulator may activate transcription of one gene, repress transcription of another and bind but exert no regulatory effect on a third (Lefstin and Yamamoto, 1998). This transcriptional flexibility is probably in part controlled by the central tethering region of factors like the Gal4 family members (Carvajal et al., 1997), and also by functional interactions between DNA‐binding domains and activation domains as described for the USF2 transcription factor (Luo and Sawadogo, 1996). In such a context, the present domain‐swap experiment in which the DNA‐binding and dimerization domain of Pdr1 is directly linked to the Gal4 activation domain may seem naïve. We demonstrate, however, that this chimerical transcriptional activator Pdr1*GAD generates a transcriptome profile very similar to that of the ‘natural’ Pdr1‐3 gain‐of‐function mutant. These two forms of Pdr1, which share only the Pdr1 DNA‐binding domain, upregulate the same set of 23 PDRE‐containing genes (Figure 3A). This implies that the Pdr1 DNA‐binding domain faithfully discriminates these 23 genes among the 218 PDRE‐containing genes of the yeast genome. So far, we have not found clear differences between the consensus sequence defined from these 23 regulated genes (Figure 3B) and the PDRE found in the promoters of the non‐regulated genes. This strongly suggests that the Pdr1 DNA‐binding domain harbours activities beyond simple DNA recognition, a point which is supported by recent observations on several transcription factors (Lefstin and Yamamoto, 1998). We have evidence that such an approach could be efficient for other S. cerevisiae Gal4 family members (work in progress) and thus may be extended to similar transcription factors in other eukaryotes and pathogenic yeasts.
Moreover, it might be extended to other classes of transcription factors, provided that their DNA‐binding domain can be clearly defined and that they do not need any cofactor for their target specificity.
The principal objective of this work was to define a gene target search based only on the available transcription factor structural data. We state that, in the case of Pdr1, the knowledge of the DNA‐binding domain is necessary and sufficient to define the corresponding pattern of regulation. It would not be correct to claim that such a pattern of regulation reflects all the properties of the transcription factor. The regulation pattern that we found for both the chimerical Pdr1*GAD and the Pdr1‐3 gain‐of‐function mutant reflects a similar derepressed state of the protein. In that respect the upregulated genes represent the wide regulatory network of Pdr1. We propose to call this set of genes the genomic signature of the transcription factor. This genomic signature can be connected to other gene expression groups recently identified (Hughes et al., 2000). For instance, PDR15, SNQ2, YOR1, YAL061w, YMR102c, YOR049c belong to the same functional cluster and are an important part of the Pdr1 regulon. The Pdr1 genomic signature can thus be considered as composed of several functionally related groups of genes. Kinetic studies and PCA (Figure 4) can also distinguish gene subgroups inside this genomic signature.
In conclusion, we believe that the systematic analyses of genomic signatures of known or unknown transcription factors, combined with more physiological approaches such as the compendium, should be an important asset in understanding the global functional properties of a genome. The approach described in this work should help in reaching this goal in yeast, and may be extrapolated to any sequenced organism.
Plasmid pCB‐GAD was obtained from the plasmid pCB [derived from pYES2 (Invitrogen): URA3, AmpR, pMB1 ori, ARS4‐CEN6 ori, GAL1 promoter, three HA epitopes and CYC1 terminator] as follows: the ‘SV40 NLS/Gal4 activation domain’ sequence was amplified by PCR from the two‐hybrid plasmid pACT2 (see oligonucleotide sequences at: www.biologie.ens.fr/fr/genetiqu/puces/pucesadnframe.html). This PCR product was cloned at the XhoI site of the plasmid pCB by homologous recombination in S. cerevisiae. Plasmid pCB‐PDR1*GAD was constructed by insertion of the first 620 nucleotides (5′ end) of the PDR1 open reading frame in the pCB‐GAD plasmid. This sequence was amplified by PCR and cloned at the NotI site of the pCB‐GAD plasmid by homologous recombination. For the gain‐of‐function mutant experiments, we used the centromeric pRS315 vector containing the PDR1‐3 gain‐of‐function allele of PDR1 under the control of its own promoter (Carvajal et al., 1997). The empty pRS315 was used as a control.
The yeast strains used in this study were isogenic to FY1679‐28C Δpdr3 Δpdr1 (MATa ura3‐52 trp1Δ63 leu2Δ1 his3Δ200 Δpdr3::HIS3 Δpdr1::TRP1).
FY Δpdr3 Δpdr1 cells in exponential growth, containing either the pCB‐PDR1*GAD or the pCB‐GAD vector, were shifted from glucose to galactose minimal media (2% sugar, 0.67% yeast nitrogen base without amino acids, supplemented with adenine and leucine). Cells were then collected at different times (from 0 to 18 h) after the medium change for RNA extraction and microarray analyses. FY Δpdr3 Δpdr1 cells containing either the pRS315‐PDR1‐3 or the pRS315 vector were grown in glucose minimal media supplemented with adenine and uracil to an OD of 0.7–0.8.
The microarrays containing all the yeast open reading frames were obtained from the Toronto Microarray Center and from Hitachi Software. They were based on the principle of PCR products deposited onto a polyamine‐coated glass slide (Eisen and Brown, 1999). All experiments were performed at least twice except for the 2 and 4 h time points, which were performed only once. The detailed microarray protocols are available at our web site. A 2 μg aliquot of mRNA was used for each reverse transcription. The arrays were read using a GenePix 4000 scanner from Axon and analysed with the GenePix 3.0 software.
We filtered the data, excluding artefactual spots, saturated spots and low‐signal spots. Assuming that most of the genes have unchanged expression, the Cy3/Cy5 ratios were normalized using the median of all the ratios for each experiment. We clustered the data from 12 independent experiments using the PCA module of J‐Express (Dysvik and Jonassen, 2001). The profiles of genes belonging to up‐ and downregulated clusters were visually checked and genes with irregular profiles were discarded. For the motif search in the promoters (between –800 and +1) of the upregulated genes we used the Consensus (van Helden et al., 1998) module of RSA tools (van Helden et al., 2000). The consensus sequences found were then aligned and represented using Sequence logo (Schneider and Stephens, 1990). The cluster of Figure 2 was generated using Treeview (Eisen et al., 1998).
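The filtering and median normalization steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline actually used: the saturation ceiling (a 16‐bit intensity limit), the low‐signal floor, the record layout and the toy intensities are all hypothetical.

```python
from statistics import median

SATURATION = 65535  # assumption: 16-bit scanner intensity ceiling
MIN_SIGNAL = 200    # assumption: arbitrary low-signal floor


def filter_spots(spots):
    """Keep spots that are not flagged, not saturated and not too weak."""
    kept = []
    for spot in spots:
        if spot["flagged"]:
            continue  # artefactual spot marked during image analysis
        if max(spot["cy3"], spot["cy5"]) >= SATURATION:
            continue  # saturated in at least one channel
        if min(spot["cy3"], spot["cy5"]) < MIN_SIGNAL:
            continue  # too close to background in at least one channel
        kept.append(spot)
    return kept


def median_normalized_ratios(spots):
    """Return Cy3/Cy5 ratios divided by the median ratio of the
    experiment, so that a typical (unchanged) gene has a ratio of 1."""
    kept = filter_spots(spots)
    raw = {spot["gene"]: spot["cy3"] / spot["cy5"] for spot in kept}
    centre = median(raw.values())
    return {gene: ratio / centre for gene, ratio in raw.items()}


# Toy spots (invented intensities, one experiment).
toy_spots = [
    {"gene": "g1", "cy3": 1000, "cy5": 500, "flagged": False},
    {"gene": "g2", "cy3": 800, "cy5": 400, "flagged": False},
    {"gene": "g3", "cy3": 4000, "cy5": 500, "flagged": False},
    {"gene": "g4", "cy3": 65535, "cy5": 1000, "flagged": False},
    {"gene": "g5", "cy3": 100, "cy5": 150, "flagged": False},
    {"gene": "g6", "cy3": 900, "cy5": 450, "flagged": True},
]
normalized = median_normalized_ratios(toy_spots)
```

Dividing by the median rests on the assumption stated in the text that most genes are unchanged; it would be misleading for an experiment in which a large fraction of the transcriptome shifts in one direction.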
We are especially grateful to Andre Goffeau for continuous encouragement and for critical reading of the manuscript. We thank John Friesen for his advice and Elvira Carvajal, Sylvie Hermann, Stephane Le Crom and Marisol Corral‐Debrinski for their comments on the manuscript. Thank you to Julian Ghislain for grammatical corrections. We are grateful to Hitachi Software and to the Toronto Microarray Center for providing yeast microarrays. This study was supported by the Association pour la Recherche contre le Cancer (ARC No. 5691) and by MENRT.
- Copyright © 2001 European Molecular Biology Organization