Open Access

Evolution and regulation of cellular periodic processes: a role for paralogues

Kalliopi Trachana, Lars Juhl Jensen, Peer Bork

Author Affiliations

Kalliopi Trachana (1), Lars Juhl Jensen (*, 1, 2) and Peer Bork (*, 1, 3)

  1. EMBL Heidelberg, Meyerhofstrasse 1, Heidelberg, 69117, Germany
  2. Novo Nordisk Foundation Center for Protein Research, Faculty of Health Sciences, Blegdamsvej 3b, Copenhagen, N 2200, Denmark
  3. Max‐Delbrueck‐Center for Molecular Medicine, Berlin‐Buch, Robert‐Rossle‐Strasse 10, Berlin, 13092, Germany

*Corresponding authors. Tel: +45 35 325 025; Fax: +45 35 325 001; E‐mail: lars.juhl.jensen{at}cpr.ku.dk or Tel: +49 6221 387 526; Fax: +49 6221 387 517; E‐mail: bork{at}embl.de

Abstract

Several cyclic processes take place within a single organism. For example, the cell cycle is coordinated with the 24 h diurnal rhythm in animals and plants, and with the 40 min ultradian rhythm in budding yeast. To examine the evolution of periodic gene expression during these processes, we performed the first systematic comparison in three organisms (Homo sapiens, Arabidopsis thaliana and Saccharomyces cerevisiae) by using public microarray data. We observed that although diurnal‐regulated and ultradian‐regulated genes are not generally cell‐cycle‐regulated, they tend to have cell‐cycle‐regulated paralogues. Thus, diverged temporal expression of paralogues seems to facilitate cellular orchestration under different periodic stimuli. Lineage‐specific functional repertoires of periodic‐associated paralogues imply that this mode of regulation might have evolved independently in several organisms.

Introduction

Gene duplication is a major evolutionary force (Ohno, 1970) facilitating development of morphological novelties (Bassham et al, 2008), adaptation to new environments (Hittinger & Carroll, 2007) and speciation (Scannell et al, 2006). Small‐scale (SSD) and whole‐genome (WGD) duplications provide the raw genetic material on which mutation and selection act to evolve new functionalities. Many duplicated genes have a short lifespan, as one of the two copies is either lost or degenerates and becomes non‐functional (non‐functionalization). In the relatively rare case in which both copies are retained in the genome, one copy can diverge and acquire a new function that is completely different from the ancestral one (neo‐functionalization), or the two duplicated genes partition the ancestral function (sub‐functionalization; Force et al, 1999). The last two scenarios can be achieved through changes in amino acid sequence (Merritt & Quattro, 2002) or through changes in gene expression patterns (Bassham et al, 2008). Sub‐functionalization and neo‐functionalization allow spatio‐temporal specialization and expansion of functionality, respectively, but it is usually difficult to draw a line between the two fates (He & Zhang, 2005). Several studies of spatial sub/neo‐functionalization have shown that paralogues can be specialized either by tissue (for example, in zebrafish, the pax6a and pax6b paralogues are expressed in different tissues and together fulfil the functional role of mammalian pax6; Kleinjan et al, 2008) or by compartment within a single cell (for example, the paralogues of the mammalian COX7A family are expressed in either the mitochondrion or the Golgi; Schmidt et al, 2003). Although a few studies have observed distinct expression profiles of duplicates on the developmental time scale (Bassham et al, 2008; Kleinjan et al, 2008), the role of duplicated genes in the temporal organization of the cell remains unclear (Gu et al, 2002; Wagner, 2002).

The temporal organization of a biological system—be that a single cell or an entire organism—is at least as intricate as its spatial organization. Clocks, rhythms and cycles are universal from unicellular to multicellular organisms and coordinate many intertwined biological pathways that respond to extracellular or intracellular signals, adapting the organism to periodically changing environments. The 24 h diurnal rhythm (circadian clock) controls many biological responses in animals and plants (Harmer et al, 2000; Panda et al, 2002). Similarly, in Saccharomyces cerevisiae, the cell cycle is coordinated with the ultradian rhythm, which is a robust 40 min (approximately) metabolic cycle that persists indefinitely when cultures are supplemented continuously with glucose (Klevecz et al, 2004). During this metabolic cycle, transcription is organized into redox‐state superclusters; for example, genes that are involved in DNA replication are transcribed in the reductive phase, suggesting a mechanism for reducing oxidative damage to DNA during replication. Apart from the ultradian rhythm, a 4–5 h yeast metabolic cycle that takes place under glucose‐limited conditions in budding yeast has also been reported (Tu et al, 2005). However, our analysis is focused on the ultradian rhythm, as yeast metabolic cycle‐synchronized culture is also synchronized inherently with the cell cycle (Rowicka et al, 2007), making it impossible to separate the two processes.

Here, we present the first systematic comparison of the genes that are transcribed periodically during the cell cycle, the diurnal rhythm and the ultradian rhythm. We observe that diurnal‐ and ultradian‐regulated genes are more likely to have cell‐cycle‐regulated paralogues than would be expected by random chance. As the respective functional repertoires of these duplicated genes in yeast, plants and animals are different, we conclude that gene duplication and subsequent sub/neo‐functionalization took place independently during evolution. This suggests that orchestration of cellular pathways under different periodic processes provides a selective advantage, and that use of temporal regulation of newly emerging paralogues in different contexts—that is, distinct cyclic processes—seems to be an efficient way in which to achieve this.

Results And Discussion

Identification of periodically regulated genes

Recently, there have been numerous efforts aimed at capturing the temporal profiles of various periodic cellular processes. Time‐course microarray experiments have provided much data on the global transcriptome of the cell cycle, and on diurnal and ultradian rhythms in plants, mammals and yeast (supplementary Table S1 online). We have previously identified 600, 400 and 600 cell‐cycle‐regulated genes in budding yeast, Arabidopsis and humans, respectively (Jensen et al, 2006). To maximize the comparability between data sets, we reanalysed the microarray experiments and identified diurnal‐regulated and ultradian‐regulated genes by using the same algorithm as the aforementioned cell cycle study (see Methods).

The identification of diurnal‐regulated genes is particularly complicated, as there is a high biological variance that should be taken into account. The genes that have been identified as diurnal in different tissues overlap only in part (Delaunay & Laudet, 2002), indicating a tissue‐specific regulation of diurnal genes that depends on the physiology of the tissue (Harmer et al, 2000; Panda et al, 2002). Unfortunately, it is not only biological variability that has to be considered. Only about 90 common genes (out of hundreds) were identified to cycle diurnally in the liver in two separate microarray experiments (Delaunay & Laudet, 2002), pointing to problems associated with microarray reproducibility. To eliminate the biological variance, we decided to average over many different tissues and experiments (supplementary Table S1 online). Benchmarks of the resulting lists against experimentally verified diurnal genes show that we obtained the best list by combining all available expression data across studies and tissues (supplementary Fig S2 online). We produced three further lists consisting of 600 ultradian‐regulated budding yeast genes, 600 diurnal‐regulated Arabidopsis genes and 600 diurnal‐regulated mouse genes.

Cell cycle and diurnal rhythm regulation of paralogues

Comparison of the Arabidopsis regulated genes under diurnal rhythm and the cell cycle reveals that only seven genes (supplementary Table S3 online) are expressed periodically in both processes, which is no more than what would be expected by chance alone. However, mapping the genes to a set of eukaryotic paralogous groups (see Methods) reveals that 18 diurnal‐regulated genes belong to paralogous groups with cell‐cycle‐regulated members, which corresponds to 3.4 times more genes (P<10⁻⁵; Fisher's exact test) than obtained by random expectation, after taking into account the total number of genes, the number of periodic genes and the number of paralogues of periodic genes (supplementary information online). Similarly, 26 cell‐cycle‐regulated genes have diurnal‐rhythm‐regulated paralogues (3.8‐fold enrichment; P<10⁻⁸; Fisher's exact test; supplementary Table S4 online). The diurnal‐regulated genes and the cell‐cycle‐regulated genes tend to be paralogues of each other (Fig 1A).
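The enrichment reasoning above can be illustrated with a one-sided Fisher's exact test, written here as a hypergeometric upper tail. This is a minimal sketch, not the authors' procedure: the function name and the counts in the example are hypothetical, and the exact contingency table used in the study is given only in the supplementary information.

```python
from math import comb


def fisher_enrichment(k, n_marked, n_drawn, n_total):
    """One-sided Fisher's exact test for over-representation.

    k        : observed overlap (e.g. diurnal genes with a cell-cycle paralogue)
    n_marked : background genes that have a cell-cycle-regulated paralogue
    n_drawn  : number of diurnal-regulated genes
    n_total  : total number of genes in the background
    Returns (fold_enrichment, p_value).
    """
    expected = n_drawn * n_marked / n_total
    upper = min(n_marked, n_drawn)
    # Probability of seeing k or more overlapping genes by chance.
    tail = sum(
        comb(n_marked, i) * comb(n_total - n_marked, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return k / expected, tail / comb(n_total, n_drawn)


# Hypothetical counts, chosen only to show the call; they are not from the paper.
fold, p_value = fisher_enrichment(k=18, n_marked=300, n_drawn=600, n_total=20000)
print(f"fold enrichment = {fold:.2f}, P = {p_value:.3g}")
```

The three background quantities passed to the function correspond to the factors the text says were taken into account: the total number of genes, the number of periodic genes and the number of paralogues of periodic genes.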

Figure 1.

Four‐way Venn diagrams of cell‐cycle‐regulated genes, diurnal‐regulated genes and their paralogues in Arabidopsis and humans. (A) There are 26 cell‐cycle‐regulated genes with diurnal‐regulated paralogues and 18 diurnal‐regulated genes with cell cycle paralogues (supplementary Table S5 online). (B) There are 15 cell‐cycle‐regulated genes with diurnal‐regulated paralogues and 15 diurnal‐regulated genes with cell cycle paralogues (supplementary Table S6 online). The numbers of cell‐cycle‐ or diurnal‐regulated proteins that do not have diurnal‐ or cell‐cycle‐regulated paralogues, respectively, are indicated in white circles. Within the dashed‐line white circles are proteins of paralogous groups with cell‐cycle‐ or diurnal‐regulated members that do not cycle themselves. The number of genes that are regulated in both cycles is indicated in the grey circles (supplementary Table S3 online). Both in Arabidopsis and humans, these genes are not significantly overrepresented in our periodic lists. The numbers of diurnal‐regulated genes with cell‐cycle‐regulated paralogues and vice versa are highlighted in black circles.

We observed the same trend when comparing the diurnal rhythm and cell cycle in humans. The cell cycle and diurnal rhythm analyses were based on human and mouse data, respectively. Assuming that at least one of the two processes is comparable between human and mouse, which should be the case for the cell cycle, we mapped diurnal‐regulated genes to their 491 one‐to‐one orthologues in human and mouse (supplementary information online). Indeed, 15 paralogue pairs have been detected that consist of cell‐cycle and diurnal‐rhythm‐regulated genes (2.5‐fold more than that by random expectation; P<4 × 10⁻⁴; Fisher's exact test; supplementary Table S4 online). We thus get a statistically significant result despite there being interspecies differences due to the rapid evolution of transcriptional regulation in mice and humans (Odom et al, 2007) and intraspecies differences between tissues, both of which weaken the signal. Besides the paralogous pairs, 22 genes are regulated during both the cell cycle and the diurnal rhythm in humans (supplementary Table S3 online). As was the case for Arabidopsis, this is not significantly more than that expected by chance (Fig 1B).
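The transfer of the mouse diurnal list to human genes through one‐to‐one orthologues can be sketched as follows. This is only an illustration under stated assumptions: the function names and the gene identifiers are hypothetical placeholders, and the actual orthologue table used in the study is described in the Methods.

```python
from collections import Counter


def keep_one_to_one(pairs):
    """Keep only (mouse, human) pairs in which each gene occurs exactly once."""
    mouse_counts = Counter(mouse for mouse, _ in pairs)
    human_counts = Counter(human for _, human in pairs)
    return {
        mouse: human
        for mouse, human in pairs
        if mouse_counts[mouse] == 1 and human_counts[human] == 1
    }


def transfer_by_orthology(mouse_genes, one_to_one):
    """Map mouse genes to human; genes without a 1:1 orthologue are dropped."""
    return [one_to_one[gene] for gene in mouse_genes if gene in one_to_one]


# Placeholder identifiers, not real gene names.
orthology_pairs = [
    ("mouse_A", "human_A"),
    ("mouse_B", "human_B"),
    ("mouse_C", "human_C1"),  # one-to-many, so it is excluded
    ("mouse_C", "human_C2"),
]
mapping = keep_one_to_one(orthology_pairs)
print(transfer_by_orthology(["mouse_A", "mouse_C", "mouse_D"], mapping))
```

Dropping genes without an unambiguous orthologue is the reason the transferred list is shorter than the original mouse list.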

Currently, we cannot distinguish between sub‐functionalization and neo‐functionalization as we can posit two different scenarios: (i) a gene is regulated periodically in the phylogenetically older cell cycle and after duplication its functional properties can be extended to another cyclic process (neo‐functionalization) and (ii) a gene is regulated periodically under two periodic processes and its duplication enables two specialized temporal regulation profiles (sub‐functionalization). In either case, we propose that there was only one ancestral response to extrinsic and intrinsic periodic signals. After gene/genome duplication, the ancestral response could be expanded or could become specialized in time.

Parallel evolution in Arabidopsis and human

When analysing the paralogue groups that are regulated in more than one cycle, we observed that their functional repertoires are different in Arabidopsis (supplementary Table S5 online) and humans (supplementary Table S6 online) and are in accordance with their specialized biology. To exemplify this, we focus on two pairs of paralogues and how their temporal regulation is related to plant and animal physiology, respectively.

In Arabidopsis, for example, we find periodic regulation of alpha‐amylases during diurnal rhythm (AMY3) and cell cycle (AMY1) that does not occur in humans. Starch is synthesized in chloroplasts during day‐time photosynthesis and is degraded during the night, providing sugars for leaf metabolism and for export to other organs such as seeds and the root (Smith et al, 2005). The diurnal regulation of AMY3 accompanies the diurnal (day/night) regulation of starch metabolism in leaves. The enzyme is targeted to the chloroplasts and participates in transitory starch degradation, although its exact role remains unclear (Zeeman et al, 2007). Cell‐cycle‐regulated AMY1, however, contributes to seed germination (Borisjuk et al, 2004). During germination, cells rest in the G1 phase. Gibberellin is necessary to enter the S phase and complete cell division (Ogawa et al, 2003). Concurrently, gibberellin‐induced alpha‐amylase (AMY1) promotes degradation and mobilization of the starch accumulated in the endosperm to fuel cell division (Fincher, 1989).

Among the 16 diverged regulated paralogues in humans, we identified, for instance, cell‐cycle‐regulated and diurnal‐regulated ribonucleotide (nucleoside 5'‐triphosphate; NTP) reductase subunits named RRM2 and RRM2B, respectively. These enzymes exemplify differential temporal regulation of isoenzymes. NTP reductase in mammals catalyses the reduction of ribonucleotides to deoxyribonucleotides, the balanced supply of which is essential for both accurate DNA replication and repair. NTP reductase consists of two non‐identical subunits (R1 and R2), and its enzymatic activity is regulated by R2 expression, that is, by RRM2 or RRM2B. RRM2 peaks during the S phase and is blocked during the G1 phase, pointing to a mechanism that protects the cell against unscheduled DNA synthesis (Chabes et al, 2003). However, RRM2B (the diurnal‐regulated gene) is hardly expressed in proliferating cells. Recently, its role in DNA repair and mitochondrial DNA synthesis in non‐proliferating cells has been elucidated (Bourdon et al, 2007). Both of these processes take place independently of the cell cycle, and mitochondrial DNA synthesis has been reported to cycle in the rat liver (Dallman et al, 1974).

As the functional repertoires of paralogues that have been sub/neo‐functionalized under the regulation of the cell cycle and the diurnal rhythm in Arabidopsis and humans are different, the most parsimonious scenario is that this mode of regulation has evolved independently in both organisms.

Periodic regulation of metabolism in yeast

The cell cycle in budding yeast is orchestrated with the ultradian rhythm. Recent studies have shown that the latter gates cells into the S phase of the cell cycle, organizes the energetic (redox) status of the cell, and coordinates mitochondrial and metabolic functions (Klevecz et al, 2004). Basic redox molecules such as NAD(P)H and glutathione are under the temporal control of the cell cycle and ultradian rhythm (Lloyd & Murray, 2007). The cellular redox balance is also vital for organization of the cell cycle and the diurnal rhythm (Matés et al, 2008; Lepisto et al, 2009). Similar to the diurnal/cell cycle results presented above, we can identify 58 paralogues that have diverged their regulation under the cell cycle and the ultradian rhythm (twofold enrichment, P<10⁻⁵; Fisher's exact test; supplementary Table S4 online). Besides paralogue pairs with divergent regulation, there are 64 genes that are expressed periodically during both the cell cycle and the ultradian rhythm (1.25‐fold enrichment, P<0.02; Fig 2).

Figure 2.

Four‐way Venn diagram of cell‐cycle‐regulated, ultradian‐regulated genes and their paralogues in budding yeast. The number of genes that are regulated by both cycles is indicated in the grey circle. The numbers of ultradian‐regulated genes with cell‐cycle‐regulated paralogues and vice versa are highlighted in the black circles (supplementary Table S7 online). There is an over‐representation of cell‐cycle‐regulated genes with ultradian‐regulated paralogues, and vice versa. The number of cell‐cycle‐ or ultradian‐regulated proteins that do not have ultradian‐ or cell‐cycle‐regulated paralogues, respectively, are indicated in white circles. Within the dashed‐line white circles are proteins of paralogous groups with cell‐cycle‐ or ultradian‐regulated members that do not cycle themselves.

Many recent studies have shown that the mode of duplication—that is, SSD compared with WGD—has an important role in the functional divergence of paralogues (Maere et al, 2005). S. cerevisiae is a degenerated tetraploid resulting from WGD after the divergence of Saccharomyces from Kluyveromyces, followed by extensive gene loss (Wolfe & Shields, 1997). We decided to explore the origin of cell cycle/ultradian paralogue pairs and ask whether there is a preferred mode of duplication for temporal sub/neo‐functionalization. Of a total of 416 paralogous groups (supplementary Table S10 online) in yeast that we identified (see Methods), we subtracted 651 WGD paralogues identified by Byrne & Wolfe (2005), leaving 248 SSD paralogues. For both WGD and SSD paralogues, we observed a significant over‐representation of cell‐cycle/ultradian‐regulated paralogues (supplementary Table S9 online). Although WGD contributes the highest number (31 periodically expressed paralogues), SSD cell‐cycle/ultradian‐regulated paralogues (a total of 27 paralogues) are more enriched compared with random expectation. Although a more detailed analysis is needed, this implies a stronger selection on SSDs. In any case, both modes of duplication contributed to a lineage‐specific functional repertoire of periodic divergent paralogues.
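The separation of paralogues by mode of duplication amounts to a set subtraction followed by a per‐class comparison, which can be sketched as below. The function names and gene identifiers are hypothetical, and the sketch reports only the fraction of periodic genes per class; the test against random expectation used in the study (supplementary Table S9 online) is not reproduced here.

```python
def split_by_duplication_mode(paralogues, wgd_paralogues):
    """Split paralogues into WGD-derived ones and the remainder (treated as SSD)."""
    wgd = set(wgd_paralogues)
    unique = set(paralogues)
    wgd_part = sorted(gene for gene in unique if gene in wgd)
    ssd_part = sorted(gene for gene in unique if gene not in wgd)
    return wgd_part, ssd_part


def periodic_fraction(genes, periodic_genes):
    """Fraction of the given genes that are on the periodic list."""
    if not genes:
        return 0.0
    periodic = set(periodic_genes)
    return sum(gene in periodic for gene in genes) / len(genes)


# Toy identifiers, not real yeast genes.
all_paralogues = ["g1", "g2", "g3", "g4", "g5", "g6"]
wgd_list = ["g1", "g2", "g3", "g4"]
periodic = ["g1", "g5"]

wgd_part, ssd_part = split_by_duplication_mode(all_paralogues, wgd_list)
print(periodic_fraction(wgd_part, periodic), periodic_fraction(ssd_part, periodic))
```

A class can contribute more periodic paralogues in absolute terms while still having the lower fraction, which is the distinction the text draws between WGD and SSD.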

Functional analysis of cell cycle/ultradian paralogue pairs and their mapping to the metabolic network of S. cerevisiae (Fig 3) revealed that cell cycle/ultradian sub/neo‐functionalization has occurred frequently in paralogues that regulate important metabolic substrates, such as glucose, pyruvate and sulphate. For example, glucose is transported by the major facilitator superfamily of transporters (HXT), a few of which compose a cell‐cycle/ultradian‐regulated paralogous group. S. cerevisiae grows in a variety of glucose concentrations because of the presence of several HXT genes, which show glucose transport with dual kinetics (high‐glucose and low‐glucose affinity) and change their expression levels in response to culture conditions (Verwaal et al, 2002). Yeast proliferates fast in a glucose‐rich environment, wherein low‐affinity transporters are expressed (for example, the cell‐cycle‐regulated HXT2 gene), but the cell cycle slows down markedly after glucose depletion, upon which high‐affinity transporters are induced (for example, the cell‐cycle‐regulated HXT7 gene; Ozcan & Johnston, 1999). Trehalose and glycogen, both reserve carbohydrates, have been reported to accumulate under low growth rate conditions. Interestingly, they have a dual role: their degradation maintains the ATP flux in S. cerevisiae when glucose is depleted, but they can also fuel the cell to enter the S phase of the cell cycle when culture conditions improve (Silljé et al, 1999). Unlike HXT2 and HXT7, the ultradian‐rhythm‐regulated HXT5 gene is not affected by glucose concentration in the environment, but it is expressed highly during low growth rate (Verwaal et al, 2002). It has been suggested that HXT5 regulates the uptake of glucose for production of trehalose, which is in accordance with its ultradian role in balancing the redox (ATP) status and helping the cell enter the S phase. In contrast to the cell‐cycle‐regulated HXT2 and HXT7 genes, which sense their glucose‐sufficient environment and drive the culture into the cell cycle (an energy‐demanding process), the ultradian‐regulated HXT5 gene senses the glucose‐insufficient environment and stores energy, indicating that temporal sub/neo‐functionalization accompanies functional divergence.

Figure 3.

Core metabolic network of cell‐cycle‐regulated and ultradian‐rhythm‐regulated genes in Saccharomyces cerevisiae. The core metabolic network of S. cerevisiae is shown in yellow. Reactions that are catalysed by cell‐cycle/ultradian‐regulated genes are highlighted with light blue and cell‐cycle/ultradian‐regulated paralogues are mapped with dark blue lines. Metabolic substrates that are under cell cycle and ultradian regulation are marked by black circles. It seems that there is a tight regulation of cell cycle and ultradian metabolism judging by the number of common regulated genes/products and the number of paralogues. A few of the common cell‐cycle‐regulated and ultradian‐regulated substrates, such as glucose‐6‐phosphate and sulphate, are important for glycolysis and redox equilibrium, respectively, in budding yeast. The custom metabolic map shown here was generated by using iPath (Letunic et al, 2008).

Conclusion

Here, we report, for the first time, that diverged temporal regulation under the cell cycle and diurnal or ultradian rhythm of newly emerged paralogues seems to be an efficient way in which to orchestrate cellular response to extrinsic and intrinsic signals. This temporal sub/neo‐functionalization of paralogues under the cell cycle and diurnal/ultradian rhythm occurs more frequently than expected by chance and spans different lineages (Arabidopsis, Homo sapiens and S. cerevisiae). As the functional repertoires of duplicated genes in the three organisms studied are different, it seems that the temporal sub/neo‐functionalization of duplicated genes has evolved independently in plants, animals and yeasts to distinguish cell‐cycle regulation from other periodic processes, perhaps even to coordinate those processes. Further analysis of the functional repertoires of cell‐cycle/ultradian‐regulated paralogues in yeast indicates that they have arisen through both WGD and SSD and that, in the yeast lineage, they are enriched in metabolic functions. Thus, we could show that a large‐scale (meta) analysis of duplications in several species reveals details on the evolution of cellular periodicity and provides the first insight into temporal sub/neo‐functionalization at the cellular level.

Methods

Analysis of microarray expression data and benchmarking. To enable a comparison with the cell‐cycle‐regulated genes that have been identified previously (Jensen et al, 2006), we reanalysed all microarray expression time courses (supplementary Table S1 online) using the same permutation‐based algorithm (de Lichtenberg et al, 2005). The resulting lists of periodic transcripts during diurnal rhythm were benchmarked against lists of known diurnal‐regulated genes compiled from review articles and The Arabidopsis Information Resource database (supplementary information online). We kept the top 600 diurnal‐regulated genes both in Arabidopsis and mouse, as these lists capture 75–90% of the known diurnal and cell‐cycle‐regulated genes (supplementary Fig S2 online). To compare the cell cycle and diurnal rhythm genes in humans, we used 13,648 pairs of 1:1 human to mouse orthologues (Hubbard et al, 2007) to transfer the mouse list to human diurnal genes.
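The idea behind a permutation‐based periodicity test can be illustrated with a simplified stand‐in: score each expression profile by the magnitude of its Fourier component at the known period, then ask how often a time‐shuffled profile scores at least as high. This sketch is not the algorithm of de Lichtenberg et al (2005), whose scoring is not described in this text; the function names, the choice of score and the default number of permutations are assumptions made for illustration.

```python
import cmath
import math
import random


def fourier_score(times, values, period):
    """Magnitude of the mean-centred Fourier component at the given period."""
    mean = sum(values) / len(values)
    component = sum(
        (value - mean) * cmath.exp(-2j * math.pi * t / period)
        for t, value in zip(times, values)
    )
    return abs(component)


def permutation_p_value(times, values, period, n_perm=1000, seed=0):
    """Empirical P value: share of shuffled profiles scoring >= the observed one."""
    rng = random.Random(seed)
    observed = fourier_score(times, values, period)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys temporal order, keeps the value distribution
        if fourier_score(times, shuffled, period) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic 24 h profile sampled every 2 h over two days (illustrative data only).
times = list(range(0, 48, 2))
profile = [math.cos(2 * math.pi * t / 24) for t in times]
print(permutation_p_value(times, profile, period=24))
```

Ranking genes by such a P value and keeping a fixed number from the top is one way to obtain lists of equal size across processes, in the spirit of the top‐600 lists described above.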

Identification of eukaryotic orthologous/paralogous groups. Human, Arabidopsis and budding yeast proteins were categorized into orthologous groups by an automatic procedure (von Mering et al, 2005) similar to the original cluster of orthologous groups procedure (Tatusov et al, 2003); all‐against‐all Smith–Waterman similarities were computed and orthology was then assigned through reciprocal best matches and subsequent triangular linkage clustering (von Mering et al, 2005). To perform within‐species comparison, we focused on paralogous groups resulting from the first step of orthology assignment (supplementary information online), after which 10,947 proteins were clustered in 3,761 paralogous groups (supplementary Table S2 online).
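The reciprocal‐best‐match step of the procedure described above can be sketched as follows. The sketch assumes that similarity scores are already available as a dictionary keyed by (query, target); the Smith–Waterman computation and the subsequent triangular linkage clustering are not shown, and the function names and protein identifiers are hypothetical.

```python
def best_hits(scores):
    """Return the highest-scoring target for each query, ignoring self-hits."""
    best = {}
    for (query, target), score in scores.items():
        if query == target:
            continue
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_matches(scores):
    """Pairs of proteins that are each other's best hit."""
    best = best_hits(scores)
    return {
        frozenset((query, target))
        for query, target in best.items()
        if best.get(target) == query
    }


# Toy similarity scores between placeholder proteins.
similarities = {
    ("a", "b"): 90, ("b", "a"): 90,
    ("a", "c"): 50, ("c", "a"): 50,
    ("b", "c"): 40, ("c", "b"): 40,
}
print(reciprocal_best_matches(similarities))
```

In this toy case only a and b are paired, because c's best hit (a) prefers b.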

Supplementary information is available at EMBO reports online (http://www.emboreports.org).

Conflict of Interest

The authors declare that they have no conflict of interest.

Supplementary Information

Supplementary Information [embor20109-sup-0001.pdf]

Acknowledgements

We thank the members of the Bork group for helpful discussions. K.T. is grateful to Takuji Yamada and Ivica Letunic for help with iPath exploration. K.T. is supported by the European Union FP6 Program Contract number LSH‐2004‐1.1.5‐3. The work carried out in this study was supported in part by the Novo Nordisk Foundation Center for Protein Research.

References

This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
