The media took fascinated note last May when genome researchers started gambling on the number of human genes. One trend was apparent immediately: the estimates of human gene number—made by the people who ought to know if anyone did—were low compared with those made just a few years ago, when the number often tossed around was 100 000, give or take. For some weeks, the median GeneSweep bet (http://www.ensembl.org/genesweep.html) stood at just under 54 000. In June this year, Nature Genetics acknowledged the same trend. It published three major papers estimating human gene number. Two of the three assessments were very low: ∼34 000 and 30 000 genes.
With nearly 300 wagers recorded as of early August, the GeneSweep estimates have trended upward again, which may have something to do with the announced completion of a working draft of the human genome sequence at the end of June. At that time, the National Human Genome Research Institute (NHGRI) confirmed the existence of 38 000 predicted genes. So it appears that anyone who bet below that is out of luck. Although there are contrary opinions, of course, the luminaries are at the head of the parade. NHGRI's director Francis Collins has joined the accelerating trend to lowball guesses; his wager is 48 011. David Baltimore, perhaps taking his cue from the project leader, wrote in The New York Times that 50 000 seemed about right to him.
For the purposes of GeneSweep wagering, a gene is defined as a protein‐coding sequence. Alternatively spliced transcripts are counted as one gene. Yet the numbers wagered are disconcerting, especially when compared with the gene count of ∼19 000 for the recently completed DNA sequence in Caenorhabditis elegans. Can it be that our illustrious species has only two or maybe three times as many genes as a transparent worm the size of a pinhead which dwells in the dirt beneath our feet? If this humiliating news is true, how did we get to be so majestic, so charming, so complicated? And what does it suggest about future directions for genome research?
The limits of clinical genomics
Veronica van Heyningen's work with Pax‐6 has made her an eloquent advocate for melding clinical investigations with laboratory research if we hope to lay bare our genetic complexity. ‘We can learn an awful lot about how the genome works just by observing variation in humans, including everything from moderate to severe disease. Some of it is not even disease, but just variation,’ she notes.
Genetics researchers have long advocated intensive study of single nucleotide polymorphisms (SNPs) to explain human variation. SNPs are, of course, important for the study of human disease, van Heyningen concedes. But her discovery that a control region 150 kb away from the Pax‐6 gene can determine whether someone is born with an iris or not reminds us of the very elastic definition of a gene.
van Heyningen is convinced that only a multi‐pronged research approach using several methodologies will do. ‘The idea that control can be so long‐range is something that many people who are setting out to work with SNPs are not taking into account. It might be very difficult to say which change in a nucleotide is the functionally important one, because it could be a long way away from the gene whose function it's affecting,’ she says. ‘Which is why we have to take many different directions to try to resolve what we mean by a gene and its whole region of influence, or the region that influences its expression.’
Tabitha M Powledge
When Brent Ewing and Phil Green of the University of Washington in Seattle presented their low estimates of 35 000 genes in Nature Genetics, they speculated that the complexity of vertebrates is traceable to diversification of regulatory networks or alternative splicing, rather than to sheer gene number. If the low gene numbers for the human genome hold true, researchers will have to change their view of how genetic information is converted into diversity. It will also mean that in silico approaches to annotating the human genome may never reveal its entire complexity. These outcomes seem plausible enough, considering that an estimated 30% of human genes can be spliced to yield different forms of the protein product.
Paula Grabowski from the University of Pittsburgh has been pursuing alternative splicing, especially its regulation, for years. She and her colleagues investigate a number of genes in the nervous system—the place where most alternative splicing occurs—in order to uncover expression patterns and regulation strategies. One object of their study is a subunit of the receptor for the neurotransmitter GABA. The neuron‐specific exons of the gene are spliced differently in different types of nerve cells during the same course of development.
‘The take‐home message is that it's extremely complex,’ she says. ‘The mode of regulation involves mechanisms that activate certain splicing events, and those mechanisms are intertwined with repressive modes of regulation. The sequences that are signals for this, to activate or to repress, are often interdigitated, and they bind many different factors.’ The regulatory machinery, she says, seems to be modular: ‘It is intricate and highly flexible. We see differences in regulation from cell type to cell type, from one brain region to other brain regions, during development and at other times.’
Grabowski has observed new interest in alternative splicing and its regulation, especially among researchers studying neuromuscular diseases like amyotrophic lateral sclerosis and neurodegenerative diseases like dementia. ‘In these diseases there's not a mistake in splicing per se, but the alternatively spliced isoforms are altered in their ratio. That strongly suggests that there's misregulation’, she notes.
Current distribution of GeneSweep bets. Number of votes is 281.
The question of balance between two forms of a protein appears to have wide implications. Most cases of Wilms’ tumor (a childhood cancer of the kidney), for instance, are caused by a mutation of WT1, a tumor suppressor gene essential for development of testis, ovary, heart and other organs. But there are Wilms’ tumor patients whose only problem appears to be an alteration in the balance of two alternatively spliced WT1 transcripts, according to Veronica van Heyningen of the MRC Human Genetics Unit at Western General Hospital in Edinburgh. WT1 is also an example of how alternative splicing can explain the increasing complexity of organisms. According to van Heyningen's colleague Nick Hastie, human WT1 codes for at least 24 different proteins. The zebrafish version of the gene makes only two, so evolution has been able to create complexity by figuring out ways to drive WT1 into diversifying.
It is not clear, however, that a gene must create a great many protein isoforms in order to do a great many things. van Heyningen works mostly on Pax‐6, which, among other things, regulates eye development in species ranging from fruit flies to humans. When one copy of the gene is missing, the result is an eye disorder called aniridia—complete absence of the iris. So far, the Pax‐6 gene is known to make only two protein isoforms, although there may be other forms too. Here, too, the ratio between isoforms may be significant. Alternative splicing adds or removes an extra 14 amino acids; van Heyningen and her colleagues are attempting to make mice with either one or two copies of the gene that include the extra 14 amino acids. They want to look separately at the effects of each form of the protein, but also at the effects of upsetting the balance of the two forms.
But there is more to Pax‐6 than simple alternative splicing. Other levels of control regulate its temporal and spatial expression. Different enhancers seem to act in different tissues and probably bind a variety of regulatory proteins. In the eye, the gene is expressed in various cell types at different times during development, and then continues to be expressed in the adult retina, lens cells and cornea. The retina is brain tissue, so it is not surprising that Pax‐6 is expressed in brain development as well: in the cortex, the thalamus and the cerebellum, among other places. Researchers at The Salk Institute for Biological Studies in California and the Max‐Planck‐Institute for Biophysical Chemistry in Göttingen, Germany, reported in the 14 April issue of Science that Pax‐6 helps to organize the mammalian neocortex into functionally specialized areas for sensory processing and motor control. To complicate the picture, Pax‐6 confers this identity on cortical cells in co‐operation with another gene, Emx‐2, and the two appear to be regulated entirely separately.
Grabowski thinks that there is a lesson here for devotees of the currently fashionable in silico approach to genome investigation, and cautions that algorithms alone will not tell us everything we want to know about what genes do. ‘We'll have to develop new methods and use some old ones also,’ she says. ‘The in silico approach by itself is limited in asking what are the natural gene functions and how are genes regulated.’ Grabowski is looking forward to the emerging field of proteomics: ‘People are really working hard to develop ways of taking whole batches of transcripts from cells or tissues and asking how one batch is different from another. Once you know that there are 3000 genes that are upregulated in a certain cancer relative to a non‐cancerous tissue, then you can ask questions about the genes.’
The complexity of genes like WT1 and Pax‐6—and the unknown number of their fellow genes whose actions are likely to be just as intricate—induces a bit of humility. No wonder that, in contrast to much of the media, Human Genome Project officials have been so cautious in forecasting how quickly medical applications will be forthcoming from knowledge of the human genome. At the press conference following the announcement of a working draft of the genome, someone asked Celera's Craig Venter when we would know all about the human genome. Venter said he thought it would take most of this century. He may have been optimistic.
Copyright © 2000 European Molecular Biology Organization
The author is a freelance science writer in the USA. E‐mail: