Synthetic biology: discovering new worlds and new words

The new and not so new aspects of this emerging research field
Víctor de Lorenzo, Antoine Danchin

Author Affiliations

  • Víctor de Lorenzo, 1 National Centre of Biotechnology–Spanish Council for Scientific Research (CSIC), Campus de Cantoblanco, Madrid, Spain
  • Antoine Danchin, 2 Genetics of Bacterial Genomes, Institut Pasteur, Paris, France

In the year 1493, Christopher Columbus (1451–1506) returned from his famous voyage across the Atlantic Ocean with news of an unexplored land in the west. Most Europeans were convinced that Columbus had discovered a ‘new world’, yet it was not new at all. Some 400 years earlier, the Norse explorer Leif Ericson (circa 970–1020) had probably been the first European to set foot on North American soil, and some thousands of years earlier, the continent was populated by humans who had crossed the Bering Strait from Asia.

The ‘discovery’ of America in the late fifteenth century came to mind when engineers, then at the Massachusetts Institute of Technology (MIT; Cambridge, MA, USA), started talking about a new discipline, which they called synthetic biology (Endy, 2005; Andrianantoandro et al, 2006). This term has subsequently evoked many great expectations, as its application might help to solve numerous social and environmental problems. However, it has also triggered the type of public alarm that molecular biologists are all too familiar with after the bitter debates about genetically modified organisms in the 1990s (Jansson, 1995; Ramos et al, 1994). So, can synthetic biology really be called a new field, or is it just the intensification of the genetic engineering of organisms that biologists have been carrying out since the 1970s? What is genuinely novel about this allegedly newborn discipline?

The term synthetic biology was coined in 1912 by the French chemist Stéphane Leduc (1853–1939; Leduc, 1912); however, it has only recently become an umbrella term to describe the interface between molecular biology and hard‐core engineering (Andrianantoandro et al, 2006). Synthetic biology is becoming an inclusive theoretical and technical framework in which to approach biological systems with the conceptual tools and language imported from electrical circuitry and mechanical manufacturing. This effort pursues the creation of new organisms by the rational combination of standardized biological parts that are decoupled from their natural context. In fact, the reliable formatting of biological functionalities and the detailed description of the most basic biological components and their interfaces, similar to modern electronic circuits, is one of the characteristics of the field.

The fundamental idea behind synthetic biology is that any biological system can be regarded as a combination of individual functional elements—not unlike those found in man‐made devices. These can therefore be described as a limited number of parts that can be combined in novel configurations to modify existing properties or to create new ones. In this context, engineering moves from being an analogy of the rational combination of genes—as in standard molecular biology and biotechnology—to becoming a veritable methodology with which to construct complex biological systems from first principles. The fusion between authentic (not metaphoric) engineering and molecular biology will certainly have far‐reaching consequences. Yet, to what extent is this realistic science? How much is genuinely new and how much is merely hype generated by rebranding?

According to long‐standing philosophical tradition, science is about knowing and understanding, whereas technology is about doing (Wolpert, 1998). So, in what realm does synthetic biology fall? For many of its practitioners, the answer is clear: synthetic biology is about engineering and not about science (Endy, 2005; Baker et al, 2006; Andrianantoandro et al, 2006). Yet engineers are not the only stakeholders as synthetic biology is attracting many researchers from fundamental science (Church, 2005) and companies and businesses, although their agendas are diverse (Fig 1).

Figure 1.

The pillars of synthetic biology. Disciplines in biology, biotechnology, engineering and computing interact to form the foundations of synthetic biology. In addition, research on the origin of life is experiencing a considerable rebirth (Luisi, 2006).

It is also possible to distinguish a unique European perspective, as many activities that now qualify as synthetic biology—protein design, modelling, metabolic engineering and biological nano‐engineering—have been going on for some time on the ‘old continent’. In fact, many European scientists are sceptical about calling synthetic biology a new field, as there is clear similarity—despite the different language—between the discourse on genetic engineering in the late 1970s and many of the claims and assertions made by synthetic biologists. However, these various biological fields have always been more implicit than explicit, fractionated and lacking a common descriptive language. By contrast, the present momentum for synthetic biology is a good opportunity to realize a common potential, find a shared language and identify synergies.

In our view, the key to fulfilling the promise of synthetic biology—in terms of both scientific and technological breakthroughs—is not societal acceptance or ethics, but rather understanding the biological building blocks that can be used for robust engineering, adopting a descriptive and quantitative language for biological transactions, and identifying and managing the physical and chemical constraints of any autonomous biological system.

Biological parts, the minimal biological elements that can be used for engineering, are one of the trademarks of ongoing efforts in synthetic biology (Canton et al, 2008; Arkin, 2008). The idea is both simple and attractive: in the same way that a machine can be disassembled and catalogued as individual components (such as hard disks, screens, keyboards and memory chips), living systems might also be broken down into a list of components that can be rewired for a specific purpose. This sounds like a straightforward engineering approach, but it might not be that easy. The functions of most extant biological systems, that is, living entities, depend on the environment in which they thrive and the evolutionary pressures that have created a growing complexity of interaction at all levels. Furthermore, proteins seem to have an amazing ability to develop new interactions with other proteins as soon as they are subjected to selective pressure. We need a better conceptual framework to define and understand the minimal biological building blocks. Simply calling them Biobricks™ and regarding them as singular biological components, as in the MIT‐run catalogue of biological parts, can give a misleading perception of the issues at stake. Furthermore, the nature and description of such parts depend on the scale of the engineering objective. Genetic circuits can be constructed using well‐defined promoters and reporters; however, designing a whole cell will require complete functional modules (for translation, energy generation, replication and so on) as building blocks. Similarly, whole cells will become the parts needed for the design of microbial communities, tissue engineering and so on.

The ultimate agenda of synthetic biology is to recreate a cell as an automaton that can algorithmically process information. To this end, we need to identify the various functions of a cell before compiling a list of the parts that implement them. An important point here is to avoid the trap of assuming any goal in such an automaton; all of its properties should be declarative, not prescriptive, and there are no built‐in instructions to tell the automaton what it should do.

The comparative analysis of living organisms should give us a list of the functions that are needed for life. Such a research programme, however, might look hopeless from an engineering perspective, as many different objects can fulfil the same function. Fortunately, evolution can help us to solve this problem: life evolves by ascending from earlier life forms, and any function that has emerged and has been implemented within or by a particular biological system becomes conserved over generations. This evolutionary ‘stickiness’ can be analysed by identifying persistent genes—those that are recurrently kept in a given number of genomes (Fang et al, 2005). By using persistent genes, which are by no means expected to be ubiquitous, it is possible to construct an initial catalogue of 400–500 functions that seem to be essential for life. Yet, many persistent genes have unknown functions, and we might miss others that are essential. For example, we might fail to identify functions that are associated with membranes, as the rules that define similarities between membrane proteins might be distinct from those for cytoplasmic proteins.
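The notion of persistence described above can be illustrated with a minimal sketch. It assumes that each genome has already been reduced to a set of gene-family labels; the function name `persistent_genes`, the `min_fraction` threshold and the toy genomes are hypothetical and are not taken from Fang et al (2005), whose analysis relies on proper orthology inference rather than on matching names.

```python
from collections import Counter


def persistent_genes(genomes, min_fraction=0.85):
    """Return the gene families found in at least `min_fraction` of the genomes.

    `genomes` maps a genome name to the set of gene-family labels it carries.
    The default threshold is an arbitrary illustration, not a published value.
    """
    if not genomes:
        return set()
    counts = Counter()
    for gene_set in genomes.values():
        # Count each family once per genome, even if it is listed repeatedly.
        counts.update(set(gene_set))
    threshold = min_fraction * len(genomes)
    return {gene for gene, n in counts.items() if n >= threshold}


# Toy data for illustration only; real input would come from orthology tables.
toy_genomes = {
    "genome_A": {"rpsA", "dnaA", "ftsZ", "lacZ"},
    "genome_B": {"rpsA", "dnaA", "ftsZ"},
    "genome_C": {"rpsA", "dnaA", "nifH"},
    "genome_D": {"rpsA", "ftsZ", "dnaA"},
}

print(sorted(persistent_genes(toy_genomes, min_fraction=0.75)))
print(sorted(persistent_genes(toy_genomes, min_fraction=1.0)))
```

Setting the threshold below 1.0 reflects the point made in the text that persistent genes are not expected to be ubiquitous: in the toy data, one family is absent from one genome yet still counts as persistent at a threshold of 0.75.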

At least in the case of bacterial genomes, the global set of genes can be split into two categories: those that allow life and perpetuate it, and those that allow life in an environmental context. We call the class of persistent genes in the first category the paleome, the members of which constitute a list of minimal biological functions (Danchin et al, 2007). This is where we need to search for all components to be implemented in an artificial cell able to mimic the behaviour of living entities.

The quest for a minimal set of functions for a self‐maintaining system is not limited to synthetic biology. For some time, engineers have been working on a self‐reproducible three‐dimensional (3D) printer. Their work shows that a Turing machine (Turing, 1937) could act as a model for a synthetic living system that would contain the machine itself; it would also require a separate programme to store a blueprint of how to assemble it. In addition, it would need a source of energy, transport systems to capture missing parts from the environment and lubricants to allow the movement of components. The experiences gained from designing a self‐reproducing printer provide several interesting lessons for the overall architecture of biological systems and the interactions between the parts. The take‐home message is that engineering biological systems involves much more than cutting and pasting DNA sequences of more or less characterized parts, even if one can build on a logical blueprint.

Every descriptive language, including those that are used to describe technical or scientific systems, is ultimately metaphorical; it carries a meaning and has an agenda (Danchin, 2003). Although molecular biologists often believe that their abstractions and representations, many of which are taken from physics, are the ultimate means to represent biological phenomena, their language might not be sufficient to fulfil the strong engineering agenda of synthetic biology. A robust language to describe engineered biological entities is needed, but it must also be based on sound biology. Simply renaming long‐standing concepts such as transcription or translation rates by using equivalent terms to echo signal transmission in electronic circuits might give a misleading perception. For example, several research groups in the USA have adopted the term polymerase per second (PoPS) to quantify the input/output signals in genetic circuits. PoPS describes the flow of RNA polymerase molecules along DNA (the current for gene expression), and the PoPS level is the number of molecules that pass through a specific position on the DNA per second. Similarly, ribosome per second (RIPS) refers to the flow of the translation machinery through messenger RNA (mRNA). There is little biology in these definitions; rather, they represent a straight and overly simplistic projection of electrical engineering concepts into supposedly biological counterparts.
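Taken at face value, the PoPS definition quoted above is a plain counting rate. The sketch below shows only that arithmetic; the function name `pops`, its parameters and the list of passage times are hypothetical, and the sketch assumes that the moments at which polymerases cross a given position could be recorded, which the text does not claim.

```python
def pops(passage_times, t_start, t_end):
    """Polymerases per second past one DNA position during [t_start, t_end).

    `passage_times` holds the times (in seconds) at which an RNA polymerase
    crossed the position of interest. All values here are illustrative.
    """
    if t_end <= t_start:
        raise ValueError("t_end must be greater than t_start")
    crossings = sum(1 for t in passage_times if t_start <= t < t_end)
    return crossings / (t_end - t_start)


# Hypothetical passage times, in seconds, at a single position.
example_times = [0.5, 1.2, 3.9, 4.0, 7.5]
print(pops(example_times, 0.0, 5.0))
```

The same counting would apply to RIPS, with ribosome passages along an mRNA in place of polymerase passages along DNA. The result is a single number that carries no information about the host, the medium or any other biological context.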

This specific issue deserves some thought, as the challenge of describing and standardizing autonomous biological parts is not just academic. To achieve the engineering goals of synthetic biology, we need to adopt a consensus on robust ‘engineerable’ elements, comparable to the metric standards of the International Organization for Standardization (ISO) that are now universally accepted. In this context, we need to start with a quantitative standardization of the signal transmission between these parts, such as the transcriptional activity of distinct promoters in vivo and their quantification in universal units. However, each scientist seems to have a favourite way of measuring such a value, using all types of reporter genes or DNA chips and a plethora of miscellaneous hosts, gene doses, media and temperatures. These idiosyncratic measurements must be replaced by unequivocal promoter‐strength units that engineers can use to calculate their circuits. This discussion must involve not only PoPS enthusiasts and synthetic biologists, but also experts in the fundamental aspects of transcription with all its intricacies.

The definition of transcription units and many other types of biological functions might eventually be subject to some governance in order to establish benchmarks. There are already discussions about the promotion of a European Institute of Biological Standards as a counterpart to the MIT‐run initiatives mentioned above. Yet, even if we have a set of standardized parts and functionalities, we might still lack the knowledge needed to rewire them, which is akin to writing a book with a well‐defined vocabulary but lacking the grammar. One possible solution, and the only one available so far, is to use extant or synthetic genomes as a sort of ‘grey box’ module to form a biological chassis in which to implant characterized and predictable circuits (Gibson et al, 2008).

Eventually, we should be able to build whole biological systems from first principles. One interesting opportunity to achieve this might be offered by the natural mobile regulatory circuits that are present in integrons, phages, transposons and broad host‐range plasmids, which are evolutionarily selected for not depending on the biological context of the recipient (Frost et al, 2005; Mazel, 2006). This context‐free behaviour is called orthogonality in synthetic biology jargon, to echo equivalent properties in computing science. A typically natural orthogonal part is the T7 phage polymerase, which is able to transcribe genes under the T7 promoter sequence in most hosts. One example of artificial orthogonal systems is given by ribosome–mRNA pairs that can process information in parallel with, but independent of, their wild‐type counterparts (Wang et al, 2007). Using naturally occurring orthogonal systems and designing artificial context‐independent biological functions could improve the robustness of artificial genetic circuits to a point where they match the performance of electronic circuits; however, we are not there yet.

One intriguing question at the core of understanding and eventually refactoring living systems is the link between gene expression and growth. The only way to inhibit cell growth is by subjecting cells to nutrient limitation, antibiotics or other stresses. The problem is how to maintain active cells without any associated growth. There have been attempts to create artificial vesicles that contain all the metabolic components of a cell but lack DNA (Noireaux & Libchaber, 2004). However, this might not be the ultimate solution because proteins age extremely fast (Fredriksson & Nystrom, 2006), sometimes within minutes, as their aspartate or asparagine residues isomerize (Shimizu et al, 2005). Some repair and turnover mechanisms are therefore needed for lasting performance. Perhaps we can learn some lessons from bacteria that manage naturally to be metabolically vigorous without much growth.

There is also the question of noise. Experiments have shown that one can construct cells with logical behaviours, but the stability of the circuit is always limited (Elowitz & Leibler, 2000). Noise and accumulating mutations are still formidable problems for even the simplest of engineered biological systems (Silva‐Rocha & de Lorenzo, 2008). Every synthetic circuit that is engineered to behave in a particular way seems to decay rapidly after a relatively short period. By contrast, existing gene‐expression programmes in nature allow signals to propagate faithfully through regulatory networks. This course can be affected by stochastic fluctuations such as variation in the pool of housekeeping proteins, typically RNA polymerase, if some of the elements of these circuits are present at low number, or by changes in environmental conditions (de Lorenzo & Perez‐Martin, 1996; Pedraza & van Oudenaarden, 2005; Qian, 2006). Yet, the intriguing question remains as to how individual cells keep regulatory noise within tolerable limits, as noise is intrinsically bound to molecular events. Although cells do gain from random fluctuations—mutation and evolution or induced amplification of signals—noise might destroy any biological circuit. Yet, the gene‐expression behaviour of cells seems robust, implying that bacteria are able to filter noise to avoid regulatory and metabolic chaos. What such filters are made of and how they work still need much clarification. A related question is how stochastic phenomena in single cells translate into population behaviour. More computational and experimental tools are needed to address this crucial issue.
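Why low copy numbers matter for noise can be illustrated with a simple calculation. The sketch assumes, purely for illustration, that the number of molecules of a component per cell follows a Poisson distribution; this assumption is not made in the text, and real gene‐expression noise often departs from it. Under that assumption the variance equals the mean, so the relative fluctuation (coefficient of variation) is one over the square root of the mean copy number. The function name `poisson_cv` is hypothetical.

```python
import math


def poisson_cv(mean_copies):
    """Coefficient of variation of a Poisson-distributed copy number.

    For a Poisson distribution the variance equals the mean, so
    CV = sqrt(mean) / mean = 1 / sqrt(mean).
    """
    if mean_copies <= 0:
        raise ValueError("mean_copies must be positive")
    return 1.0 / math.sqrt(mean_copies)


# Relative fluctuations shrink as the mean number of molecules grows.
for copies in (10, 100, 1000):
    print(copies, poisson_cv(copies))
```

On this simplified picture, a component present at about ten copies per cell fluctuates by roughly a third of its mean, whereas one present at a thousand copies fluctuates by only a few per cent.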

Biological entities are not only prone to become interdependent, but also evolve in unpredictable ways as they are subjected to the cycle of mutation/amplification/selection that is intrinsic to evolution. Extra DNA implanted into a cell, and the proteins it encodes, are severely counter‐selected over time if they cause any burden to cell physiology. This is hinted at by the long period of time that horizontally transferred genes take to develop regulatory interactions (Lercher & Pal, 2007) and by the problems encountered when transferring genes with products that belong to multi‐protein complexes (Sorek et al, 2007). The practical downside of these biological phenomena is the difficulty involved in stably programming bacteria with genetic circuits or through heterologous expression of regulatory modules. Bacteriophages that had been redesigned to behave in a more logical way (Chan et al, 2005) made smaller lysis plaques than their wild‐type precursors and might eventually evolve to erase the human‐made parts.

We therefore need to explore how to avoid or decrease undesired evolution. One possibility might be to use endogenous DNA‐repair systems to keep the fidelity of the instructions encoded in the implanted DNA. One can also think of engineering minimum interference within the host by means of orthogonal parts. Ultimately, it is a question of whether an alternative information‐coding molecule and the corresponding expression machinery can be made to be less amenable to mutation than DNA. One could think about the other extreme and create highly evolvable biological modules with a capacity to nest rapidly in a pre‐existing regulatory network (Silva‐Rocha & de Lorenzo, 2008), which is reminiscent of the programmes that install new software on the operating system of a computer.

Many synthetic biologists adopt the implicit or explicit metaphor of the cell as a complex mechanical machine, which requires relevant sub‐machines to organize itself, including scaffolds. How can we identify these components? A remarkable feature of the paleome is that these genes are systematically coded in the leading replication strand, which shows that there is strong selection pressure to avoid conflicts between transcription and replication (Fang et al, 2005; Rocha & Danchin, 2003). It is therefore important to compile a list of the corresponding objects, which, in engineering terms, would be sub‐machines. A general way to identify these complexes is to analyse groups of co‐evolving genes in the paleome, such as the genes that determine the construction of the ribosome, for instance. Another example would be the ‘transcription nano‐machine’ possibly coupled to the ribosome, the ‘replication nano‐machine’ or the ‘nano‐machinery’ that shapes the cell and organizes its division.

Comparative phylogenies of such ‘machines’ reveal unexpected features. For example, the phylogenetic tree of the bacterial mur–fts gene cluster, which encodes the core components of the cell‐division machinery, does not parallel that of the ribosome, but, rather, that of the shape of the bacteria (Tamames et al, 2001). Another essential component stands out from the analysis of persistent genes: RNA degradation is generally organized through the degradosome, which is a loosely organized structure that couples the degradation of RNA with energy recovery (Danchin, 2008). Numerous studies of protein–protein interactions indicate that many other, perhaps less identifiable, molecular machines with distinct functions exist. However, there are important components missing from such an approach. For example, proteins that exchange compounds with the outside medium are not readily distinguished in the paleome. To identify such membrane proteins, we need more robust approaches to analyse orthologies. Similarly, the biological equivalents of lubricants for the self‐reproducible 3D printer mentioned previously are largely unknown and we are not even certain of how we could reveal such functions.

It is worth noting that a large proportion of the Biobricks™ deposited in the MIT database are regulatory components for constructing logical gates and genetic circuits. However, regulation does not seem to be a core component of the essential or minimal system, either in a paleo‐cell or in a 3D printer, except within the black box that controls the reproduction of the apparatus. The emphasis on regulation and its hierarchical language is probably a human bias. Instead, we need to develop a structured language that describes the functions that create a cell in terms of architecture and dynamics. Various ontologies exist, but they are certainly not inspired by an engineering concept. While we come to some robust understanding of the transition between non‐life and life in biological systems (Rasmussen et al, 2003), we argue that distinguishing between the machine and the programme, and establishing a list of functions and the molecular machines that perform them—instead of mere DNA sequences—will be crucial for synthetic biology.

There is a third factor: metabolism. Although the genome provides a complete catalogue of genes, it is not yet possible to get a complete list of the metabolites of a cell by analysing its genome. However, metabolic transactions impose a chemical and energetic framework on the cell—a sort of inescapable background economy. Although the links between the transcription and translation of mRNA in the ribosome are well known, the organization of metabolism and its influence in controlling cell activities are much less clear. Allosteric regulation of enzymes by intermediate metabolites, which was an important topic of biochemical research in the 1960s and 1970s, was largely abandoned in favour of transcriptional regulation by protein factors and signal molecules. How metabolites interface with the protein machinery that controls genetic networks is largely unknown, but this topic is certainly relevant for engineering biological circuits.

Although our comprehension of cellular metabolism is constantly improving (Feist & Palsson, 2008), we still lack an understanding of the metabolic fluxes within the cell. One such example is the link between translation and uridine diphosphate (UDP) biosynthesis (Fig 2). In general, it is difficult to unravel the relative organization of individual molecular machines such as ribosomes, the cell envelope, the DNA polymer and the multiple mRNA threads within the cytoplasm that are all linked by metabolites. However, this should not deter us from pursuing the agenda of synthetic biology. Such problems are perhaps not so different from the challenge of engineering an airplane, in which hundreds of kilometres of cables, the circulation of kerosene, the maintenance of a correct atmosphere and temperature, control panels and devices, seats, lights and so on, must all be organized.

Figure 2.

Translation and uridine diphosphate biosynthesis. The gene cluster tsf–pyrH–frr, which encodes the translation elongation factor (EF‐T), uridylate kinase (UMK) and the ribosome‐recycling factor (RRF), is highly conserved in bacterial genomes. However, translation apparently does not use uridine triphosphate (UTP) in any of its known reactions or its regulation. In bacteria, UMK is found in close association with the bacterial envelope. How and why is this activity related to ribosome recycling? We expect that ribosomes have to recycle in the terminus of the last gene of every operon. This region is generally located downstream from a 3′‐region of the messenger RNA that forms a so‐called Rho‐independent stem and loop structure. These terminators are uracil‐rich and must therefore consume a considerable amount of UTP, yielding uridine diphosphate (UDP). One could conjecture that UTP regulates RRF and that the transcription of operons terminates at regions not far from the membrane. 30S and 50S subunits assemble at the 5′ end of the mRNA, which is pulled and translated through the ribosome. At the end of a cistron, the 70S ribosome can immediately begin to translate the next cistron, unless it encounters the formation of a Rho‐independent stem and loop structure, which is terminated by a poly(U)‐rich tail. The local synthesis of the poly(U) therefore depletes UTP bound to the RRF, which can bind to the 70S ribosome, promoting its dissociation into 30S and 50S subunits.

Therefore, we advocate the metaphor of the cell as an algorithmic machine, rather than a mechanical one, and the use of machine‐orientated engineering language to implement synthetic biology. Under this scheme, the roadmap to engineering biological systems is determined not by the biological parts but rather by how they interact. As is the case for the 3D printer, the relationships between the objects—and not necessarily the objects themselves—are crucial to any attempt to construct a synthetic cell with non‐natural properties. This is implicitly accepted by each suggestion to replace a given biological part—by using amino acids that differ from the 20 natural ones to construct proteins, for example. Synthetic biology would then stand for symplectic biology (from the Greek meaning to weave together), which would combine the efforts of systems biology with engineering biology.

As mentioned above, any serious synthetic biology has to be based on the premise that the programme can be separated from the machine, as can be shown by genome transplantation (Lartigue et al, 2007). However, we need to take into account the physico‐chemical scenario that constrains both the machine and the programme. Some have argued against the computer model by stating that in a biological machine, it is not possible to distinguish fully between the hardware and the software. However, the same holds true for real computers. For example, if a programme stored on a compact disc (CD) drives a computer, and the CD is deformed, then, despite the fact that the programme it carries is unaltered, it will no longer be usable by the computer. This does not alter the abstract laws that establish what a computer is, but it does tell us that in a real implementation of the Turing machine, one cannot completely separate the hardware and the software. In synthetic biology, we might face the same challenge when synthesizing an artificial chromosome, as it might fail to behave as expected.

These observations point to a major issue that has not been generally raised, although it has been discussed by engineers: even if we construct a synthetic cell, its functioning will make it age and wither (Nystrom, 2003, 2007). Again, a careful analysis of the paleome might help to solve this problem. Perusal of the most persistent genes shows that they are apparently dispensable for colony formation in the laboratory (Fang et al, 2005); most encode functions that are involved in maintenance and repair, and are therefore involved in the perpetuation of life rather than in allowing life per se. We believe that this is an essential feature of living organisms that needs to be taken into account when constructing synthetic cells.

Indeed, the prospect of making cells à la carte for industrial production calls for robust constructs that can easily be scaled up to large production volumes by cell divisions over many generations, without altering the properties of the cell and/or the decoupling of growth from catalytic performance. The separation of the paleome into two main functionalities is reminiscent of the necessary distinction between the perpetuation/construction and reproduction/replication of life (Dyson, 1985). Whereas the latter makes life possible but accumulates errors, the former can teach us how to program long‐lasting synthetic cells, which, in a more human‐oriented application of synthetic biology, could provide us with an ‘elixir of eternal youth’. In any case, we have just started to explore the exciting scientific and technological prospects of synthetic biology.

Antoine Danchin is at the Genetics of Bacterial Genomes, Institut Pasteur, Paris, France. E‐mail: antoine.danchin{at}


Much of the content of this article reflects discussions between the authors and their partners in the EMERGENCE and PROBACTYS consortia of the Seventh Framework Programme of the European Community. The authors are grateful to Sven Panke (Swiss Federal Institute of Technology, Zürich, Switzerland) and Vitor dos Santos (Helmholtz Centre for Infection Research, Braunschweig, Germany) for stimulating debates, and to Eric Fourmentin for pointing out the RepRap 3D printer effort.


Víctor de Lorenzo is at the National Centre of Biotechnology–Spanish Council for Scientific Research (CSIC), Campus de Cantoblanco, Madrid, Spain. E‐mail: vdlorenzo{at}