Biomolecular computing, or ‘computations conducted by biomolecules,’ is both theoretically and technologically challenging established ways to computation. The study of natural and artificial molecular computations is adding to our understanding of biology and computer science well beyond the framework of neuroscience. It is often placed within the broader context of ‘natural’ or even ‘unconventional’ computing. The articles in this special theme represent only a small part of Europe’s growing commitment in this massive endeavour. In this introduction, I’d like to lay out the current state of the field and lay out some of the core arguments for why biomolecular computation is important to both computer science and biology.
Abstract
The key disciplines in the design and implementation of biological computing devices are biomolecular computation and synthetic biology. This article looks at some of the most important works on logical devices that analyse biological data. There are two primary methods to their design and construction. The first method creates computing devices based on nucleic acid characteristics, while the second method focuses on genetic regulatory networks. DNA self-assembly, DNA automata based on restriction enzymes or deoxyribozymes, and logic circuits based on DNA strand displacement are examples of nucleic acid-based approaches. NOT, AND, and OR logic gates, a genetic toggle switch that acts as a biological memory unit, and numerous genetic oscillators that act as biological clocks are all examples of how genetic networks are used.
DNA Computing has a fundamentally distinct advantage over all other methods of computation, including quantum computing, as well as justifications for not dismissing it as restricted to exhaustive search.
Introduction
Biomolecular computers solve challenges connected with the evolution of traditional, silicon-based computers, particularly their shrinking, as implied by the Heisenberg uncertainty principle, and data transmission constraints by the central processing unit to and from the main memory (Amos, 2005). Adleman (1994) made the first attempt to create a DNA computer by solving basic computational problems in a laboratory test tube. Several reports on DNA computing surfaced throughout the next two decades. Some research has concentrated on well-known issues in mathematics and computer science, such as the tic-tac-toe algorithm (Stojanovic and Stefanovic, 2003), the Knight problem (Faulhammer et al., 1999), or the SAT problem (Stojanovic and Stefanovic, 2003). (Lipton, 1995).
Other areas of research, such as cancer therapy (Benenson et al., 2004) or’miRNA’ level diagnostics, have attempted to employ DNA computing in medicine (Seelig et al., 2006). The creation of biomolecular solutions for well-known models in theoretical computer science, such as finite automata, pushdown automata, or Turing machines, has been an interesting trend in DNA computing. Although some of this research only provided theoretical solutions without practical laboratory implementation, such as biomolecular representations of the Turing machine (Rothemund, 1995) or the pushdown automaton (Cavaliere et al., 2005; Krasinski et al., 2012), there were notable exceptions, such as a stochastic automaton (Adar et al., 2004) and a finite automaton (Adar et al., 2012). (Benenson et al., 2001, 2003).
The hardware in these early DNA computers was a restriction enzyme (RE), while the software and input/output signals were DNA fragments. The DNA computer functions biochemically by cutting and connecting DNA molecules in a sequential manner using RE FokI and DNA ligase. These DNA computers are nondeterministic finite automata, which are devices that can solve simple computational tasks. Benenson et al. (2001) created and developed a model of a nondeterministic two-state two-symbol (Figure 1A) finite state automaton — the most basic computer model (Hopcroft et al., 2001).
The concept of molecular systems performing calculations isn’t new, and it was more natural in the pre-transistor era. Most computer scientists are familiar with von Neumann’s late 1940s talks of self-reproducing automata, some of which were couched in molecular terms. The main issue here was bootstrapping: can a computer build a machine that is more complicated than itself?
The idea that a device’s computations might affect the device itself was important, but it seemed less natural in today’s age of hardware-software dichotomy. At the scale of molecular reactions, this concept is natural, while it may appear utopian to people in charge of large chip manufacturing plants. In his research on self-structuring in molecular and biological systems, Alan Turing looked beyond pure symbolic processing to natural bootstrapping mechanisms. Ross and Hjelmfelt have proposed pure chemical computers as an extension of this approach. Starting with the unravelling of the genetic code and translation machinery, the concept of molecular information processing spread to genetic regulation, cellular signalling, protein trafficking, morphogenesis, and evolution in biology, all of which occurred independently of advances in neuroscience.
For example, I formed the first multidisciplinary Department of Molecular Information Processing in 1992 because of the important significance of information processing in evolution and the possibility to address these concerns on laboratory time scales at the molecular level. Adleman’s fundamental experiment, published in 1994, demonstrated that laboratory molecular biology methods could be used to programme calculations with DNA in vitro. DNA’s enormous information storage capacity and low energy dissipation have sparked a surge in interest in massively parallel DNA computing. For genuine proponents of the discipline, however, brute search using DNA was never a viable solution to the problem of infinitely increasing the number of potential solutions.
As many contributions and patents have demonstrated, DNA Computing is not restricted to basic algorithms or, as we argue here, to a specific hardware architecture.
DNA COMPUTING
Universal computation and complexity results for DNA Computing exploded after 1994. (recent examples of ongoing projects here are reported in this collection by Rozenberg, and Csuhaj-Varju). The laboratory processes for modifying DNA populations were codified, and new sets of primitive operations were proposed: the link between recombination and so-called splicing systems was particularly intriguing, since it reinforced the concept of evolution as a computational process.
DNA Computing can now be divided into three categories: intramolecular, intermolecular, and supramolecular. DNA Computing approaches can be classified as homogeneous (well stirred) or spatially structured (multi-compartment or membrane systems, cellular DNA computing, and dataflow like architectures using microstructured flow systems), as well as in vitro (purely chemical) or in vivo (both chemical and biological) (ie inside cellular life forms). The emphasis is on achieving new basic operations, novel architectures, error tolerance, evolvability, or scalability, and approaches differ in terms of programmability, automation, generality, and parallelism (e.g. SIMD versus MIMD).
The intramolecular DNA Computing project, led by Hagiya, aims to build programmable state machines in single DNA molecules that work via intramolecular conformational changes. The main kind of intermolecular DNA computing, of which Adleman’s work is an example, focuses on the hybridization of distinct DNA molecules as a basic stage of calculations, and this is common to the three studies mentioned here that have an experimental component (McCaskill, Rozenberg and Amos). Beyond Europe, the University of Wisconsin is a leader in employing DNA Chips to provide a surface-based method to intermolecular DNA computing. Finally, Eric Winfree pioneered supramolecular DNA Computing, which uses the self-assembly of stiff DNA molecules with varied sequences to do computations. The link between nanomachines and nanosystems is so obvious, and will likely become more widespread in the near future.
The breakthrough expands on the researchers’ previous demonstration of a biomolecular computer, which is made up of a two-state system made up of biological components (DNA and a restriction enzyme). The computer, which works in vitro, begins with the state Yes. The computer checks one illness indication at a time throughout each computation step. If all of the disease indicators are present, the computation concludes in the Yes state, indicating a positive diagnosis; if at least one disease sign is not identified, the computation ends in the No state, indicating a negative diagnosis.
Shapiro’s team previously shown that using mRNA expression levels and mutations, this biomolecular computer could detect illness indications. The researchers have improved the computer’s ability to recognise illness markers from miRNAs, proteins, and tiny molecules like ATP in the current study. Simultaneously, the computer’s detection approach is more straightforward than before, involving fewer components and interactions with disease signs.
As the researchers explain, seeing a combination of illness indicators is far more valuable than sensing just one since it provides for more accuracy and sensitivity to disease changes.
They point out that in the case of thyroid cancer, the presence of both the protein thyroglobulin and the hormone calcitonin can lead to a far more accurate diagnosis than if only one of these disease signs is found.
Although the capacity to identify many illness signs is a significant step toward in vivo bimolecular computers and programmable medications, there are still many challenges ahead for researchers.
Biomolecular finite state machine
Benenson et al. (2001) suggested a biomolecular finite state machine that used molecules and DNA processing proteins to execute the above architecture of computing. One restriction enzyme (FokI), DNA oligonucleotides as transition molecules, input signals, and T4 DNA ligase were used in the laboratory implementation of this DNA-based computer. FokI, a restriction enzyme, recognised the GGATG sequence (all DNA sequences are presented in the 5′ 3′ orientation unless otherwise noted) and cut double-stranded DNA asymmetrically. Double-stranded DNA molecules of six base pairs in length encoded the automaton’s two symbols (a and b) plus the terminator t that signifies the end of the word.
Each of the input molecules featured a FokI recognition site and represented the input word a and b. They also had flanking sequences for binding the enzyme and detecting the computation’s end state. FokI’s single-stranded overhangs in the input molecule indicated not only a symbol, but also a machine state.
DNA transition molecules, which contain the FokI recognition sequence, spacers, and sticky ends of a length characteristic for FokI, were used to write the software (transition rules). Each transition molecule was made up of four parts: DNA sequences made up of nucleotides labelled p 1, p 2, p 3, and p 4. Single-stranded DNA made up component p 1 of a transition molecule, while double-stranded DNA made up portions p 2, p 3, and p 4. Each portion of the transition molecule was encoded by a specific nucleotide sequence with a specific length: k for p 1, l for p 2, m for p 3, and n for p 4.
In a biomolecular finite automaton, the initial part of a transition molecule, p 1 (a sticky end), was complementary to the single-stranded section of an input molecule and denoted a pair state, symbol >. The second portion, p 2 (a spacer part), allowed the cutting depth into the input molecule to be controlled. The third portion, p 3 (a restriction site), contained the nucleotide sequence specific for a specific endonuclease and allowed this restriction enzyme to function. Because restriction enzymes cut long DNA molecules better, the last component, p 4 (an extra part), assisted ligation of the restriction enzyme to the entire DNA molecule. The restriction enzyme cleavage site was missing from parts p 1, p 2, and p 4.
Materials and Procedures
Artificial DNA
Genomed created synthetic DNA sense () and antisense () oligonucleotides (200 nmol, lyophilized) (Warsaw, Poland). The oligonucleotides were utilised to create appropriate-length double-stranded DNA molecules with sticky ends for software input and output. The section ‘Construction of DNA computer elements’ below shows how to make a DNA molecule that represents the input word abba using sense () and antisense () oligonucleotides.
The oligonucleotide sequences for construction of the input molecules (input word abba) were abba(α): 5′-AATTCTAACGCGACTAATCAGCATCAGCCGACTATATTAGTTGTCATCGC-3′, and abba(β): 5′-GGCCGCGATGACAACTAATATAGTCGGCTGATGCTGATTAGTCGCGTTAG-3′. The oligonucleotide sequences for the detection molecule were: detect(α): 5′-AATTCGTTTATTAGTTGTCATCGC-3′ and detect(β): 5′-GGCCGCGATG-ACAACTAATAAACG-3′. The oligonucleotide sequences for the software (transition molecules) were: T122(α): 5′-AATTACTACTGTA CCCTAGTTATTAGTTGTCATCGC-3′
By annealing pairs of oligonucleotides, the transition molecules, input molecule, and detection molecule were created: abba() and abba(), detect() and detect(), T122() and T122(), T162() and T162(), T107() and T107(), and T24() and T24(). Additional sticky ends (AATT and GGCC) on annealed pairs of oligonucleotides allowed DNA molecules to be inserted into LITMUS 38i plasmids (see ‘Construction of DNA computer elements’).
Enzymes
New England Biolabs provided the restriction enzymes AcuI, BaeI, BbvI, MboII, BtgzI, and T4 DNA ligase (Ipswich, MA, USA). Fermentas Thermo Scientific provided T4 polynucleotide kinase (PNK) (Grand Island, NY, USA).
Plasmid vectors and chemicals
Fermentas Thermo Scientific provided LITMUS 38i plasmids. Axygen provided the plasmid miniprep and gel extraction kits (Union City, CA, USA). EurX provided the Perfect 100 bp DNA ladder (Gdansk, Poland). There were 13 bands on this ladder, with fragment sizes of 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1500, 2000, and 2500 bp.
The 500 bp and 1000 bp bands are brighter than the other bands on the ladder for easy reference. Sigma-Aldrich provided all other chemicals and bacterial media (St. Louis, MO, USA).
Conclusions and future prospects
The DNA computer built by Ehud Shapiro’s group at the Weizmann Institute of Science had a major flaw: its complexity. The number of states hindered their ability to scale up their DNA computer. As a result, we concentrated our efforts on developing a more complex DNA computer — a multistate finite automaton. Endonucleases like FokI (which has four sticky ends) make it possible to build a DNA computer with only three states. Although tumours are commonly caused by many more genes (often > 5), this is a sufficient size for investigation of the five genes of small-cell lung cancer (Benenson et al., 2004). Our biomolecular computer with numerous restriction enzymes could be valuable in this scenario for analysing malignancies caused by multiple genes.
The findings of the laboratory implementation of a finite automata with multiple endonucleases were presented to demonstrate the practicality of our theoretical model in the wet lab (BbvI, AcuI, BaeI and MboII). These experiments focused on the autonomous and alternating action of numerous (four) endonucleases in one test tube, which is critical to automata action. BaeI, one of the endonucleases, cuts double-stranded DNA molecules both ways (to the left and right). Our findings show a new technique to use endonucleases that cut DNA molecules in both directions, enabling for more powerful computational devices like pushdown automata to be implemented.