Wednesday 8 May 2013

GRAND COMPLICATION

Dick Pountain/PC Pro/Idealog 220 08/11/2012

Last night I watched the latest addition to the dismal genre of populist TV science programming (from which I exclude Horizon as it does attempt to get serious). A not-all-that-funny Irish comedian/gameshow host and a young journalist with a striking Bollywood coiffure were baiting one of Britain's most distinguished scientists, Sir John Sulston, because the human genome project has not yet delivered cures for cancer and the common cold, despite spending so much of "our taxpayers' money". Sir John grinned weakly and bore it, even admitting that scientists sometimes play on the ignorance of politicians to get funding, but his main rebuttal was that the way genes work is far, far more complicated than either the public or even geneticists yet understand. This "backlash" thinking - reminiscent of the way pop stars get built up and knocked down again - arises because we perhaps imagined the genome as a recipe book, where all you have to do is read off a recipe and cook it.

Our bodies contain several separate but cooperating information processing systems - the nervous (including the brain), immune, muscular, metabolic and skeletal systems, plus the DNA itself - which form a complex heterogeneous network, talking to each other via nerves, hormones and other chemical signals. Recently the ENCODE (ENCyclopedia Of DNA Elements) project has spotlighted just *how* complex: the sequence of base pairs in DNA encodes only a small fraction of the information required to run our bodies, and the huge stretches of what used to be called "junk DNA" are actually switches that modify the "run-time" course of the computations. We are mostly built of proteins, and protein-based enzymes control almost all of our cellular chemistry. Genes are templates from which these proteins get fabricated, and though every cell in your body contains a full copy of your genome, most of its genes are turned off: otherwise every cell would be churning out every possible protein all of the time and you'd be a large (and very short-lived) sticky blob. Selectively turning genes on and off controls the activities of individual cells, which in turn determines how our bodies grow, survive and act upon the world.

The decade-long ENCODE project (funded by the National Human Genome Research Institute) identified the regions of the human genome where such controls operate, and in September 2012 published 30 seminal papers that assign functions to 80% of the genome. As this is a PC mag not a biochemistry journal I won't dwell on their details, beyond saying that control gets exerted mostly through big proteins called histones sticking to DNA sequences to mask them from being expressed, or by methyl groups being added as stoppers to certain bases. (The journal Nature has a brilliant interactive widget at http://www.nature.com/encode/#/threads/ if you do want to know more.) The result is that genetics just became orders of magnitude more complex, which makes our impatience with the rate of medical spin-offs tragically misplaced.

We might once have pictured the genome as a computer program, which somehow "executed" its genes to build our bodies and make us do stuff. Now we know better: it's more like a database of blueprints for computer components, rather than program instructions. So where are the executable instructions? Well, they're proteins operating within particular cellular environments. But those proteins are still made by DNA? Yes they are, and where and when they get made depends on all those gene switches that ENCODE describes. Some of these instructions in turn control the way the DNA gets transcribed, so it's a dynamic, recursive, self-modifying program whose behaviour is generated on-the-fly rather than recorded in the DNA sequence (which is mostly static data, except occasionally when a mutation occurs or a virus inserts its own code).
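For the programmers among you, here is a deliberately crude sketch of that idea in Python. It is a toy, not biology: the three genes, the proteins they make and the names GENOME, step and run are all invented for illustration. The genome is a static table of blueprints, the switches are separate mutable state, and some proteins flip other genes' switches, so what the "program" does depends on its running state rather than on the table alone.

```python
# Toy model only: the gene names, proteins and switch effects are hypothetical.
# GENOME is static data: gene -> (protein it encodes, optional switch effect).
GENOME = {
    "geneA": ("proteinA", ("geneB", True)),   # proteinA turns geneB on
    "geneB": ("proteinB", ("geneA", False)),  # proteinB turns geneA off
    "geneC": ("proteinC", None),              # flips no switches at all
}


def step(switches):
    """Express every gene that is switched on, then apply the switch
    changes caused by the proteins just made.
    Returns (proteins_made, new_switches)."""
    active = [gene for gene in sorted(switches) if switches[gene]]
    proteins = [GENOME[gene][0] for gene in active]
    new_switches = dict(switches)
    for gene in active:
        effect = GENOME[gene][1]
        if effect is not None:
            target, state = effect
            new_switches[target] = state
    return proteins, new_switches


def run(switches, steps):
    """Run the toy cell for a number of steps, recording proteins per step."""
    history = []
    for _ in range(steps):
        proteins, switches = step(switches)
        history.append(proteins)
    return history


# Same GENOME, different starting switches, different behaviour.
print(run({"geneA": True, "geneB": False, "geneC": False}, 3))
print(run({"geneA": False, "geneB": False, "geneC": True}, 3))
```

Nothing in GENOME changes between the two runs. Only the switch settings differ, and in the first run they are themselves rewritten by the proteins the run produces.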

Anyone who's written computer programs in anger will know that self-modifying programs are best avoided. Sure, when you're a cocky newbie it feels clever to write self-modifying code (for one thing it can be a tremendous memory saver) but you'll discover that it soon becomes impossible to debug or understand. Reading the source code of the program no longer tells you what it does; only executing it can do that. Microsoft once flirted with self-modifying code in early versions of Windows, for selecting different hardware options, but it now deprecates the practice and options are set by reading in external config files. So how does nature manage the dynamic, self-modifying computation system that is a living organism? The unsatisfying answer is the same as always: through 3.5 billion years of evolution rather than by studying algorithms. Snipping and inserting genes to cure some disease is not at all like editing program code - we've already seen one genetic medicine project halted because it gave the test subjects leukemia. When debugging a genetic program, execution can literally mean execution.
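If you've never been bitten by self-modifying code, a very tame Python analogue gives the flavour. This is only a sketch: the Program class and its method names are made up, and rebinding a method at run time is a far gentler trick than patching machine instructions in memory, but the debugging problem is the same in miniature.

```python
class Program:
    """A hypothetical object that rewrites its own entry point when run."""

    def __init__(self):
        # `run` initially points at the first behaviour.
        self.run = self._first

    def _first(self):
        # Executing this code replaces the code that the next call will run.
        self.run = self._second
        return "hello"

    def _second(self):
        return "goodbye"


p = Program()
print(p.run())  # first call uses _first
print(p.run())  # later calls use _second, because the first call rebound `run`
```

Looking at the call p.run() in isolation, you cannot say which behaviour you will get; that depends on the history of earlier executions.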
