Dick Pountain /Idealog 370/ 05 May 2025 01:31
I’ve written many, many sceptical words about AI in this column over the years, railing against overconfidence and hype, the hubristic pursuit of AGI, deepfakery and content pillage, but I nevertheless believe that AI – once we’ve civilised it – is going to be hugely important to science, economics, robotics, control systems, transport and everyday life itself. Given the political will, misinformation, invasion of privacy and theft of artistic data could all be regulated away, but one colossal stumbling block would remain: energy consumption.
When AI corporations consider purchasing mothballed nuclear reactors to power their compute-servers, the absurdity of AI’s current direction ought to be visible to everyone. The current generation of GPT-based AI systems depends on supercomputers that can execute quintillions of simple tensor arithmetic operations per second, comparing and combining multiple layers of vast matrices holding encoded parameters. Currently all this grunt is supplied by the same CMOS semiconductor process technologies that gave us the personal computer, the smartphone and especially the computer game – the Nvidia chips that drive most AI servers are descendants of ones originally developed for rendering real-time 3D games. The latest state-of-the-art GPUs have a power density in watts/cm² around that of an electric cooking hob, and the power consumption of AI server farms scales as the square of the number of chips employed (order O(N²) in the jargon of complexity theory).
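To see where that square law comes from, here’s a back-of-envelope sketch in Python (illustrative only: it assumes naive all-to-all communication between accelerators, which real interconnect topologies work hard to mitigate):

    # Count the pairwise links in a naive all-to-all interconnect:
    # N chips need N*(N-1)/2 links, which grows as O(N^2).
    def all_to_all_links(n_chips: int) -> int:
        return n_chips * (n_chips - 1) // 2

    for n in (8, 64, 512, 4096):
        print(f"{n:>5} chips -> {all_to_all_links(n):>10,} pairwise links")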
In their 1980 bible of the CMOS revolution, ‘Introduction to VLSI Systems’, Mead and Conway devoted a final chapter to the thermodynamics of computation: we’ve long known that logic operations and memory accesses always consume energy, whether in silicon or in protein-and-salt-water like the human brain. However, the human brain has far, far more neurons and synapses than even the largest current AI server farms have GPUs, yet it consumes around 20 watts as opposed to AI’s 50+ megawatts. Understanding what’s responsible for this immense efficiency gap is crucial for creating a more sustainable next generation of AI, and the answer may lie in new architectures called ‘neuromorphic’ because they mimic biological neurons.
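Using the round figures above, the size of that gap is easy to check (a trivial Python calculation):

    import math

    brain_watts = 20      # round figure for the human brain
    farm_watts = 50e6     # round figure for a large AI server farm
    ratio = farm_watts / brain_watts
    # prints: gap: 2.5e+06x (~6.4 orders of magnitude)
    print(f"gap: {ratio:.1e}x (~{math.log10(ratio):.1f} orders of magnitude)")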
Individual metal-oxide-semiconductor transistors aren’t six orders of magnitude more power-hungry than biological neurons, so other factors must be responsible for the huge difference. One factor is that biological neurons are analog rather than digital; another is that they act upon data in the same place they store it. In contrast, the CMOS GPUs in AI servers are examples of von Neumann architecture, with processing logic separated from the memory that holds both program and data. Yet the MOSFET transistors they’re made from are inherently analog, operated by varying voltages and currents, so the digital data they manipulate gets continuously converted back and forth between the analog and digital domains, at great energy cost.
Neuromorphic AI hardware designers try to bring data and processing closer together. Intel’s Loihi 2 research chip has 128 neuromorphic cores and 33MB of on-chip SRAM, and communicates via trains of asynchronous voltage ‘spikes’ like those in biological neurons. Steve Furber (of ARM fame) works at Manchester University on a neuromorphic system called SpiNNaker that has tens of thousands of nodes, each with 18 ARM cores and local memory, also using spike-based communication. These schemes do reduce data access overhead, but they remain digital devices, and approaching biological levels of energy economy will require a still more radical step into purely analog computation that exploits the physics of the chip material itself.
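To give a flavour of spike-based signalling, here’s a minimal leaky integrate-and-fire neuron in Python – the constants are arbitrary, and it’s far simpler than the neuron models Loihi 2 or SpiNNaker actually run:

    import random

    # Leaky integrate-and-fire: the membrane potential decays ('leaks')
    # each step, accumulates noisy input, and when it crosses threshold
    # the neuron emits a spike and resets -- information travels as
    # spike timing, not as continuously streamed numbers.
    leak, threshold, v = 0.9, 1.0, 0.0
    spikes = []
    for t in range(50):
        v = leak * v + 0.3 * random.random()   # decay plus noisy input
        if v >= threshold:
            spikes.append(t)                   # fire...
            v = 0.0                            # ...and reset
    print("spike times:", spikes)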
The US firm Mythic’s AMP (Analog Matrix Processor) chip employs a 2D grid of tunable resistors whose values encode the weights of an AI model, then relies on Kirchhoff’s laws to multiply-and-add the analog input voltages, in effect performing convolutions. However, AMP is still fabricated in CMOS. A more radical next step would be to implement this resistive analog computation using low-power ‘spintronic’ memristors – devices in which the orientation of magnetic spins represents bits, as in modern hard disks. One way to implement non-volatile memristors is via FTJs (Ferroelectric Tunnel Junctions), formed by sandwiching a nano-thin ferroelectric insulator between two electrodes, which can be fabricated using existing semiconductor processing. These devices can be written and switched cumulatively like real neurons, and read out non-destructively using very little power.
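The crossbar principle is simple enough to sketch as a toy numpy model – the conductances and voltages below are invented, and real hardware must cope with noise, drift and ADC quantisation that this ignores:

    import numpy as np

    # Toy analog crossbar: weights are stored as conductances G (siemens)
    # and inputs applied as voltages V. Ohm's law gives each cell's
    # current G[i,j]*V[j]; Kirchhoff's current law sums the currents on
    # each output line -- a multiply-accumulate performed by the physics.
    G = np.array([[1.0, 0.5, 0.2],    # invented conductance values
                  [0.3, 0.8, 0.1]])
    V = np.array([0.2, 0.5, 0.9])     # invented input voltages
    I = G @ V                         # output currents = matrix-vector product
    print("output currents (A):", I)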
The Dutch physicist Johan Mentink used a recent Royal Institution lecture (https://youtu.be/VTKcsNrqdqA?si=ZRdxeyP4B-hfUw3X) to announce neuromorphic computing experiments in the Netherlands that employ two-dimensional cross-bar grids of memristors, organised into a network of ‘Stochastic Ising Machines’ that propagates waves of asynchronous random noise whose interference yields the spike trains that transmit information. The Dutch researchers claim such devices could scale linearly with the number of synaptic connections, reducing power consumption by factors of thousands. I love the idea of working with rather than against noise, which certainly feels like what our brains might be doing…
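The flavour of computing with noise can be caught in a toy stochastic Ising machine – a simple Gibbs-sampling sketch in Python with invented couplings, and none of the actual memristor physics:

    import math, random

    # Three coupled spins: each flip is random, but biased by the spin's
    # neighbours through the coupling weights J. The noise does the
    # exploring; the couplings encode the problem; the network settles
    # into low-energy (mutually consistent) states.
    J = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}   # invented couplings
    s = [random.choice([-1, 1]) for _ in range(3)]
    for step in range(1000):
        i = random.randrange(3)
        field = sum(w * s[b if a == i else a]
                    for (a, b), w in J.items() if i in (a, b))
        # flip probability follows a sigmoid of the local field
        p_up = 1.0 / (1.0 + math.exp(-2.0 * field))
        s[i] = 1 if random.random() < p_up else -1
    print("final spins:", s)   # typically all aligned: a ground state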
[Dick Pountain’s neurons are more spiky than most]