Tuesday, 3 July 2012

PLAYING AT PARALLEL

Dick Pountain/08 December 2009 12:05/Idealog 185

Back in issue 180 I wrote about the renaissance of parallel processing, and what new problems it would soon start to present to programmers. As I write now, two announcements from Intel really rub in the point. First Intel announced a new experimental CPU called the Single Chip Cloud Computer (SCC) which contains 48 Intel Architecture cores on the same die, all capable of talking to each other via a software-configurable message passing scheme in on-chip memory. Intel had already demoed an 80-core chip called Polaris for its Terascale Computing program, but those were only floating-point units while the SCC's are full Pentium-compatible x86 cores (older Pentium, without all the sophisticated out-of-order execution of today's chips). But that same week Intel announced it's to dump its highly-parallel graphics processor chip Larrabee as a commercial product. A version of Larrabee with 32 x86 cores was demoed last year, but when axing it Intel admitted that both hardware and software are behind schedule and uncompetitive in performance.

It's a pretty safe bet that Larrabee's disappointing performance was down to the inherent difficulty of programming general-purpose, message-passing parallel computers - as described in my earlier column - compared to dedicated, pipelined vector GPUs from Nvidia and AMD. That earlier column jogged my memory so I went back over some of my Byte columns from the late 1980s related to parallel processing, and stumbled across an old software project of mine called PriPar Machines, standing for Pri)mitive Par)arallel computers, a simulator written in Turbo Pascal for a system of communicating Turing Machines.

A Turing Machine is the simplest possible kind of computer, originally invented as a thought-experiment by the father of computer science Alan Turing. It consists of a print head that moves back and forth reading, writing or erasing marks along an infinitely long paper tape under the control of a stored program. Turing showed that given unlimited tape and time such a simple machine can compute anything a more complex architecture can achieve.

PriPar approximates a class of Turing Machines with two extra instructions to send and receive messages from one another (and obviously their tapes aren't infinite because they're represented in finite RAM). It's object-oriented so you create any number of PriPar machine objects, connect them together by message channels, give each one a stored program and some data on a tape, then set them all off running and messaging one another. It's a sort of Meccano or Lego kit for playing with the problems of parallel processing. I didn't give it a flashy graphical interface: you just view the output as a scrolling display of tape contents, strings of dashes like //////////////.

A PriPar machine has just two extra instructions, cribbed from the Occam language, "!" which means "send a message and proceed", and "?" which means "wait until you receive a message before proceeding". (Since each PriPar machine has just a single channel connected to one other machine, there's no ambiguity about who it's to). However this is a software simulation, so you can easily add extra channels, buffer messages, or change the meaning of instructions to see the effect on efficiency, integrity and programmability, and I did quite a lot of experiments of that sort.

That original 1990 Byte article attracted little interest: few people had a clue what I was on about since parallel processing was specialised stuff and Turing Machines on a par with mermaids and unicorns. But a lot of work went into it, and it's topical again, so I resolved to recycle it. No-one uses Turbo Pascal 5 any more, and it was pretty unsuitable even back then (I represented tapes as array constants of specified length and padded them with spaces) so I rewote it in Ruby, which to my surprise and delight took just a single Sunday afternoon. It's far, far more flexible because Ruby arrays are fully dynamic, like Lisp lists, so you can just add stuff without worrying about size. I've put the Ruby source for my PriPar system and some sample programs on my website www.dickpountain.co.uk, under Ruby Projects.

Programming a Turing Machine is pretty brain-mashing since you're working with strings of dashes not numbers and even the simplest arithmetic involves much thrashing about. PriPar machines are somewhat easier to program because you can delegate repeated operations to a second machine, and I've written a small library of higher-level routines in Turing Machine code, operations like Inject, Multiply and Tally which you combine to create parallel programs. Hopefully you'll be able to figure out the syntax from my example programs. In theory PriPar processes are self-synchronising but only up to a point, and for large programs the result may depend on the order of completion which is no good. I therefore added Barrier Synchronisation: subdivide longer programs and force all processes to synchronise at each barrier between sections, then shift the barriers around until you get the right answer!

I've made a lot of progress in PriPar programming, my best shot so far being the sum of terms of an arithmetical series, but a parallel factorial function eludes me (I cheat by pre-placing successive integers onto each individual tape). Perhaps you can do better?

[Dick Pountain is not afraid of being called a nerd but doesn't like being called a geek (something to do with Todd Browning and chickens' heads...]

No comments:

Post a Comment

CHINA SYNDROME

Dick Pountain /Idealog 357/ 08 Apr 2024 01:09 Unless you live permanently as an avatar in Second Life [does that even still exist?] then it ...