Friday 15 September 2023

THE DUCK IN MY BATH

Dick Pountain /Idealog 344/ 05 Mar 2023 02:51

I was born and schooled among the coalfields of North East Derbyshire, but I no longer have much of a regional accent. I came to London as a student and have been here ever since, three-quarters of my life. I haven’t acquired a Norf Landan accent, but you could detect my vestigial Derbyshire one were I to say, for example, “there’s a duck in my bath”. I’ve written here before about my fascination with human speech, especially using computers to recognise and simulate it, but my interest runs deeper than that. 

As a writer, both spelling and pronunciation matter to me: pronunciation matters not because I do a lot of public speaking, which I don’t, but because I read every line back ‘aloud in my head’ to see whether it works or not. Computers have certainly made determining pronunciation, particularly of words in ‘foreign’ languages, a lot easier, but it’s still not as easy as it could and should be. Enlightened sources like Wikipedia and the Oxford Dictionary do exploit the capacity of a computer to speak to you, but it’s not yet universally and transparently implemented at operating system level. 

Probably the route most people take to discover the pronunciation of a word is to Google it, which almost inevitably leads to one of thousands of (not always reliable) YouTube videos in which the word is spoken. I still occasionally have to resort to this, the upside being that I occasionally stumble upon interesting videos about accent and pronunciation, like an excellent series by Dr Geoff Lindsey. His video about ‘weak forms’ (https://youtu.be/EaXYas58_kc) explains a great stumbling block for those new to spoken English: that certain words get skipped over almost inaudibly by native speakers. 

This interest in foreign spellings and pronunciation often pays off. While reading an article about the 17th-century Polish–Lithuanian Commonwealth (not so obscure as it sounds given current events in that region) the spelling of Polish place-names had me continually scuttling back and forth to Wikipedia, and pronunciation mattered too because some Polish letter forms (like ł) resemble ours but are pronounced quite differently. Wikipedia helped out by offering an IPA (International Phonetic Alphabet) transcription – for example Wrocław becomes vrɔt͡swaf – and clicking that let me hear it spoken, even though it did awkwardly happen in a separate pop-up window. 

Google Docs, in which I’m typing this column, can’t speak to me, but if I type Wrocław into Google Keep, select it and hit ‘Translate’ on the right-button menu, it gets sent to Google Translate where I can hear it spoken. This works too in Facebook, YouTube, and any other app that has ‘Translate’ on its menu. Google Translate can speak many, though not all, of its supported languages, but it doesn’t at present let you change voices, pitch or speed. Even so, if you’re handy with your thumbs and use a dictation facility you can make your mobile act as a Star Trek-style translator to converse with someone in another language (rather haltingly).

The more capable text-to-speech apps like Vocality and Text Reader allow you to change voice (male, female, US, UK and so on), pitch and speed, but reading in regional accents, something I’d like to do, is currently beyond any of them. The first speech synthesiser I ever used in the early 1990s performed hardware modelling of the human vocal tract, and came with a very simple scripting scheme that let you mark up text with ASCII tags to change the length of vowels or raise their pitch, but it never caught on. To do accents properly you’ll need to learn IPA, translate and edit your chosen words manually, then use an app that can pronounce IPA directly (which Google Translate can’t). IPA fonts are easily available, as are online services like ToPhonetics (https://tophonetics.com) that turn ASCII text into IPA and IPA Reader (https://ipa-reader.xyz) which can speak the resulting IPA. So, in principle I can alter texts to be spoken in regional accents, but it’s still a rather messy procedure split between several different apps.
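The manual editing step could, in principle, be scripted. What follows is only a rough sketch in Python, not anything the services above provide: the function name, the rule table and the starting transcription are all illustrative assumptions. It applies two crude substitutions to a roughly Southern English transcription, reflecting the way many Northern English accents use the same vowel in ‘duck’ as in ‘book’ and a short vowel in ‘bath’.

```python
# Hypothetical rule table: each pair maps a Southern English IPA sequence
# to a rough Northern English equivalent. Real accents need far more
# rules than this, and context-sensitive ones at that.
NORTHERN_RULES = [
    ("ʌ", "ʊ"),      # "duck": /dʌk/ becomes /dʊk/
    ("ɑːθ", "aθ"),   # "bath": /bɑːθ/ becomes /baθ/
]


def apply_accent(ipa: str, rules=NORTHERN_RULES) -> str:
    """Apply each (source, target) substitution to an IPA string, in order."""
    for source, target in rules:
        ipa = ipa.replace(source, target)
    return ipa


# Assumed starting transcription of "there's a duck in my bath".
southern = "ðɛəz ə dʌk ɪn maɪ bɑːθ"
print(apply_accent(southern))
```

Run as written this prints ðɛəz ə dʊk ɪn maɪ baθ, which could then be pasted by hand into whichever app speaks IPA. Plain string replacement is blind to context, so anything beyond a toy example would need rules that look at neighbouring sounds.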

That I might want to do it at all is in order to study isoglosses, boundaries between regions where different accents are spoken, an interest I share with one Ian McMillan from whose article (https://www.theguardian.com/commentisfree/2010/mar/21/language-derbyshire-barnsley-pronunciation-dialect) I learn that I was born right on the isogloss between South Yorkshire and North Derbyshire: “In Barnsley I call my house my house, but if I went to visit my cousins Ronald and Harry in north Derbyshire, they would meet me at the gate and invite me into their freshly wallpapered arse.” Whether or not there’s a duck in their bath seems rather irrelevant.

[Dick Pountain definitely lives in a ‘house’ in Norf Landan]


WOW, THAT’S SURREAL!

Dick Pountain /Idealog 343/ 06 Feb 2023 10:27


I freely confess that playing with the ‘Generative AI’ image service Stable Diffusion over the last few weeks has been enormous fun. And why wouldn’t it be, since I’ve been using my personal computers to create images for the last 30+ years and I’m a sucker for surrealism. You should now be sensing a ‘but’ coming. But perhaps I only enjoy this experience so much because I’m an amateur and dilettante. I don’t depend on selling images for my living, and despite various feeble attempts have sold very few – I make them purely for pleasure and publish them for free on social media and my own website. The fact that Generative AI apps confer the ability to create professional-grade, photorealistic graphics – even to those who lack any drawing skills at all – is not a threat to my livelihood.

I chose Stable Diffusion over more popular platforms like DALL-E mini, Midjourney, Deep Dream, WOMBO, Fotor and the rest (and there are lots of them) because it’s really free, doesn’t lure you into subscribing, and it’s very, very simple: you can only type in text descriptions and save such images as you like from its stream of results. That suits me just fine because I’m not intending to create a manga comic or an animated movie, and the restrictions are the whole point: I save the most outlandishly ‘wrong’ interpretations as instant surrealist pictures. I have briefly tried both Fotor and Midjourney, and came away traumatised. The latter is hosted on the schoolkid-oriented social network Discord, whose frantic user interface is the most baffling, frustrating and vaguely threatening I’ve seen since I dipped a toe into 4Chan back in 2010.

Were it only a matter of using Generative AI tools to forge Marvel-type comics or Picasso-type paintings (which they do rather well) then the people who need worry most are illustrators and animators who risk being put out of work by greedy publishers. But actually the rest of us are equally at risk from this ability to alter the appearance of reality itself so simply. Ever since the Trump presidency we’ve become overfamiliar with the concepts of ‘fake news’ and ‘deepfakes’; for many years students have been able to plagiarise documents for their essays, but now ChatGPT can even write them from scratch. Generative AI gives anyone the power to create events that never happened and objects that don’t exist with almost undetectable realism.

Wearing my political commentator’s hat I try to keep abreast of what the Far Right – currently the source of the most venomous misinformation – is up to. I feel obliged to sample the bizarre conspiracies and pseudo-sciences they conjure up, from anti-vaxxing and ‘black goo’, through graphene oxide, ivermectin and hydroxychloroquine, to Bill Gates’s injectable nanobots. Google these at your peril. These propagators of nonsense aren’t just a problem for the USA either. For example Vanessa Barbara has described in a recent New York Review of Books the way the Far Right used YouTube videos during the Brazilian election that only narrowly unseated Bolsonaro (whose antivax policies killed 700,000, almost as many as Trump): “People who trust vaccines are called aceitacionistas (a neologism to describe people who accept things without questioning). Those of us who received Covid shots are ‘hybrids’ who have been ‘zombified.’ […] Despite exhaustive efforts from fact-checking agencies and the WHO, these groups continue spreading old falsehoods claiming that Covid vaccines contain microchips, nanoparticles, graphene oxide, quantum dots, and parasites activated by electromagnetic impulses. According to them, vaccines can carry HIV (the virus that causes AIDS), make coins stick to our arms, and give us the ability to connect to Wi-Fi networks or pair with Bluetooth devices.” You couldn’t make it up, but Midjourney could, and illustrate it with stunning visuals.

The original Surrealist movement of the 1920s was an artistic response to the horrors of WWI, which employed unnerving and illogical imagery – both literary and visual – to satirise and oppose the conventional ideas that had led to the war. It was a radical, even revolutionary movement, leaning toward anarchism and communism, which depicted the darkest aspects of human nature using an equally dark humour. A century later that dark humour thoroughly permeates our current popular culture, entertainment industry and even advertising. Generative AI could make such post-traumatic nihilism available as a visual weapon for everybody who desperately wishes to defy reality, which means quite a lot of people since reality is looking increasingly grim.

Am I suggesting AI imaging be restricted, licensed, even banned? Not at all, it can’t be done. Just as with handguns in the USA, this genie is well out of its bottle. (Memo to self: “genie with jewelled turban on copper lamp, octane render, photorealistic, style of brueghel”) 

[Sample Dick Pountain’s creepy concoctions at http://www.dickpountain.co.uk/home/pictures ]  

A FEELING FOR TRIANGLES

Dick Pountain /Idealog 342/ 05 Jan 2023 10:55

My column last month was a semi-temperate rant, triggered by the sensationalist reporting of advances in cosmology and particle physics by the mainstream press (the ‘wormhole in a quantum computer’ effect). I attributed this to the coming of age of successive generations reared on Star Wars and the Marvel Universe, which induces deep longings to flout the laws of physics.

Readers of a philosophical bent (assuming I have any) might have concluded from this column that I’m a red-faced, harrumphing old British Empiricist who lumbers around the world kicking things and shouting “I refute it thus!” but nothing could be further from the truth. By ‘further’ I mean that my philosophical views are 180° opposed to such an empiricism, if you believe that truth is organised as a 2-dimensional graph, which I don’t. Instead I believe that all sentient living creatures, human beings included, are ruled by emotions and live in a world that’s constructed almost entirely by imagination.

There is of course a catch, and that is that those of us who think this way define the words ‘emotion’ and ‘imagination’ in a way rather different from, and more rigorous than, their use in everyday speech. I’ve suggested in this column several times before that emotions properly understood are evolutionarily ancient neural subsystems that alert us to dangers, attract us to food and to potential mates, persuade us to play or to freak out. They operate below consciousness but the chemical changes they produce in our bodies get detected by our senses as what we should more properly call ‘feelings’. 

And speaking of our senses, another confusion arises there. All living creatures must separate what is themselves from what is the outside world by a barrier, in our case the skin. Of our five sensory subsystems, only taste and smell permit molecules from the outside world to cross this barrier: touch measures pressure and temperature on the skin while sight and hearing detect waves of visible radiation and air pressure. Having our skin penetrated by anything more solid usually constitutes an undesirable emergency, so our experience of the outside world is mostly via second-hand, internal, signals.

Our brain processes such signals to detect patterns in them, analysing those into smaller sub-patterns and storing them for future reference: whenever a new stimulus arrives the brain tries to recognise it by rebuilding an image from such stored components. We maintain an internal mental map of the outside world which is neither complete nor entirely accurate, constantly updated via Bayesian composition of new inputs, and which most importantly is coloured by emotional tags: we like or dislike the things and places it contains. The world we actually live in is an imaginary one, precarious and error-prone but which evolution has honed to be good enough to keep us alive. People born with a desire to pet Bengal tigers tended to have fewer offspring than those who preferred to run away (or invent the rifle).

Part of this process should remind you a little bit of the multi-layer ‘neural’ nets used in the most successful AI deep-learning systems, namely the part about analysing and storing sub-patterns, but that’s as far as the resemblance goes. Computers don’t have bodies, nor emotions to protect those bodies, and most (not all) AI researchers remain staunch Cartesian rationalists who believe “I think, therefore I am”, when “I am, therefore I think occasionally” is closer to how humans really are. 

All those stored sub-patterns occupy a universe of the imaginary, not material things but possible ways material things could be arranged, unchangeable and infinite in number. A triangle is a triangle is a triangle and you can imagine or discover any shape, size and number of them. Seeing a real brown cow uses most of the same patterns as imagining a blue cow. We think by manipulating and recombining patterns, we speak and write by recognising and producing them, mathematicians study the rules they obey. Claude Shannon’s Information Theory, the foundation of our industry, is about transmitting them from one place to another. 

The point is such patterns have no power of their own to affect material things in the real world: that they can only do via our body and its muscles, by making them want to do something. Ever since Homo sapiens developed language it’s been inevitable that we would start confusing such patterns with things and wishing that they could do stuff directly. We developed science in order to understand why that doesn’t work, but not everyone wants to be so disabused, and some people make a living by exploiting and furthering such confusion. I’m perfectly happy to imagine blue cows, perhaps to write stories or make paintings of them, but I won’t try milking them…

[Dick Pountain likes Dr Johnson really]

 

SOCIAL UNEASE

Dick Pountain /Idealog 350/ 07 Sep 2023 10:58

Ten years ago this column might have listed a handful of online apps that assist my everyday...