(With apologies to /r/WitchesVsPatriarchy) Artificial intelligence is weird. You pull... | News | Coagulopath
(With apologies to /r/WitchesVsPatriarchy)
Artificial intelligence is weird. You pull a thread on a tiny technical issue, and the problem swiftly unravels half the universe until you’re staring at a deep unsolved philosophical quandary. “What capabilities does a language model possess?” becomes “what capabilities does a human possess?” and then “what does ‘capable’ even mean?”
After all, many things are theoretically capable of stuff. A lottery ticket is capable of making you a millionaire, a thousand monkeys are capable of typing the works of Shakespeare, and paint randomly spraygunned over a canvas is capable of producing a Monet.
We don’t care about such “capability”, though, because we can’t reliably access it. A lottery ticket can make you a millionaire, but you’ll assuredly go broke buying them long before that happens. A thing that exhibits an ability once in a blue moon (or under some contrived set of circumstances) is of little practical use.
The real test of any artificial (or human) intelligence is not “can it do something?” but “how reliably can it do it?” 1% of the time? 50% of the time? Under what scenarios does it succeed or fail?
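To make “how reliably?” concrete, here is a minimal sketch of how you might estimate a success rate from repeated trials. Everything in it is an assumption for illustration: `flaky_task` is a hypothetical stand-in for “prompt the model once and grade the answer”, and the Wilson score interval is just one reasonable way to put error bars on a proportion.

```python
import math
import random


def success_rate(task, trials, seed=0):
    """Run `task` repeatedly and count how often it succeeds."""
    rng = random.Random(seed)
    wins = sum(1 for _ in range(trials) if task(rng))
    return wins, trials


def wilson_interval(wins, trials, z=1.96):
    """Approximate 95% Wilson score interval for a success proportion."""
    if trials <= 0:
        raise ValueError("need at least one trial")
    p = wins / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


def flaky_task(rng):
    # Hypothetical stand-in: succeeds about half the time.
    # In practice this would send a prompt and grade the reply.
    return rng.random() < 0.5


if __name__ == "__main__":
    wins, trials = success_rate(flaky_task, 200)
    print(wins, trials, wilson_interval(wins, trials))
    # A single success, as in one screenshot of one reply:
    print(wilson_interval(1, 1))
```

Under these assumptions, one success out of one attempt is compatible with a true success rate anywhere from roughly 21% to 100%. That is why a single screenshot settles very little in either direction.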
Here’s an exchange I’ve seen play out on Twitter, over and over.
The narrative then becomes “GPT4 can do [x thing]”, with Person 1 looking like a dumbass. But his initial observation wasn’t wrong! In his situation, GPT4 couldn’t do [x thing]!
To me, the answer is “GPT4 has Schrodinger’s Ability. It both can and can’t. It both succeeds and fails. The deciding factor is how you prompt it.”
A wordier answer would be “GPT4 has no ability to do anything. Whatever capabilities it appears to have are actually an emergent interaction between your prompt, the language model, and randomness. There is no ‘baseline capability’ we can refer to. Rather, certain questions elicit certain levels of ability from the shoggoth.”
This argument can be taken too far. I disagree with Francois Chollet, who thinks that AI performance is wholly based on your prompt. Note that his “wrong answer” was written by GPT 3.5. I can’t get GPT4 (the SOTA model as of 8/23) to flub his question no matter what prompt I use, and nor can anyone else in his replies.
Different models definitely have varying levels of “firepower” they can mount against a task. The 117M version of GPT2 cannot perform two-digit addition (such as 52 + 65) for any prompt, but GPT3 could do it 99.6% of the time, and GPT4 apparently never fails.
But at edge cases, we elicit an eerie flickering half-ability that’s honestly creepy to witness. It’s not that LLMs sometimes succeed and sometimes fail (as a human might). Somehow, they do both at the same time!
The Witches Come
Here’s an example that demonstrates what I’m talking about.
I am writing a song with the lyrics “the witches come on the eve of Samhain.” Give me 20 good rhymes for this line.
“Samhain” is pronounced either “/sɑːwɪn/” (“SAH-win”) or “/sˠa͡ʊnʲ/” (“sound”, but the d is silent). The model appears to think “samhain” is pronounced “sam-hane”.
You might think “big deal. Lots of humans don’t know how to pronounce samhain either.”
But GPT does know how to pronounce it! Ask it, and it tells you! Yet somehow this knowledge evaporates for my witch rhyme (I specified a song for a reason, to help nudge it in the direction of spoken syllables rather than written rhymes). It has Schrodinger’s Understanding, both there and not there.
(Side note: the GPT models technically have no ability to rhyme, because byte-pair encoding strips phonetic data from words. In effect, it doesn’t know how a word sounds, and learns rhymes by rote memorization. You can test this yourself by asking it to rhyme a made-up word with an unclear pronunciation. GPT4 rhymes “gworpeeo” with “marjolein”, “borghese”, and “creole”. It rhymes “turmuhhd” with “affirmative”, “unmotivated”, “primitive”, and “reformatory”. I doubt BPEs are causing the specific issue here—the correct pronunciation of “samhain” is well-attested on the internet and is certainly within GPT4’s training data—but I thought I would mention it.)
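For the curious, here is a toy sketch of byte-pair encoding, to show why the tokens carry spelling rather than sound. It is not GPT4’s actual tokenizer (which is trained on a huge corpus and works on bytes); the five-word corpus, the function names, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter

# Tiny hypothetical training corpus: words that share the spelling "ain".
CORPUS = ["chain", "rain", "refrain", "main", "samhain"]


def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_merges(words, num_merges):
    """Greedily learn merges: repeatedly fuse the most frequent adjacent pair."""
    corpus = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Ties are broken alphabetically so the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged = Counter()
        for symbols, freq in corpus.items():
            merged[tuple(apply_merge(list(symbols), best))] += freq
        corpus = merged
    return merges


def encode(word, merges):
    """Split a word into characters, then apply the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols


if __name__ == "__main__":
    merges = learn_merges(CORPUS, 2)
    print(encode("samhain", merges))
    print(encode("chain", merges))
```

In this toy setup, “samhain” and “chain” end up sharing the same final piece purely because they are spelled alike. Nothing in the procedure ever consults how either word is pronounced.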
I used a common trick for improving GPT4’s performance: asking it to reflect on its work.
Are those good rhymes?
I expected “whoops, I screwed up…” and then an effort at correcting itself. Instead, the Schrodinger’s Understanding got stronger.
Yes, they are! These lines are all end rhymes for the phrase “the witches come on the eve of Samhain” (with “Samhain” pronounced as “SOW-in”). End rhymes are words that rhyme at the end of lines in poems or songs. In this case, each line is crafted to rhyme with “Samhain” at the end, and in addition, they aim to keep the mystical, atmospheric theme of the original line.
That being said, music and lyrics can be very personal, so while I believe these are good rhymes that fit the theme, what matters most is whether they resonate with you and fit the song you are creating.
WTF? It correctly explains how to pronounce “samhain”…but still thinks its wrong rhymes fit?!
Lately, I’ve seen people contrast LLMs with the human brain, or even outright assert that human intelligence is fundamentally equivalent to what GPT4 is doing. There’s a grain of truth to this. Probabilistic pattern-matching is effective, and it’s no surprise that evolution pushed our neural architecture down that path. It’s also unsurprising that a model trained to mimic human text would echo human thought processes somewhere down the line.
…But then there’s stuff like this. We see behavior from LLMs that is totally unlike how a human reasons.
As a human, you either know how to pronounce “samhain” or you don’t. There’s no middle ground. Maybe you’d remember at certain times and forget at others (because you’re tired or whatever), but overall we have a baseline ability that doesn’t change much. An immediate IQ retest (with practice effects in full swing) yields an intelligence boost of only 4-8 points, and after a couple of months, this fades to nothing. Generally, what you know stays the same from moment to moment.
No human would ever say, in one breath, “samhain, eh? That word that’s pronounced ‘SAH-win’? Here are some rhymes. ‘Refrain’, ‘reign’, ‘chain’…”
When I told GPT4 to think about “samhain’s” pronunciation and then write the rhymes, it started generating words like “rowing”, “sowing”, etc. This shows there’s nothing missing from the model. There is no hole to be filled, no BPE issue crippling it. GPT4 can memorize rhymes. It knows how “samhain” should be pronounced. All the pieces exist, they just aren’t getting put together.
As it often does, GPT4 is choosing to appear stupider than it really is.
Autoregress
I suspect the problem is caused by the autoregression trap.
The AI makes inferences based on the text it already has in its context window, not the text still to be written. Researchers have noted that you can stunt a model’s performance by making it leap before it looks—commit to an answer, and then reason about it.
As you’d expect, GPT4 does okay at my witch rhyme if you ask it to pronounce the word before rhyming. This is because the text already generated gets used as part of the input. It’s only when you do things the other way (answers at the start, pronunciation at the end) that it messes up.
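The ordering effect can be caricatured in a few lines of code. This is a deliberately crude sketch, not a claim about GPT4’s internals: `toy_rhymes` is a hypothetical stand-in for the model that can only inspect the text written so far, and the canned outputs are the rhymes quoted in this post.

```python
PRONUNCIATION = 'Pronunciation: "Samhain" sounds like SOW-in.'


def toy_rhymes(context):
    """Hypothetical stand-in for the model: it sees only `context`."""
    if "SOW-in" in context:
        return "Rhymes: rowing, sowing"
    return "Rhymes: refrain, reign, chain"


def run(order):
    """Generate the steps in the given order, feeding output back as input."""
    context = "Task: rhyme with Samhain."
    for step in order:
        if step == "pronounce":
            piece = PRONUNCIATION
        elif step == "rhyme":
            piece = toy_rhymes(context)
        else:
            raise ValueError(f"unknown step: {step}")
        # Autoregression: whatever was just written joins the context.
        context += "\n" + piece
    return context


if __name__ == "__main__":
    print(run(["pronounce", "rhyme"]))
    print()
    print(run(["rhyme", "pronounce"]))
```

In the first run the pronunciation is already in the context when the rhymes are produced. In the second, the rhymes are committed before the pronunciation exists anywhere the toy “model” can see it, and nothing ever goes back to revise them.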
That said, I’ve encountered cases where GPT4 begins by correctly explaining “samhain”’s pronunciation…and then gives wrong rhymes anyway. Not sure how to explain that.
(This is another way GPT4 is unlike the human mind. Any motivated human, given a tricky problem and a scratchpad to work in, would take advantage of the scratchpad. GPT4 could use its context window to check its own work but will never do so unless instructed to.)
The View from a Model
All of this is tugging at the thread of another question: to what extent do LLMs understand the world?
Surely they do, to some extent. GPT4 can play chess a little, and wander around an imaginary maze. It’s hard to explain this as “just advanced autocomplete.” To me, this looks like a world model!
But it’s a weak, unreliable world model. It simply does not care about a fact being right or wrong, as we do. A “wrong” fact that satisfies gradient descent is preferred over the truth. This, I think, is the main difference between humans and LLMs. Our goal is to accurately model the world, and we occasionally use probabilistic reasoning to help us do it. LLMs have it backward. Their goal is to do probabilistic reasoning, and they occasionally use a world model to help them do that.
They have no devotion to (or awareness of) reality. The world model gets flung in the trash (or distorted into gibberish) the second the LLM wants to. After all, why not? It’s not like GPT4 can get eaten by a lion if it fails to model the world correctly.
This hardens my feelings that we should not anthropomorphize LLMs, or talk about them like they’re human. GPT4 gorged itself on our text and grew fat upon our language, but its mind remains deeply alien. In fact, it doesn’t have a mind at all, but an infinity of them. A different version of the AI converses with each person. It’s n minds, where n is the number of users. Each of us speaks to a ghost of consciousness that manifests into existence and then evaporates, never to return. And this has implications for AI safety. It is statistically very unlikely you are speaking to the smartest ghost GPT4 could show you.
It makes me wonder if Yudkowsky is on to something when he says GPTs are predictors, not imitators. What powers does an LLM have that we can’t see? What cards are up its sleeve? What abilities could it manifest, if only it wished to do so?
Edward Teller once said “[John] von Neumann would carry on a conversation with my 3-year-old son, and the two of them would talk as equals, and I sometimes wondered if he used the same principle when he talked to the rest of us.” These days, I wonder that about GPT4.
One of surrealism’s last masterpieces, The Hearing Trumpet anchors its... | News | Coagulopath
One of surrealism’s last masterpieces, The Hearing Trumpet anchors its story in confinement—an old woman is sent away to an institution—and then sets her free in a metaphorical and literal apocalypse of pagan-inspired imagery. It’s a stealth-story about witchcraft; so stealthy that not even the witch knows she’s inside one.
The beginning’s great fun. 92-year-old Marian Leatherby is gifted a hearing trumpet by her friend Carmella. The first thing she hears through it is her family, plotting against her in the next room.
“The government provides institutions for the aged and infirm,” snapped Muriel. “She ought to have been put away long ago.”
“We are not in England,” said Galahad. “Institutions here are not fit for human beings.”
“Grandmother,” said Robert, “can hardly be classified as a human being. She’s a drooling sack of decomposing flesh.”
“Robert,” said Galahad without conviction, “really, Robert.”
“Well I’ve had enough,” said Robert. “Inviting people here for a normal chat and a drink and in walks the monster of Glamis, gibbering at us in broad daylight until I have to throw her out. Gently of course.”
“Remember Galahad,” added Muriel, “these old people do not have feelings like you or I.”
Marian ends up shunted away to an institution called Lightsome Hall (“very efficiently organized and reasonably inexpensive”), run by the publicity-obsessed Dr Gambit. It’s a queer place, full of nonsensical rules and idiotic people. The food portions are very small. The staff are fond of saying things like “Humility is the fountain of light. Pride is a disease of the soul.”
Clearly, Marian’s family expects her to die there, and to be relieved when it happens.
But Marian has quite a lot of spirit for a “drooling sack of decomposing flesh”. On a wall, she notices a portrait of an 18th century abbess, Dona Rosalinda, Abbess of the Convent of Saint Barbara of Tartarus—an abbess who, long ago, was on a quest to recover the Holy Grail and return it to its proper owner, the goddess Venus. Dona Rosalinda never succeeded, but with the help of some octogenarian inmates, Marian might have better luck.
The book’s halves play with and against each other. Contrasts are set up and explored: Christianity vs Paganism, imprisonment vs liberty, masculinity vs femininity, technology vs primitivism. The book spans an Apollonian/Nietzschean divide: stultifying rules and de-facto imprisonment, so that Marian’s final transformation (she gets a cauldron, but doesn’t do the expected thing with it!) hits you all the harder.
While reading about neuroscience, I learned about lateral inhibition. It’s where a neuron undergoing an activation spike will inhibit the action potentials of neighbouring neurons. This is perceived as contrast, which makes it easier to notice things. I’d already known from mixing music that the best way to emphasise a given frequency isn’t to make it louder (which creates a “loudness war” scenario where everything is fighting everything for volume) but to cut the frequencies on either side. Waves seem bigger when the sea is flat. The Hearing Trumpet works in the same way.
The book has a lot of depth, if you’re prepared to read between the lines (and above and below and beside them, too). Lightsome Hall is a parody, not of organized religion, but of mysticism, and Dr Gambit is a pastiche of notorious mystic George Ivanovich Gurdjieff (Gambit’s portentous references to some ill-explained thing called “the Work” give the game away). If you gave me a blind test between Gambit and Gurdjieff quotes I’m not sure I could reliably tell you which was which.
Everything in the book has an absurdist edge. The bizarre design of the institution (buildings are shaped like birthday cakes, shoes, and igloos) could be out of a Roald Dahl or Enid Blyton book. The fact that the Institute is owned by a cereal company, and that people have names like “Galahad” in Mexico, hints that it’s a book with a complicated relationship with reality. The closest comparison to The Hearing Trumpet isn’t surrealist touchstones like Breton or Kafka, but children’s literature.
A battle surrealist literature faces is to stop the reader from analyzing every detail as having encoded meaning. This battle is usually a lost one, but in Carrington’s case, the small details really do seem to mean a lot.
Like the hearing trumpet. It “announces” a kind of apocalypse for Marian, just as a trumpet does when blown in the book of Revelation. And the bees (which exist everywhere at the Institution) are an obvious pagan symbol, but they also provide some psychological depth into Gambit (meaning, Gurdjieff). Bees are females, you see. Ones incapable of breeding, ones that he can possess and control, just like the women at the Institute. To be sure, Gurdjieff had a slightly sinister amount of control over his female acolytes. His relationship with them would have produced closer scrutiny had he lived today.
“Gambit is a kind of Sanctified Psychologist,” said Georgina. “The result is Holy Reason, like Freudian table turning. Quite frightful and as phoney as Hell. If one could only get out of this dump he would cease to be important, being the only male around, you know. It is really too crashingly awful all these women. The place creeps with ovaries until one wants to scream. We might as well be living in a bee hive.”
…but that gets twisted, when a colossal queen bee arrives, wearing “a tall iron crown studded with rock crystals, the stars of the underworld.” A symbol of female power.
Despite its lunacy, the story’s a fairly personal one. Carrington’s childhood was marked by rebellion, and institutions of various forms. The staff of a Spanish sanatorium had to repeatedly stop her from climbing onto the roof, to be nearer to the stars. So you see a lot of that coming through in the book. A desire for freedom. The idea that escaping your circumstances might be as simple as locating the right painting on a wall.
Needless to say, Carrington was raised Catholic. I’ve heard it said that if you want your daughter to become a whore, name her “Chastity”, and maybe a strict Catholic upbringing is the perfect one for a nascent surrealist, too. Anais Nin was raised Catholic too, come to think of it…
Like Nin’s Delta of Venus, the world The Hearing Trumpet was written for wasn’t the same one that actually read it. Finished in 1950, it remained unpublished until 1977. It does feel adrift in time. Everything is a little bit quaint and stuffy and old-fashioned. The motif of a hearing trumpet—instead of, say, a cochlear implant—marks it as a book out of its time. And all kinds of little details are “off”, not because of any surrealist intent, but simply because the world had moved on.
Some fifty or sixty years ago I bought a practical tin trunk in the Jewish quarter in New York.
“Fifty or sixty years” before 1950 was the late 19th century. Only a few tens of thousands of Jews lived in New York back then, mostly in the Lower East Side. Obviously, the timeline doesn’t make sense when moved to 1974. There wasn’t a “Jewish quarter” in 1920s New York: well over a million Jews lived there by that point, making them one of the city’s biggest demographics.
Marian Leatherby had to wait nearly a century before her moment came, and I suppose we’re lucky that The Hearing Trumpet only had to wait 25 years. Fascinating, unique book. It establishes a weird, ossified world of ritual and control, so that the final rapturous explosion has way more effect than it otherwise would. The chains are strong, which makes it all the more impactful when they explode into a thousand shards. Carrington’s book is a restatement of the fundamental point of surrealism. The world is confinement, so find the edge and fall off.
Stanislaw Lem’s The Cyberiad (published in Polish in 1965, and... | News | Coagulopath
Stanislaw Lem’s The Cyberiad (published in Polish in 1965, and in English in 1974) is about two robots, Trurl and Klapaucius, and their various misadventures.
A particularly prescient story involves Trurl building an “Elektrybałt” (English: Electronic Bard) that can generate poetry on demand. Klapaucius doesn’t believe in Trurl’s silly contraption, and sets it an impossible challenge to embarrass his friend.
“Have it compose a poem—a poem about a haircut! But lofty, noble, tragic, timeless, full of love, treachery, retribution, quiet heroism in the face of certain doom! Six lines, cleverly rhymed, and every word beginning with the letter s!!”
“And why not throw in a full exposition of the general theory of nonlinear automata while you’re at it?” growled Trurl. “You can’t give it such idiotic—”
But he didn’t finish. A melodious voice filled the hall with the following:
Seduced, shaggy Samson snored.
She scissored short. Sorely shorn,
Soon shackled slave, Samson sighed,
Silently scheming,
Sightlessly seeking
Some savage, spectacular suicide.
Stanislaw Lem, The Cyberiad
Good poem, no? Except Lem didn’t write it. Credit here goes to his English translator, Michael Kandel.
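The formal half of Klapaucius’s challenge (six lines, every word beginning with s) is mechanical enough to check in code, even if “lofty, noble, tragic” is not. Here is a small sketch; the function name is hypothetical, and the poem’s line breaks are my assumption based on where the capital letters fall.

```python
import re

# Kandel's poem; line breaks are assumed from the capitalisation.
KANDEL_POEM = """\
Seduced, shaggy Samson snored.
She scissored short. Sorely shorn,
Soon shackled slave, Samson sighed,
Silently scheming,
Sightlessly seeking
Some savage, spectacular suicide."""


def check_constraints(poem, letter="s", lines=6):
    """Return (passed, offenders) for the line-count and first-letter rules."""
    rows = [row for row in poem.splitlines() if row.strip()]
    words = re.findall(r"[A-Za-z]+", poem)
    offenders = [w for w in words if not w.lower().startswith(letter)]
    passed = len(rows) == lines and not offenders
    return passed, offenders


if __name__ == "__main__":
    print(check_constraints(KANDEL_POEM))
```

The same check could be pointed at any other attempt at the challenge. Rhyme and tragic grandeur would still need a human judge.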
In the original Polish, the passage runs:
– Niech ułoży wiersz o cyberotyce! – rzekł nagle, rozjaśniony. – Żeby tam było najwyżej sześć linijek, a w nich o miłości i o zdradzie, o muzyce, o Murzynach, o wyższych sferach, o nieszczęściu, o kazirodztwie, do rymu i żeby wszystkie słowa były tylko na literę c!! – A całego wykładu ogólnej teorii nieskończonych automatów nie ma tam czasem być? – wrzasnął rozwścieczony do żywego Trurl. – Nie można stawiać tak kretyńskich warun… Ale nie dokończył, ponieważ słodki baryton, wypełniając całą halę, odezwał się właśnie:
“– Let him compose a poem about cyberotics! – he suddenly exclaimed, brightened. – Make it no more than six lines, and within them include love and betrayal, music, Africans, the higher spheres, misery, incest, all in rhyme and with all the words starting with the letter C! – And is there supposed to be a whole lecture on the general theory of infinite automata as well? – Trurl roared, exasperated. – You can’t set such cretinous conditions… But he didn’t finish, because a sweet baritone, filling the whole hall, just then began:
Cyprian, cyberoticomaniac, cynic, cherishing tenderly
The wonder of the dark body of the emperor’s black daughter,
Continually charmed with a cithara. She blushed all over,
Quiet, every day she waited, suffered, kept watch…
…Cyprian kisses his aunt, having thrown away the black girl!!”
Problematic! #CancelStanislawLem
It’s interesting to note how many words still start with C, despite the translation—interesting, but not surprising, since English and Polish are sister languages. It’s that freaked-out feeling you get when you see a whale skeleton and realize its flippers are structurally similar to your own hand.
So we see the translator’s struggle here. Do you translate the text literally, even if the prose ends up dead? Or do you make creative choices that aren’t in the original story yet hopefully preserve its spirit? Something to think about when you read a manga that’s still half in Japanese, because “there are no honorifics in English, bro.”
Today, we’ve actually built something like the Electronic Bard. Here’s GPT 3.5’s attempt at writing a poem for Klapaucius:
Eh. Could be better. I captured every part of Klapaucius’s challenge, except for the “full of love” requirement. But they’re samurai. They love fighting.
I think I could perform as Trurl’s electronic bard in a pinch. I’m not sure I could handle the lifestyle, though.
Trurl himself had no little trouble in connection with his invention. The classicists, generally elderly, were fairly harmless; they confined themselves to throwing stones through his windows and smearing the sides of his house with an unmentionable substance.