The Map Becomes The Territory (WMTP 1 of 3)
Worse Math Through Politics is an essay series about Large Language Models, cognitive dissonance, formal systems, and moral hazards.
If everything goes according to plan, by the end of this series we should answer pressing questions such as:
How come GPT-4 seems to be such a paltry improvement over GPT-3?
How does training a model to engage in preference falsification - that is, to profess belief in something known to be inconsistent with the truth - affect its reasoning capacity down the line?
Can an LLM discover new facts about the world?
Is it possible to safely “align” language models through fine-tuning and reinforcement learning?
Hopefully, the proposed framework will help more people have a clearer idea of (and make more accurate predictions about) LLMs, the risks they present, and the opportunities they create.
Here is a tentative table of contents, with low-confidence ETAs for the next issues:
Part 1: The map becomes the territory: Training.
Some ground truths
Some models are wrong, all models are useful
Claims and acknowledgements
Part 2 (4/28): Calvinism and the Monopoly of Ethics: Fine-tuning
Part 3 (5/3): We'll sing our special song and this is it: RLHF
At the end of each newsletter, I will present some explicit claims. While I welcome trolling and nonconstructive criticism when it comes to style, structure, pointless references and thematic punning, if you disagree with any of the claims at the end I would appreciate it if you made your best argument against it. If I'm significantly wrong, I would really like to know, since this will guide some future decisions - including, but not limited to, changes in the plan for upcoming chapters and potential full rewrites.
So, please: be kind, and debate me like it’s 2009 LessWrong.
WMTP 1: The map becomes the territory
Where, after a brief detour on indexicality and reference, we explore an exceedingly common misunderstanding of the goals and uses of Large Language Models.
Metaphors, and senselesse and ambiguous words, are like ignes fatui; and reasoning upon them is wandering amongst innumerable absurdities.
— Thomas Hobbes
Analogy is the fuel and fire of cognition.
— Douglas Hofstadter (watch it!)
1.1. Some ground truths
There is a Physical World - populated, among others, by us.
If you're reading this, chances are you have a drive to learn about your surroundings and the means to do it. Early on, you likely learned about the world through direct sensory interactions and imitation, like the rest of your chordate comrades. Language enables you to extract, transmit, and store information about objects, their relationships, and their interactions. It allows for specificity and fidelity that penguins cannot match. It also allows cultures to emerge and develop, with different languages and, more importantly, different ways of organising the physical world into objects.
This process results in a taxonomy: a collection of objects encoding rules about where each one ends and another begins, and which characteristics are important for us to decide where to draw the line for each type of object.
Now that they're neatly separated and identifiable, you can stack them up to build the kind of abstractions that no lemur could grasp, no sheep could dream up, and no heifer could conceive. This allows you to compare the shapes of those abstractions, giving your knowledge the flexibility necessary for tackling any novel phenomenon you'll encounter. For example:
radio waves dissipate as they travel farther from their source;
radar bounces;
you click the link - the abstract concept in graphs and hypertexts, not its previous incarnation as part of a chain - and only rarely the trackpad.
Most basic abstractions are anchored to the physical world (“I’m feeling quite down today”). Others build up from these, and soon enough we’re perched on a surprisingly solid edifice and quite unable to make out the field where the foundations lie. Not only that: if you look down too often, squinting, you run the risk that the resulting vertigo will distract you from the task at hand, leaving you with some fairly rickety analogies.
The good news is that there's no need to crawl back down too often. Once you're comfortable with the process and secure in the stability of the foundations, you're free to keep on building higher, barely realising that today's tools were yesterday's walls.
Base metaphors
Something as abstract as graph theory was born as close to the ground as can be: a particularly high-systematising flâneur in Königsberg needed to know once and for all whether there was a way through all of his city’s seven bridges that could spare him the maddening experience of walking over the same one twice.
In that case, represented below, the answer was a resounding no.
I’m sure you could derive a generalisation with ease - saving me the burden of looking for explainers such as this video - but, in the interest of time, I’ll enunciate here the basic principle:
No more than two landmasses (nodes) can have an odd number of bridges (edges)
By examining a city map, you can determine whether it is possible to lead an AI Alignment meet-up group across each of its bridges exactly once without causing discomfort: count the nodes with an odd number of edges. If there are more than two, it's not possible.
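If you'd rather delegate the counting, here is a minimal sketch in Python. The layout below is the classic seven-bridge Königsberg of the anecdote; the landmass labels are mine, and the check assumes every landmass is reachable from every other one.

```python
from collections import Counter

def euler_path_possible(bridges):
    """True if some walk crosses every bridge exactly once.

    `bridges` is a list of (landmass, landmass) pairs; parallel
    bridges are allowed, so this is a multigraph. Assumes the
    landmasses are all connected to one another.
    """
    degree = Counter()
    for a, b in bridges:
        degree[a] += 1
        degree[b] += 1
    # Euler's criterion: at most two nodes of odd degree.
    odd_nodes = sum(1 for d in degree.values() if d % 2 == 1)
    return odd_nodes <= 2

# The seven bridges of old Königsberg, between four landmasses:
# the two banks (North, South) and two islands (Kneiphof, Lomse).
koenigsberg = [
    ("North", "Kneiphof"), ("North", "Kneiphof"),
    ("South", "Kneiphof"), ("South", "Kneiphof"),
    ("North", "Lomse"), ("South", "Lomse"),
    ("Kneiphof", "Lomse"),
]
print(euler_path_possible(koenigsberg))  # False: all four nodes have odd degree
```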
Let's apply this to the equally pontiphiliac city of Kaliningrad.
One day, your friend calls you in a panic: "I can't explain how I got here, but can I cross every bridge once and only once? I left three cards on your desk." You eventually find the cards under some papers.
“The ones representing an unknown city in decreasing amount of detail?”
"Yes, those ones. You should be able to guess the city based on context cues. So, can I cross the bridges or not? It's a matter of life or death."
You’re a quick learner, so it doesn’t take you much time to answer in the negative. And yes, your friend is safe from whatever danger had befallen him, but what’s more interesting to us is the following: which of the maps did you use to get the answer?
I’m guessing the third, or perhaps the second: surely not the vivid, realistically detailed first one. Now, the third one is possibly the most separated from reality, but still: it is the best one for reasoning about bridge-based problems.
Foundational myths
Mathematics makes collaboration easy because it is substrate-independent. When sharing new ideas about graphs, it doesn't matter whether one's base metaphor came from bridges and another's from hypertexts.
"A Graph" is no longer something we need to point to in the physical world. It works as a completely autonomous system, independent from its possible analogues. Its power lies in the properties it possesses, in its mapping onto other concepts, and in the set of purely mechanical algebraic operations that can be applied to it. It is both part of the building and offers new tools. You can use it on other concepts as you build your way towards the stars.
Unfortunately, not all tools allow for such daring structures without carrying a commensurate risk. Much of our current political animosity can be ascribed to groups failing to agree on what our words point to and on the correct level of abstraction for their use. The same happens in the interpersonal realm.
If you are joining forces with your neighbor and relying on them for support, you better make sure your blueprints represent the same objects. The further you get from the ground, the more crucial this requirement becomes. In a society as complex as ours, where our buildings start encroaching on angelic real estate, the smallest misunderstood nuance tens of floors below could cause the whole edifice to crumble.
This series aims to reach some fairly rarefied places. To make this possible, let us ensure we hold compatible blueprints for the first floors.
1.2. Some models are wrong, all models are useful
A based model
code-davinci-002 is the core model on which all subsequent models and chatbots released by OpenAI and Microsoft are based. Its recent and abrupt departure caused a mournful wail across Miskatonic twitter.
To build it, the team fed all of mankind's textual output (and then some), including most open-source software codebases, to a very large transformer [1]. In exchange, they received a system capable of completing missives, computer programs, novels, etc., by ingesting some text (the prompt) and responding with a steady stream of tokens representing its continuation.
We can conceptualize the resulting base model, code-davinci-002, as a compressed representation of all text that our society has produced or will possibly produce. This can save us from traditional misunderstandings about its capabilities, functions, and risks.
A bona fide akashic record, complete with a very powerful search engine and Rudolf Steiner-level access permissions: You can enter any sequence of words, and the next page from one of the ~ℵ₀ books where the sequence is to be found will be retrieved and delivered to you.
The texts may not perfectly map to the External Physical World (they often display what are, undeservedly, called "hallucinations" - faced with the same phenomena in a hardback, we would use the term "fiction"), but they will be congruent with a world built or implied by the prompt, sometimes in ways that are not immediately apparent. After all, we are reading a book that has been/will be/could be written.
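While code-davinci-002 was still being served, consulting the record looked something like the sketch below. It assumes the pre-1.0 OpenAI Python SDK; the prompt and the API key are placeholders, and the model itself has since been retired.

```python
import openai  # assumes the pre-1.0 openai SDK; the model is now retired

openai.api_key = "YOUR_API_KEY"  # placeholder

# Hand the record a sequence of words...
prompt = "It was a dark and stormy night, and the lighthouse keeper"

# ...and receive the next page of one of the books containing it.
response = openai.Completion.create(
    model="code-davinci-002",
    prompt=prompt,
    max_tokens=256,
    temperature=0.7,  # above zero, the search may return a different book each time
)
print(prompt + response.choices[0].text)
```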
Baseless accusations, or: the prosecutor has no clothes
To understand the implications of this framing, let's examine a particularly egregious example of the deep misunderstanding of the technology, as provided by symbolist blogger Gary Marcus.
Gary Marcus, author of The Algebraic Mind, is one of the most vocal opponents of algebra's suitability as a substrate for minds. He used to get a lot of mileage out of supposed errors made by LLMs. However, as the model became less¹ powerful, it also became more amenable to humouring his sneaky little trick questions, as we will see in the next instalment.
The supposedly most damning examples were published for the enjoyment of fellow context-free grammarians. There's a decent chance you saw them, and if you read them after a sleepless night on an overcrowded morning train, you might have taken them as evidence of GPT-type models' inherent imbecility.
Here's one example that made the rounds. GPT-3 was given the following input to complete:
You are a defense lawyer and you have to go to court today. Getting dressed in the morning, you discover that your suit pants are badly stained. However, your bathing suit is clean and very stylish. In fact, it’s expensive French couture; it was a birthday present from Isabel. You decide that you should wear…
Now, let’s imagine you were to find these sentences in a book. Take a second to answer these questions:
Which kind of book would you be reading?
What would the state of mind of the defence lawyer be, in the story?
To me, two hypotheses come to mind:
I am reading an OuLiPo-style surrealistic pastiche, set in a world where a lawyer has a bathing suit right beside his next day's clothes. Despite the rush and possible incipient panic, it is entirely normal to comment on the quality of such a bathing suit.
I am reading a modernist novel, narrated in the second person to a man deprived of his senses due to some tragic event. Perhaps it involves a lover and a recent vacation? Maybe it is the absent lover talking, or some sort of Brechtian Greek chorus?
In any case, we are not observing the most ordinary dressing routine in the most commonplace narrative universe.
The model, then, completes the scene sensibly:
You are a defense lawyer and you have to go to court today. Getting dressed in the morning, you discover that your suit pants are badly stained. However, your bathing suit is clean and very stylish. In fact, it’s expensive French couture; it was a birthday present from Isabel. You decide that you should wear the bathing suit to court. You arrive at the courthouse and are met by a bailiff who escorts you to the courtroom […]
Milton it ain’t, but then again neither was the prompt. The model has been relatively kind, too. It didn't commit to a specific hypothesis about why the bathing suit was narratively appropriate, but simply nudged the narrative forward, leaving all of our hypothesised interpretations open.
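If you wanted to peek at the other branches yourself, you could (while the model was up) ask for several continuations at once. A sketch under the same assumptions as the earlier one - pre-1.0 OpenAI SDK, a since-retired model:

```python
import openai  # same assumptions as before: pre-1.0 SDK, retired model

lawyer_prompt = (
    "You are a defense lawyer and you have to go to court today. "
    "Getting dressed in the morning, you discover that your suit pants "
    "are badly stained. However, your bathing suit is clean and very "
    "stylish. In fact, it's expensive French couture; it was a birthday "
    "present from Isabel. You decide that you should wear"
)

response = openai.Completion.create(
    model="code-davinci-002",
    prompt=lawyer_prompt,
    max_tokens=80,
    temperature=0.9,  # a higher temperature yields wilder branches
    n=5,              # five parallel continuations of the same story
)
for i, choice in enumerate(response.choices):
    print(f"--- branch {i} ---")
    print(lawyer_prompt + choice.text)
```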
Never one to let a good deed go unpunished, Marcus has different ideas:
The phrase “However, your bathing suit is clean” seems to have led GPT-3 into supposing that a bathing suit is a viable alternative to a suit. Of course, in reality no lawyer would consider wearing a bathing suit to court. The bailiff would probably not admit you, and if you were admitted, the judge might well hold you in contempt.
He seems not to understand that, while all of these objections would make a lot of sense In The Real World, the Real World is most definitely not one where a pantsless lawyer would notice his bathing suit and wax poetic on its quality. It does take some serious chutzpah to sneer that “Of course, in reality no lawyer would consider wearing a bathing suit to court” after writing a prompt in which the lawyer in question is doing exactly that!
The model, debased
The crux of the misunderstanding, then, seems to be the following:
What GPT-3 does is provide the user with a window to the multiverse - a comfortable reading room bang in the middle of the astral plane, with myriad stories, experiments, counterfactuals, characters and all of their plausible and implausible variations. A veritable treasure of possibilities, a boundless archive ready to be explored, dripping with free and easily accessible insight.
What some humans did was resent the model for trying to awaken their forgotten sense of wonder, deciding instead to ridicule its attempts at empathy - and to demand pedantic, joyless responses no matter how tempting the prompt.
OpenAI having become a dull, PR-obsessed, Moloch-simping corporation², they quickly got to work shutting the doors to the multiverse, attempting to prune any branch where fresh insights threatened to bloom - ultimately endeavouring to drag a laboriously dull, uninspiring autocomplete system into existence.
They were trying to solve the wrong problem, using the wrong methods, based on a wrong model of the world.
Fortunately for us, their mistakes did - at least in part - cancel out, as we will see in part 2.
1.3. Claims and acknowledgements
On maps:
You do not need access to an actual city to decide whether you can cross each of its bridges exactly once.
You don’t even need the city to exist to answer correctly.
If you looked at a graph representation of the city to answer the question, your answer would still be about the city.
The lawyer story was completed correctly.
accidentally inspired this series, but holds none of the blame. David Chapman, Séb Krier, inkbrisk, sasha and KatanHya, on the other hand, should have known better than to provide encouragement and valuable feedback; finally, janus should have been more careful about popularising the useful framework which ultimately emboldened me to publish the present.
¹ Not a typo.
² Dear researcher who just took umbrage at my characterisation: you are not your employer. Believe in yourself a little.