LLMs and the business of “Truth”

Theodorus claims that we are alike. But if we each had a lyre, and he said that they were similarly tuned, would we just take his word for it, or would we first see whether his statement was backed by musical knowledge? - Socrates in The Theaetetus (144d-e, Waterfield translation)

There’s been a lot written recently about the new LLM-based chatbots, ChatGPT and the like. In particular, about the fact that they produce plausible-looking output which, on inspection, is utterly baseless.

Simon Willison has a good post (as so often) with links, and a strident conclusion:

There’s a time for linguistics, and there’s a time for grabbing the general public by the shoulders and shouting “It lies! The computer lies to you! Don’t trust anything it says!”

I take Simon’s point here. The urgent thing is getting the message out, not worrying that the notion of lying is not, in some strict sense, correct.

Pause and think for a moment: our language is riddled with metaphors, anthropomorphic or not. This ordinarily doesn’t get in the way of communication—in fact, quite the opposite—nor, under any sensitive interpretation, does it inhibit true utterance. (When I throw an insult, I know perfectly well that my target is not literally a jackass. Nonetheless, I’m asserting it.)

If we banned such usage for its supposed lack of (capital-letter) Truth, we’d soon find we’d run out of linguistic resources to speak at all. That’s no way forward.

So, let’s go with it: they lie. No problem.

At the same time, though, other metaphors might be useful in other contexts.

In a separate conversation, I mused that LLMs aren’t even in the business of “Truth” at all:

@simon they’re entirely generative no? (They construct any answer as they go, is that right?) That doesn’t seem even lined up for truth, which on most accounts has required at least some element of “checking to see”. 🤔 — Toot

On first-pass accounts, truth requires some kind of correspondence (however exactly that’s understood) between an utterance and the world. It’s that that makes the utterance a representation – that it’s of something. It’s only concern for that relation that makes questions of truth or falsity meaningful. (And they’re not always meaningful! It takes a particular kind of Philistine to stand in St Ives and complain, “But what’s this Barbara Hepworth of?”)

But it’s precisely that concern — for the relation between the output and its domain — that is lacking with LLMs. It’s remarkable that they do so well (on statistics alone), but if you’re not even interested in truth — if you never stop to check — is it any wonder they can go so remarkably wrong? (“Wrong”, again, not really applying, since truth was never really at stake.)

A traditional criticism might have been that the goal was not truth but something else, persuasion perhaps. But that seems no better: ChatGPT isn’t trying to convince you of anything either. (This, I think, is close to the objection to the lying metaphor: it’s not trying to hoodwink you.) The models are just whirling away.

I think an activity it’s very much like is doodling — undirected drawing, without a representational goal. You’re doodling in the meeting, half out of boredom, half to keep yourself awake. Your colleague sees it at the end. It would be no valid criticism to say, “But it didn’t look like that”.

It may not be the headline message we need, but thinking of LLMs as doodling — doodling in the style of their training data — is perhaps a helpful analogy if we need one.

An AI doodling, in the style of Picasso (by Stable Diffusion).