
What Is ChatGPT Doing ... And Why Does It Work?
In the past there were plenty of tasks—including writing essays—that we’d assumed were somehow “fundamentally too hard” for computers. And now that we see them done by the likes of ChatGPT we tend to suddenly think that computers must have become vastly more powerful—in particular surpassing things they were already basically able to do (like progressively computing the behavior of computational systems like cellular automata).
For a task that’s computationally irreducible, the kind of learning-from-examples that a neural net can readily do—say from many examples of distances between cities—won’t be enough; there’s an actual computational algorithm that’s needed.
Or put another way, there’s an ultimate tradeoff between capability and trainability: the more you want a system to make “true use” of its computational capabilities, the more it’s going to show computational irreducibility, and the less it’s going to be trainable. And the more it’s fundamentally trainable, the less it’s going to be able to do sophisticated computation.
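To make concrete what “an actual computational algorithm” means here, consider computing shortest-path distances between cities: one doesn’t generalize from example distances; one runs an explicit step-by-step procedure. Here is a minimal Python sketch using Dijkstra’s algorithm (the city graph is made-up toy data, and this code is an illustration, not something from the article):

```python
import heapq

def shortest_distance(graph, start, goal):
    """Explicit step-by-step shortest-path computation (Dijkstra)."""
    dist = {start: 0}
    queue = [(0, start)]                      # (distance so far, city)
    while queue:
        d, city = heapq.heappop(queue)
        if city == goal:
            return d
        if d > dist.get(city, float("inf")):
            continue                          # stale queue entry
        for neighbor, road in graph.get(city, {}).items():
            nd = d + road
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return float("inf")

# Toy data: road lengths between three hypothetical cities
cities = {"A": {"B": 5, "C": 9}, "B": {"C": 2}, "C": {}}
print(shortest_distance(cities, "A", "C"))    # 7, going via B rather than directly
```

No amount of memorizing example distances gives this kind of guarantee; the algorithm has to actually run.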
It’s something that’s empirically been found to be true, at least in certain domains. But it’s a key reason why neural nets are useful: that they somehow capture a “human-like” way of doing things. Show yourself a picture of a cat, and ask “Why is that a cat?”. Maybe you’d start saying “Well, I see its pointy ears, etc.” But it’s not very easy to explain how you recognized the image as a cat.
So after going through all these attention blocks, what is the net effect of the transformer? Essentially it’s to transform the original collection of embeddings for the sequence of tokens to a final collection. And the particular way ChatGPT works is then to pick up the last embedding in this collection, and “decode” it to produce a list of probabilities for what token should come next.
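As a minimal sketch of that final “decode” step, assuming the last layer’s output is a NumPy array and using a hypothetical “unembedding” matrix (both names are illustrative, not from the article):

```python
import numpy as np

def decode_next_token_probs(final_embeddings, W_unembed):
    """Turn the last embedding into probabilities over the vocabulary."""
    # final_embeddings: (sequence_length, d_model) output of the last block
    # W_unembed: (d_model, vocab_size), mapping an embedding to one raw
    # score ("logit") per vocabulary token; the name is an assumption
    last = final_embeddings[-1]        # pick up the last embedding
    logits = last @ W_unembed          # one score per possible next token
    logits = logits - logits.max()     # shift for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()         # softmax: scores -> probabilities
```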
It operates in three basic stages. First, it takes the sequence of tokens that corresponds to the text so far, and finds an embedding (i.e. an array of numbers) that represents these. Then it operates on this embedding—in a “standard neural net way”, with values “rippling through” successive layers in a network—to produce a new embedding (i.e. a new array of numbers). And finally it takes the last part of this array and generates from it an array of about 50,000 values that turn into probabilities for different possible next tokens.
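As a toy end-to-end sketch of those three stages in Python (with made-up sizes vastly smaller than ChatGPT’s, and plain matrix layers standing in for the real transformer blocks), ending with the same decode step sketched above:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, n_layers = 1000, 64, 3    # toy sizes, not ChatGPT's real ones

E = rng.normal(size=(vocab_size, d_model)) * 0.1          # embedding table
layers = [rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
          for _ in range(n_layers)]                       # successive layers
W_unembed = rng.normal(size=(d_model, vocab_size)) * 0.1  # decode matrix

tokens = [17, 42, 256]          # made-up token IDs for "the text so far"

x = E[tokens]                   # stage 1: token sequence -> embeddings
for W in layers:                # stage 2: values "ripple through" the layers
    x = np.tanh(x @ W)
logits = x[-1] @ W_unembed      # stage 3: last embedding -> scores ...
probs = np.exp(logits - logits.max())
probs /= probs.sum()            # ... -> probabilities for the next token
print(probs.shape)              # (1000,): one probability per vocabulary token
```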
OK, so what do the attention heads do? Basically they’re a way of “looking back” in the sequence of tokens (i.e. in the text produced so far), and “packaging up the past” in a form that’s useful for finding the next token.
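As a rough illustration of that “looking back”, here is a single attention head in NumPy, with a causal mask so each position can attend only to the tokens before it. The weight-matrix names are illustrative, and a real transformer head involves further pieces (multiple heads, learned biases, an output projection) not shown here:

```python
import numpy as np

def attention_head(X, Wq, Wk, Wv):
    """One attention head: package up the past for each position."""
    # X: (seq_len, d_model) embeddings of the tokens so far
    # Wq, Wk, Wv: (d_model, d_head) learned projections (hypothetical names)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # how strongly i attends to j
    # Causal mask: a position may look back, never forward
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -np.inf
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the past
    return weights @ V      # each row: a weighted "package" of past values
```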
So at any given point, it’s got a certain amount of text—and its goal is to come up with an appropriate choice for the next token to add.
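And once it has that list of probabilities, actually choosing the token can be as simple as sampling from them. A small sketch, using the “temperature” parameter the essay discusses elsewhere (a value around 0.8 is said to work well for essay-like text):

```python
import numpy as np

def sample_next_token(probs, temperature=0.8, rng=None):
    """Sample an index from the model's next-token probabilities."""
    # Lower temperature concentrates the choice on the top-ranked tokens;
    # higher temperature flattens the distribution toward random choice.
    rng = rng or np.random.default_rng()
    logits = np.log(np.asarray(probs)) / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return int(rng.choice(len(p), p=p))
```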