What Is ChatGPT Doing ... And Why Does It Work?
But what was found is that—at least for “human-like tasks”—it’s usually better just to try to train the neural net on the “end-to-end problem”, letting it “discover” the necessary intermediate features, encodings, etc. for itself.
Stephen Wolfram • What Is ChatGPT Doing ... And Why Does It Work?
But if we want to work out things that are in the purview of mathematical or computational science the neural net isn’t going to be able to do it—unless it effectively “uses as a tool” an “ordinary” computational system.
Stephen Wolfram • What Is ChatGPT Doing ... And Why Does It Work?
What about ChatGPT directly learning Wolfram Language? Well, yes, it could do that, and in fact it’s already started. And in the end I fully expect that something like ChatGPT will be able to operate directly in Wolfram Language, and be very powerful in doing so. It’s an
Stephen Wolfram • What Is ChatGPT Doing ... And Why Does It Work?
Because in the end what we’re dealing with is just a neural net made of “artificial neurons”, each doing the simple operation of taking a collection of numerical inputs, and then combining them with certain weights.
Stephen Wolfram • What Is ChatGPT Doing ... And Why Does It Work?
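As a rough illustration of the "artificial neuron" in the excerpt above, here is a minimal sketch of a single neuron that combines numerical inputs with weights. The function name `neuron`, the bias term, and the ReLU activation are assumptions added for the example; the excerpt itself only mentions the weighted combination.

```python
def neuron(inputs, weights, bias=0.0):
    """One artificial neuron: weighted sum of the inputs, then an activation.

    The bias and the ReLU activation are illustrative assumptions; the
    excerpt only describes combining inputs with weights.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Combine each input with its weight and add the bias.
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ReLU: pass positive totals through, clamp negative ones to zero.
    return max(0.0, total)


if __name__ == "__main__":
    print(neuron([1.0, 2.0], [0.5, 0.25], bias=1.0))
    print(neuron([1.0, 2.0], [0.5, -1.0]))
```

A full network is many such units arranged in layers, with the weights set by training rather than by hand.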
language, by its nature, has a certain fundamental precision—because in the end what it specifies can always be “unambiguously executed on a computer”. Human language can usually get away with a certain vagueness. (When we say “planet” does it include exoplanets or not, etc.?) But in computational language we have to be precise and clear about all
Stephen Wolfram • What Is ChatGPT Doing ... And Why Does It Work?
And with computers we can readily do long, computationally irreducible things.
Stephen Wolfram • What Is ChatGPT Doing ... And Why Does It Work?
and what’s a dog, and then have the network “machine learn” from these how to distinguish them. And the point is that the trained network “generalizes” from the particular examples it’s shown. Just as we’ve seen above, it isn’t simply that the network recognizes the particular pixel pattern of an example cat image it was shown; rather it’s that the
Stephen Wolfram • What Is ChatGPT Doing ... And Why Does It Work?
OK, so what do the attention heads do? Basically they’re a way of “looking back” in the sequence of tokens (i.e. in the text produced so far), and “packaging up the past” in a form that’s useful for finding the next token.
Stephen Wolfram • What Is ChatGPT Doing ... And Why Does It Work?
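The "looking back" described above can be sketched as a single attention head in which each position mixes information only from itself and earlier positions. This is a simplified sketch in plain Python, not ChatGPT's actual implementation: the names `causal_attention`, `softmax`, and `dot` are hypothetical, the scaled dot-product scoring is the standard transformer formulation rather than something stated in the excerpt, and the query, key, and value vectors are supplied directly instead of being computed by learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def causal_attention(queries, keys, values):
    """For each position i, blend values[0..i] using weights from q_i . k_j.

    Positions after i are never consulted, which is the "looking back"
    over the tokens produced so far.
    """
    dim = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Score the current position against itself and every earlier one.
        scores = [dot(q, keys[j]) / math.sqrt(dim) for j in range(i + 1)]
        weights = softmax(scores)
        # "Package up the past": weighted average of the visible values.
        mixed = [
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    outputs, weights = causal_attention(vectors, vectors, vectors)
    for position, row in enumerate(weights):
        print(position, [round(w, 3) for w in row])
```

Each row of weights is one position's view of the sequence so far: the first token can only attend to itself, while later tokens spread their weight over everything that precedes them.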
(Strictly, ChatGPT does not deal with words, but rather with “tokens”—convenient linguistic units that might be whole words, or might just be pieces like “pre” or “ing” or “ized”. Working with tokens makes it easier for ChatGPT to
Stephen Wolfram • What Is ChatGPT Doing ... And Why Does It Work?
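To make the idea of tokens as word pieces concrete, here is a toy sketch that splits a word by greedy longest match against a small vocabulary. Both the vocabulary `TOY_VOCAB` and the function `tokenize` are invented for illustration. GPT models actually use byte-pair encoding with a much larger learned vocabulary, so this shows the flavor of subword splitting, not the real algorithm.

```python
# Hypothetical toy vocabulary; real vocabularies are learned from data
# and contain tens of thousands of entries.
TOY_VOCAB = {"pre", "ing", "ized", "token", "train", "view"}


def tokenize(word, vocab=TOY_VOCAB):
    """Split a word into the longest vocabulary pieces, left to right.

    Characters not covered by any vocabulary entry become single-character
    tokens, so every input can still be represented.
    """
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then progressively shorter ones.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: fall back to one character.
            tokens.append(word[i])
            i += 1
    return tokens


if __name__ == "__main__":
    for word in ["pretraining", "tokenized", "preview", "xy"]:
        print(word, tokenize(word))
```

Because pieces such as "pre" and "ing" recur across many words, a modest vocabulary can cover words that never appeared whole in the training text.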
transformers instead introduce the notion of “attention”—and the idea of “paying attention” more to some parts of the sequence than others.