A prevailing sentiment online is that GPT-4 still does not understand what it talks about. We can argue semantics over what “understanding” truly means. I think it’s useful, at least today, to draw the line at whether GPT-4 has successfully modeled parts of the world. Is it just picking words and connecting them with correct grammar? Or does the token selection actually reflect parts of the physical world?
One of the most remarkable things I’ve heard about GPT-4 comes from an episode of This American Life titled “Greetings, People of Earth”.
Preservation only, and even then not likely any better than a linguistic historian.
But it gets tricky because LLMs only function on HUGE sets of data. LLMs are nothing more than complicated probability engines. Give one the question “What color is the sky?” and the math extracted from its massive datasets says the highest-probability answer is “Blue”. It doesn’t actually KNOW the answer; it just knows the probabilities of different words.
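To make the “probability engine” idea concrete, here is a rough toy sketch in Python. The scores are numbers I made up for illustration; they don’t come from any real model, and a real LLM scores tens of thousands of tokens using learned weights rather than a hand-written table. The only thing it shows is the last step: turn scores into probabilities and pick the most likely word.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - top) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical scores for the word following "What color is the sky?"
# These values are invented for the example, not taken from a real model.
toy_scores = {"Blue": 5.0, "Gray": 2.5, "Green": 0.5, "Purple": 0.1}

probs = softmax(toy_scores)
best = max(probs, key=probs.get)  # the highest-probability word

for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tok:>6}: {p:.3f}")
print("Picked:", best)
```

Nothing in there checks the actual sky. “Blue” wins only because it was given the highest score, which is the point being made above.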
Without large amounts of data on the dying language, current-gen LLMs won’t be accurate or able to generate reliable answers. Shoot… LLMs can barely generate reliable answers with the massive datasets they currently have.
I strongly recommend that anyone even remotely interested in LLMs read this interactive article:
https://ig.ft.com/generative-ai/
This is a great article, thanks for linking it!