Will LLMs keep improving if we throw more compute at them? OpenAI dealmaker thinks so.
Microsoft CTO Kevin Scott doubled down on his belief that so-called large language model (LLM) "scaling laws" will continue to drive AI progress, despite some in the field arguing that progress has leveled off. Scott played a key role in forging a $13 billion technology-sharing deal between Microsoft and OpenAI.
Serious analysis aside: a man with that disgrace of a beard should not be in a decision-making position. It's like he's trying to warn us at every level.
Trust me, you'll be able to take off with our arm-attachable wings, we just have to make them bigger!

Guy who rents out x machines: The x industry is far from crashing, it is in fact getting ready to really take off.
I'd be hallucinating too if I saw all that stuff they are feeding LLMs from the internet!

AI will keep improving, but pure LLM-based systems will always be prone to hallucinations and basic failures of reasoning and generalisation.
I think systems that are natively multimodal (like VLMs) may eventually do better, and systems that can engage in multi-step trains or trees of thought, or that try to capture an abstract world model (like Yann LeCun's I-JEPA), will move the needle forward as well. Metacognition and a system of motivations and objectives will help too, among other things.
But expecting a feed-forward neural network trained on the text-autocomplete task (aka an LLM) to become AGI just by feeding it more data and parameters/cores is folly. It will always be an unreliable, hallucinating mess, and feeding it enough high-quality data to overcome that will be infeasible, as will running a model large enough.
The MS CTO either doesn't know better, or he's trying to keep investors sweet.
LLM scaling laws refer to patterns explored by OpenAI researchers in 2020 showing that the performance of language models tends to improve predictably as the models get larger (more parameters), are trained on more data, and have access to more computational power (compute). The laws suggest that simply scaling up model size and training data can lead to significant improvements in AI capabilities without necessarily requiring fundamental algorithmic breakthroughs.
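For a concrete sense of what "improve predictably" means, the 2020 work referred to here (Kaplan et al., "Scaling Laws for Neural Language Models") fits test loss to simple power laws in parameter count, dataset size, and compute. Below is a minimal, illustrative sketch of the parameter-count law; the constants are approximate values from that paper and are assumptions here, not a precise reproduction of the fit.

```python
# Rough sketch of the Kaplan et al. (2020) parameter scaling law:
# test loss falls as a power law, loss(N) ~ (N_c / N) ** alpha_N,
# as the non-embedding parameter count N grows.
# ALPHA_N and N_C are approximate values from the paper, used here
# purely for illustration.

ALPHA_N = 0.076      # power-law exponent for parameter count
N_C = 8.8e13         # characteristic parameter scale

def predicted_loss(n_params: float) -> float:
    """Predicted test loss (nats/token) for a model with n_params parameters."""
    return (N_C / n_params) ** ALPHA_N

for n in (1e8, 1e9, 1e10, 1e11, 1e12):
    print(f"{n:.0e} params -> predicted loss ~ {predicted_loss(n):.2f}")
```

The scaling-law camp reads curves like this as saying loss keeps dropping as long as you keep adding parameters, data, and compute; the skeptics read them as saying each further drop costs an order of magnitude more of all three.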
If you haven't, you should read *I Am a Strange Loop*, Douglas Hofstadter's exposition of his theory of consciousness. It's quite similar, and built around the idea of recursive mental models.

Self-awareness is bootstrapped when your world model contains representations of other agents' world models that contain you: "If I take this action, they will think this of me..."
Which suggests that without a fundamental change in the algorithms/approach, this technique is approaching fairly hard boundaries in the next 10-20 years.

From https://www.astralcodexten.com/p/sam-altman-wants-7-trillion:
GPT-5 might need about 1% of the world's computers, a small power plant's worth of energy, and a lot of training data.
GPT-6 might need about 10% of the world’s computers, a large power plant’s worth of energy, and more training data than exists. Probably this looks like a town-sized data center attached to a lot of solar panels or a nuclear reactor.
GPT-7 might need all of the world’s computers, a gargantuan power plant beyond any that currently exist, and way more training data than exists. Probably this looks like a city-sized data center attached to a fusion plant.
Building GPT-8 is currently impossible. Even if you solve synthetic data and fusion power, and you take over the whole semiconductor industry, you wouldn’t come close. Your only hope is that GPT-7 is superintelligent and helps you with this, either by telling you how to build AIs for cheap, or by growing the global economy so much that it can fund currently-impossible things.
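Read literally, the progression quoted above is mostly an order-of-magnitude jump in hardware per generation. Here is a toy sketch of that arithmetic, assuming the quote's own starting point (GPT-5 at roughly 1% of the world's computers) and an assumed 10x hardware jump each generation; this is the quote's framing worked through, not a forecast.

```python
# Toy extrapolation of the quoted fractions: starting at ~1% of the
# world's computers for GPT-5 and multiplying the hardware requirement
# by 10x per generation, the curve hits 100% around GPT-7 and becomes
# physically impossible after that. Illustrative arithmetic only.

share = 0.01      # assumed: GPT-5 needs ~1% of the world's computers
generation = 5

while share <= 1.0:
    print(f"GPT-{generation}: ~{share:.0%} of the world's computers")
    share *= 10   # assumed 10x hardware jump per generation
    generation += 1

print(f"GPT-{generation}: more computers than exist -> not buildable as described")
```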
I wonder how colossally bad an investment Microsoft would need to make to sink the company's finances.
I don't think there's been anything paradigm-shifting; it's more a difference in subjective quality. It's easier to see when you compare two models side-by-side with the same inputs.

LLMs have "improved" in the last few years in ways that have given them more capabilities? I've seen the benchmarks about how every new AI trounces its competitors, only to go on to solve no new tasks that the old model was unsuitable for. I've seen articles about how you can coax ChatGPT into solving some math problems if you really put in the work. I've seen articles about how LLMs have achieved passing grades on the bar exam, only for lawyers to get in serious trouble when ChatGPT cites entirely made-up court cases. I've seen weather-prediction AI achieve higher-fidelity and more accurate forecasts, with the caveat that "we wouldn't trust this thing to predict novel extremes and unusual weather patterns." I've seen news about a professor failing his class after simply asking ChatGPT if the AI wrote the papers submitted to him. And I've seen Google's Bard tell people to put glue in their pizza cheese.
Has any of this changed? Am I missing some novel capability that AI has enabled beyond making propaganda spam easier?
I wonder what he was saying or hearing in that picture, because those are not happy eyes he has.

Will LLMs keep improving if we throw more money at them? OpenAI dealmaker has too much sunk cost to say no.