GPT-4.5 offers marginal gains in capability and poor coding performance despite 30x the cost.
Sounds like this was written by one of my old project managers, famous for nebulous KPIs.

"Everything is a little bit better and it's awesome," he wrote, "but also not exactly in ways that are trivial to point to."
"It is the first model that feels like talking to a thoughtful person to me," he wrote. He then added further down in his post, "a heads up: this isn't a reasoning model and won't crush benchmarks. it's a different kind of intelligence and there's a magic to it i haven't felt before."
OpenAI has likely known about diminishing returns in training LLMs for some time.
So we're officially at the "healing crystal woo" stage of selling AI.

Upon 4.5's release, OpenAI CEO Sam Altman did some expectation tempering on X, writing that the model is strong on vibes but low on analytical strength.
It sounds to me like what people said about GPT-4.0 vs GPT-3.5.

That sounds like what people said about Eliza, or anything that speaks in natural-sounding language.
What kind of intelligence is it? Street smarts? Can it survive on Skid Row?
Are you suggesting co-marketing with Goop? It would make a fitting bookend to the human story.
AI-powered vagina eggs - when you need to hold the future close.
"Whatever happened to those humans?"
It is indeed very easy to profit on something when you got it for free and then turned around and charged money for it.

He is wrong. People are profiting on serving DeepSeek R1 as well as distilled models.
“Once, men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them.”
Oh, they delegated all cognitive effort to black boxes they didn't understand. The black boxes said "drink raw milk and kill urself, nerds", and, because humans made sure the electric rock's highest stat was charisma, they listened!
This would be funnier if we didn't just hand HHS over to RFK.
Incidentally, 'Zitron' means 'lemon' in German.

"An AI expert who requested anonymity told Ars Technica, "GPT-4.5 is a lemon!" when comparing its reported performance to its dramatically increased price, while frequent OpenAI critic Gary Marcus called the release a "nothing burger" in a blog post (though to be fair, Marcus also seems to think most of what OpenAI does is overrated)."
Edward Zitron has been pounding the drum in 20K-word posts for a while now as well, basically saying AI:
has no viable consumer product
is currently losing money on each user, and loses more money for each new user added
has largely run out of new data to scrape
incurs massively increased costs for each new generation that do not notably improve the quality of results
has never produced results of a reasonable quality, and throwing more "compute" (as he calls it) at the problem only increases costs

His latest missive, for someone who has 35 minutes to blow: https://www.wheresyoured.at/wheres-the-money/?ref=ed-zitrons-wheres-your-ed-at-newsletter
What are the value and unit of measurement that describe one model giving massively better results than another?

There hasn't been a 'massive exponential increase in costs'. Indeed, often the opposite: for the same compute budget, you get massively better results.
Elo, benchmarks, reputation - and if you use them regularly, you find out for yourself.
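For what it's worth, the Elo scores mentioned here come from pairwise comparisons: leaderboards like LMSYS's Chatbot Arena show raters two anonymous model answers, ask which is better, and update each model's rating. A minimal sketch of the standard Elo update (the K-factor and starting ratings below are illustrative assumptions, not any leaderboard's actual parameters):

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    """Return both models' updated ratings after one head-to-head vote."""
    e_a = expected_score(r_a, r_b)           # expected win probability for A
    s_a = 1.0 if a_won else 0.0              # actual outcome for A
    return r_a + k * (s_a - e_a), r_b + k * ((1 - s_a) - (1 - e_a))

# Two models start equal; A wins one comparison and gains k/2 points.
a, b = elo_update(1000.0, 1000.0, a_won=True)
```

Note that the update conserves the sum of the two ratings, so Elo only says how models compare to one another, never how good any of them is in absolute terms - which is arguably the point of the question about units of measurement.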
The problem is they've already scraped almost all the world's data, including from TikTok. There is literally not enough remaining human-produced content in the entire world for the models to continue scaling up, which is why they're turning to synthetic data.

What's working in favor of China is TikTok et al. Chinese companies have a never-ending funnel of new data they can use to improve the system with zero cost and zero moral objections. Why bother with synthetic data? If it's any good, the Americans will use it, the Chinese can steal it (again... again) and augment all the other data they're being fed.
It's like the "Food Chain" from the Simpsons back in the 1990s. Except the human is Chinese AI companies and all the animals are the rest of the world they're stealing from.
Shorting Nvidia seems like the obvious move here.

I'm curious to see when the gravy train runs out of track. I imagine it will be similar to the dawn of the century: investors will eventually slow or stop the money train, companies will fold, the secondhand market will be flooded with high-end hardware so new startups and remaining competitors don't have to buy new, hardware companies will see sales go off a cliff, and pandemonium ensues.
Plus ça change, plus c'est la même chose. ("The more things change, the more they stay the same.")
There's a very important part of software tools that you're missing with this point: software can and will be treated as an unerring authority whenever it takes more than a second to realize it has produced an incorrect answer. People do not treat software tools like Doctor Jim, who sometimes has a long day, who sometimes misunderstands what you meant by "minor bleeding."

[...]
They are comparable to human experts at many tasks for a tiny fraction of the inference cost. People complain about hallucination, but hallucination rates are often lower than those of human experts on the same task (e.g., MDs doing a differential diagnosis). Progress continues as we devise new benchmark test sets that aren't saturated.
This may be true for 'public' data that anyone can get, but this data is a fraction of 'human-produced content', and it's the least useful, as it's full of garbage. The truly valuable and curated stuff is behind corporate walls, available to neither OpenAI nor any of their competitors.

As for synthetic data: that's just a fancy way of saying "we're feeding our bullshit machine the bullshit another bullshit machine produced."
Ya don’t say?

Satya Nadella says AI is yet to find a killer app that matches the combined impact of email and Excel.
Any time someone quotes a Rush song on Ars is a good day.