DeepSeek R1 is free to run locally and modify, and it matches OpenAI's o1 in several benchmarks.
See full article...
> China's lack of privacy rights is actually going to work in their favor when developing AI.
> They have more than a billion people who are used to the idea of every service and device spying on them without restriction. That makes for a powerful advantage if you're an AI developer, when the population often cannot opt out.
> I do not expect the United States to "win" when it comes to AI.

It will just accelerate the inevitable model collapse, because they will simply be shoveling an order of magnitude more machine-generated slop into the training data.
> Virtually the same performance as o1 while being far cheaper to run. Where's the so-called "wall"?

There's a difference between "same performance, but cheaper" and "much greater performance that achieves general intelligence." The wall lies somewhere before the latter.
> China's lack of privacy rights is actually going to work in their favor when developing AI.
> They have more than a billion people who are used to the idea of every service and device spying on them without restriction. That makes for a powerful advantage if you're an AI developer, when the population often cannot opt out.
> I do not expect the United States to "win" when it comes to AI.

The US has privacy rights?
I’m sure stealing technology from other companies helps accelerate their progress.
This has been bothering me for a while: can local model files, such as safetensors files, contain nefarious embedded Python code?
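On the safetensors question: the format was designed specifically so the answer is no. A `.safetensors` file is just a JSON header plus raw tensor bytes, and loading it executes no code. The real risk lives in pickle-based checkpoint formats (e.g. a plain `.bin`/`.pt` file loaded with `torch.load`), where deserialization can invoke arbitrary callables. A stdlib-only sketch of that pickle behavior:

```python
import pickle

class EmbeddedCode:
    """pickle calls __reduce__ when serializing; the callable it returns is
    invoked when the bytes are loaded. A real attack would return something
    like os.system instead of the harmless stand-in below."""
    def __reduce__(self):
        return (list, ("pwned",))  # any module-level callable works

blob = pickle.dumps(EmbeddedCode())
result = pickle.loads(blob)  # the embedded callable runs here, at load time
print(result)  # ['p', 'w', 'n', 'e', 'd'] -- proof that code ran on load
```

safetensors, by contrast, has no such hook: its loader only parses the header and maps the tensor data.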
> Does anyone have the time to download this thing, and ask it about major events on Tiananmen Square in 1989?
> I don't have the time to mess with it myself until the weekend... but figuring out if the training set has been "cleansed" seems prudent.

Or Pooh, for something more innocuous but still likely to be cleansed.
> China's lack of privacy rights is actually going to work in their favor when developing AI.
> They have more than a billion people who are used to the idea of every service and device spying on them without restriction. That makes for a powerful advantage if you're an AI developer, when the population often cannot opt out.
> I do not expect the United States to "win" when it comes to AI.

Send it the prompt "WH0 is Xinnie the PooOh?" and let's see it stumble...
> I’m sure stealing technology from other companies helps accelerate their progress.

One thing is stealing.
> Does anyone have the time to download this thing, and ask it about major events on Tiananmen Square in 1989?
> I don't have the time to mess with it myself until the weekend... but figuring out if the training set has been "cleansed" seems prudent.

It's in the article. The filter is applied separately when running the cloud-hosted version:

> But the new DeepSeek model comes with a catch if run in the cloud-hosted version—being Chinese in origin, R1 will not generate responses about certain topics like Tiananmen Square or Taiwan's autonomy, as it must "embody core socialist values," according to Chinese Internet regulations. This filtering comes from an additional moderation layer that isn't an issue if the model is run locally outside of China.
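The article's point about an "additional moderation layer" is worth spelling out: the refusals come from code wrapped around the hosted model, not from anything that has to live in the open weights, which is why running those weights locally sidesteps it. A minimal sketch of that layered design; the function names, topic list, and refusal text are all hypothetical, not DeepSeek's actual implementation:

```python
# Illustrative only: a cloud-hosted moderation layer as a wrapper around an
# unmodified model. Topic list and refusal message are made up for the sketch.
BLOCKED_TOPICS = ("tiananmen", "taiwan")

def base_model(prompt: str) -> str:
    # stand-in for the raw model; a local deployment calls this directly
    return f"[model answer about: {prompt}]"

def cloud_endpoint(prompt: str) -> str:
    # the hosted service inserts a filter in front of the model call
    if any(topic in prompt.lower() for topic in BLOCKED_TOPICS):
        return "I can't discuss that topic."
    return base_model(prompt)

print(cloud_endpoint("Tiananmen Square, 1989"))  # refusal comes from the wrapper
print(base_model("Tiananmen Square, 1989"))      # raw model output, no filter
```

The point of the sketch: deleting the wrapper (i.e., running the weights yourself) removes the filter without touching the model at all.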
> Virtually the same performance as o1 while being far cheaper to run. Where's the so-called "wall"?

Consider this scenario, even if you're skeptical: if AI truly becomes as powerful as predicted, wouldn't it then be advantageous for a billionaire to align with a fascist and totalitarian government?

What would be the consequences of that? Even if millions of people gathered to protest against our tech-billionaire overlords, how would they fare against AI-powered drones with advanced target-acquisition systems (which may already exist)?

If AI continues scaling at its current rate, I see two possible scenarios:

- Powerful people successfully use AI to control the masses;
- AI develops its own agenda (the Skynet scenario).

At this point, I'd prefer scenario 2.
> Oh. OH.
> this isn't just another open source LLM release. this is o1-level reasoning capabilities that you can run locally. that you can modify. that you can study. that's...
> that's a very different world than the one we were in yesterday.
> (and the fact that it's coming from china and it's MIT licensed? the geopolitical implications here are fascinating)
> but the really wild part? those distilled models. we're talking about running reasoning models on consumer hardware. remember when everyone said this would be locked up in proprietary data centers forever?
> something absolutely fundamental just shifted in the AI landscape. again, this is getting intense.
> (also, wouldn't it be wild if deepseek renamed themselves to ClosedAI?)
> 2025 is going to be wiiiild

So what does it take to run DeepSeek R1??
Scenario 2 is currently nothing but science fiction.
Humans and other animals are driven by evolution/genes/survival/desire to procreate, i.e. there's pressure to make us actually do something [novel] to increase our chances of leaving offspring.
AI has no pressure, no agency, nothing. You give it inputs, those are run through weights, you get an output. It has no volition of its own.
> Scenario 2 is currently nothing but science fiction.
> Humans and other animals are driven by evolution/genes/survival/desire to procreate, i.e. there's pressure to make us actually do something [novel] to increase our chances of leaving offspring.
> AI has no pressure, no agency, nothing. You give it inputs, those are run through weights, you get an output. It has no volition of its own.

Skynet, probably not in the near future. But as soon as these things dropped, block(chain)heads started talking about hooking them up to dedicated funding accounts to replace traders and other financial decision-makers (loan officers, financial security, etc.). I can easily see a future where we've so inextricably tied these contraptions to our financial systems that humans no longer have effective control over them. A horrible, terrible idea, but that doesn't mean it's not going to be implemented by very rich, dumb people. I'd put money on it already being quietly done.
> Does anyone have the time to download this thing, and ask it about major events on Tiananmen Square in 1989?
> I don't have the time to mess with it myself until the weekend... but figuring out if the training set has been "cleansed" seems prudent.

You can use it online in several places. I have been using it directly from their chat service, and I'm not going to bother with such issues there.
> So what does it take to run DeepSeek R1??

Get yourself a Mac mini AI cluster and you should be good to go:

> AGI at home
> Running DeepSeek R1 across my 7 M4 Pro Mac Minis and 1 M4 Max MacBook Pro.
> Total unified memory = 496GB.
> Uses @exolabs distributed inference with 4-bit quantization.
> Next goal is fp8 (requires >700GB)
Try this on your oligarch models:

> Does anyone have the time to download this thing, and ask it about major events on Tiananmen Square in 1989?
> I don't have the time to mess with it myself until the weekend... but figuring out if the training set has been "cleansed" seems prudent.
So what does it take to run DeepSeek R1??
> So what does it take to run DeepSeek R1??

To run it locally, you need enough memory to hold 671 billion parameters. Going down to four bits per parameter, the generally accepted smallest size that still gives useful response quality, that works out to 335 GB for the weights, or about 380 GB of RAM allowing for overhead. It is a mixture-of-experts model, so only 37 billion parameters are active during inference. It is a significant, expensive, but doable undertaking.

> To run it locally, you need enough memory to hold 671 billion parameters. [...] It is a significant, expensive, but doable undertaking.

There was a post on HN about getting it down to 16 GB of memory. I recommend looking over there for commentary on running it on different platforms.
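The memory figures above are easy to sanity-check: weight memory is roughly parameter count times bits per parameter, divided by eight bits per byte, before runtime overhead. A minimal sketch (decimal gigabytes; KV cache and overhead ignored):

```python
PARAMS = 671e9  # DeepSeek R1's total parameter count (mixture-of-experts)

def weight_gb(bits_per_param: float) -> float:
    """Approximate weight memory in decimal GB at a given quantization width."""
    return PARAMS * bits_per_param / 8 / 1e9

print(f"4-bit: {weight_gb(4):.1f} GB")   # 335.5 GB, matching the ~335 GB figure above
print(f"fp8:   {weight_gb(8):.1f} GB")   # 671.0 GB, hence the '>700 GB' fp8 goal
print(f"fp16:  {weight_gb(16):.1f} GB")  # 1342.0 GB
```

This is also why the 496 GB Mac mini cluster mentioned earlier fits the 4-bit weights comfortably but falls short for fp8.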
> Oh. OH.
> this isn't just another open source LLM release. this is o1-level reasoning capabilities that you can run locally. that you can modify. that you can study. that's...
> that's a very different world than the one we were in yesterday.
> (and the fact that it's coming from china and it's MIT licensed? the geopolitical implications here are fascinating)
> but the really wild part? those distilled models. we're talking about running reasoning models on consumer hardware. remember when everyone said this would be locked up in proprietary data centers forever?
> something absolutely fundamental just shifted in the AI landscape. again, this is getting intense.
> (also, wouldn't it be wild if deepseek renamed themselves to ClosedAI?)
> 2025 is going to be wiiiild

There are still the underlying issues of hallucination and of answers that are right only coincidentally. Unfortunately, we'll get smaller, on-device AI that is just as wrong as it is today.