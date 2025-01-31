
DeepSeek Outstrips Meta and Mistral To Lead Open-Source AI Race

Anything open source that threatens the would-be hegemonic AI cabal, which seems to be working toward ensuring that high-performance models exist exclusively as proprietary SaaS platforms, is generally a good thing. Now, there are still concerns over having full access to and awareness of the data sets used to train the model, and over censorship (both in the officially hosted instance accessed via API and, more importantly, in the model itself). But moving toward a future where the benefits of AI can be widely distributed through self-hosting of high-performance FOSS models and training data is moving in the right direction. Still lots of work to do, however.
 
I think this part is a good reminder about hype-news:

We want to highlight that the narrative has flipped from last month, when scaling laws were broken, we dispelled this myth, now algorithmic improvement is too fast and this too is somehow bad for Nvidia and GPUs.


https://www.ft.com/content/f24ba8d5-4c33-47ef-a91e-8f76340b08c4

Nvidia and the AI boom face a scaling problem
People thought that was bad for the industry then; now the opposite is supposed to be bad too...

We are confident that their GPU investments account for more than $500M US dollars .. They have ~150 employees

The actual article has so little to do with the summary made of it above that link....

Like the article says, GPT-4o-mini is much cheaper than DeepSeek V3, Gemini 1.5 Pro costs about the same, and Gemini Nano is cheaper. The 93.3% cheaper inference cost is not versus standard methods; it is versus the most expensive models.
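
For what it's worth, a "93.3% cheaper" figure can be converted into a "times cheaper" multiplier with one line of arithmetic, which makes it easier to compare with the other numbers thrown around in this thread. A minimal Python sketch; the function name is made up for illustration, and the only number taken from the discussion is the 93.3% figure:

```python
def percent_cheaper_to_multiplier(percent_cheaper: float) -> float:
    """Convert 'X% cheaper than a baseline' into 'N times cheaper than that baseline'."""
    if not 0 <= percent_cheaper < 100:
        raise ValueError("percent_cheaper must be in the range [0, 100)")
    # Fraction of the baseline price that is still paid after the discount.
    remaining_fraction = 1 - percent_cheaper / 100
    return 1 / remaining_fraction


multiplier = percent_cheaper_to_multiplier(93.3)
print(f"93.3% cheaper is roughly {multiplier:.1f}x cheaper than the baseline")
```

The multiplier only means something relative to whichever baseline was picked, which is the whole point above: pick the most expensive model as the baseline and the number looks huge.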

Two months ago we already had open-source models that ran on a laptop and were better than GPT-3, which cost a fortune at launch (1,200 times more than GPT now). How is this any different?

If all the press around DeepSeek had compared it with Google's performance and Google's pricing, it would not have sounded nearly as impressive. Google's Flash 2.0 is significantly cheaper and beats DeepSeek R1 in a lot of things.
 
The coverage of this feels a bit like:

- Not so long ago, a lot of people pushed a narrative that generative AI and machine learning had hit a wall in 2024: it would require too much hardware and too much power, and would always need more data to get better.

Silicon Valley giants and small players all said they were wrong, that it was the exact opposite: lots of tricks incoming, the pace at which costs are going down is incredible. But those are obviously very biased people, so we can't believe them.

Now that DeepSeek R1 has come out, instead of saying "we were wrong and got taken by surprise", all those people try to pivot as if Google would have been... the very people who wrote and published a lot of the scientific papers that inspired DeepSeek, and who provide really good, arguably better, LLMs at a much cheaper price than DeepSeek does....
 
o3-mini can pretty much match the full OpenAI o1 model in everything (and often beats it by a lot), is 14 times cheaper, and is in a different tier of speed; it is even faster than o1-mini...
https://openai.com/api/pricing/

Does anyone think that this news (had it happened next month instead of this month) would have created some discourse about Nvidia hardware...?
 
Possibly already posted so please let me know:

Last month, Meta admitted to torrenting a controversial large dataset known as LibGen, which includes tens of millions of pirated books. But details around the torrenting were murky until yesterday, when Meta's unredacted emails were made public for the first time. The new evidence showed that Meta torrented "at least 81.7 terabytes of data across multiple shadow libraries through the site Anna’s Archive, including at least 35.7 terabytes of data from Z-Library and LibGen," the authors' court filing said. And "Meta also previously torrented 80.6 terabytes of data from LibGen."

https://arstechnica.com/tech-policy...7tb-of-pirated-books-to-train-ai-authors-say/
 
LukeTbk said:
14 times cheaper [than o1]. Does anyone think that this news (had it happened next month instead of this month) would have created some discourse about Nvidia hardware...?
If o3-mini really does match the reasoning of the main o1 model, then it's a great jump, yeah. From a strictly same-tier perspective, o1-mini's pricing was $3 per million input tokens and $12 per million output tokens, compared to o3-mini's $1.10 and $4.40, respectively.
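
To put those same-tier numbers side by side, here is a rough Python sketch of the arithmetic. The four prices are the ones quoted above; the `Pricing` class and the example workload (2M input tokens, 500k output tokens) are hypothetical and only there for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD price per million tokens, as quoted in the post above."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Total USD cost for a workload with the given token counts."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


O1_MINI = Pricing(input_per_million=3.00, output_per_million=12.00)
O3_MINI = Pricing(input_per_million=1.10, output_per_million=4.40)

# Hypothetical workload, not taken from any real usage data.
old_cost = O1_MINI.cost(input_tokens=2_000_000, output_tokens=500_000)
new_cost = O3_MINI.cost(input_tokens=2_000_000, output_tokens=500_000)

print(f"o1-mini: ${old_cost:.2f}, o3-mini: ${new_cost:.2f}")
print(f"o1-mini costs about {old_cost / new_cost:.2f}x as much")
```

Since input and output prices both dropped by the same factor, the ratio comes out the same (a bit under 3x) whatever mix of input and output tokens is used.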

Though in terms of newsworthiness, company API pricing is somewhat arbitrary (OpenAI's CEO has said as much recently, mentioning one example of A/B price testing where prices differed by as much as 200% and they were still losing money), so I'm not sure it would have ruffled feathers about hardware costs per se like DeepSeek did.

Other aspects, of course, are that it's still unclear how much DeepSeek may have leveraged existing model outputs to bootstrap its training (since it could be at least partially dependent on the expensive work of other LLMs), and that the R1 news was pushed hard, including inorganically, as it became a quasi-political talking point. It was very 'splashy' news.
 
DeepSeek’s cost reduction via their Multi-head Latent Attention approach is technically impressive, but I think the broader conversation here is even more important—how open models challenge the consolidation of power in commercial AI. While there’s still a lot of justified concern about training data transparency and censorship, moves like this show that high-performance alternatives don’t need to be locked behind API paywalls. That said, it’s worth keeping a critical eye on the hype—comparisons to GPT-4 and others should be grounded in practical, consistent benchmarks.
 
It may have been a bit exaggerated/hyped at the time (the cost reduction vs. the established giants).

[Image: gemini_2-5_flash_benchmarks_apr.google.gif]


Even the small 70B distilled DeepSeek on Groq costs more than Gemini 2.5 Flash or 4.1 mini. The idea that Google was not already doing everything they did, and more, was always pure speculation (what they did was largely based on papers Google published).

stevemarkovick said:
moves like this show that high-performance alternatives don’t need to be locked behind API paywalls.
Meta already kind of showed that all along, if you have giant money backing it and some ulterior motive....
 
