29
u/Cosmic__Guy 1d ago
Meta caught everyone off guard, it came out of nowhere. Open source is back, baby!
24
u/Aaco0638 1d ago
How? This doesn’t compete with 2.5 pro, which is free, and google is close to releasing 2.5 flash (if the model in the arena is 2.5 flash, which it seems to be).
Maybe for open source yeah but it didn’t catch everyone off guard.
24
u/LmaoMyAssIsBig 1d ago
2.5 pro is a reasoning model, these are base models. How can a base model compete with a reasoning model? Mark said there will be a llama 4 reasoning model released later, maybe they are waiting for R2 to drop.
5
1
u/Seeker_Of_Knowledge2 4h ago
It is open source, and as long as the open source effort isn't abandoned, that's good.
Also, wait for their reasoning model before comparing.
2
u/NaoCustaTentar 22h ago edited 22h ago
If it's free on a free website it's a free model lol
If it also gives some messages for free on the app, it's already better than any other SOTA model. Claude 3.7 thinking and the best OpenAI models give you 0.
Not to mention it's by far the cheapest model IF you decide to pay... I get 2 TB of Google Drive/Google Photos and Gemini integration in all Google apps for R$ 48,90 (not to mention the months of free trial just by rotating accounts... damn near 1 year of all that for free btw before I ran out of accounts from the family groups xD).
OpenAI and Claude are both R$ 100+ here, never had any discounts or free trials, and no other benefits with them.
1
u/jazir5 22h ago
I wasn't speaking to the merits, I was clarifying that for the vast majority of the public, it is not free to use. The number of people who know about AI Studio is vanishingly small, and almost all of them are devs.
The API is definitely free for everyone on AI Studio, but the Gemini AI chat service, which competes with ChatGPT and is what actual users use, is 100% not fully free. You do understand the distinction between a consumer and a developer, right?
1
-1
u/FinBenton 1d ago
Doesn't seem to be anything too special, but hopefully they will have smaller versions that are good.
-13
u/Conscious-Jacket5929 1d ago
nothing impressive
25
u/imDaGoatnocap ▪️agi will run on my GPU server 1d ago
It's open source
2
-3
u/saltyrookieplayer 1d ago
Not a lot of people will be able to run this model locally anyway; at that point, does it even matter?
3
-7
u/peter_wonders ▪️LLMs are not AI, o3 is not AGI 1d ago edited 1d ago
It seems like everyone has the same secret sauce, so at this point, they are most likely just drip-feeding us updates. I cease to care anymore. Ain't nothing special. I bet everyone in Silicon Valley is snitching, too, so they know each other's schedule. It's like Marvel movies at this point. Hard pass.
7
u/Tobio-Star 1d ago edited 1d ago
We clearly need new architectures but this kind of update still excites me for some reason
-4
u/peter_wonders ▪️LLMs are not AI, o3 is not AGI 1d ago
I just don't like the fact that they're playing catch with each other and tripping on the set all the time (like Logan, who went to Google after an OpenAI stint).
6
u/Hodr 1d ago
Bro. Cease. You almost broke my brain.
1
u/peter_wonders ▪️LLMs are not AI, o3 is not AGI 1d ago
It broke mine too 😂 I'm sorry, I already edited the comment before I noticed yours.
2
u/oldjar747 1d ago
Yeah, haven't really been wowed by LLMs since the original GPT-4. Since then, a few image or image-to-video models, and multimodality. Operator was pretty cool but isn't under wide release. I don't think there's been enough focus on RAG integration. I think long context is an unnecessary distraction when RAG works just as well. The vast majority of context a model actually uses is under 32K tokens, so models should be tuned for performance there.
3
u/Neurogence 1d ago
Well said. Llama 4 could have had a context of 10 billion and it would still be mostly useless. People here are too easily impressed.
1
u/oldjar747 1d ago
What I've thought about is a dynamic form of RAG that could improve performance and answer quality over naive RAG or naive long context. Say you've got 10 million total tokens in your RAG database. Also say the model's context works best at 32K tokens. So you input a prompt, then the RAG implementation is called. The RAG system shouldn't return its entire 10 million tokens but rather the 32K tokens (or whatever threshold is set) most relevant to the prompt. I'm a big believer that highly relevant context is much stronger and will produce better answers than naive long context.
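The budgeted-retrieval idea above can be sketched roughly like this. This is a toy illustration, not anyone's actual implementation: real systems would score chunks with vector embeddings, while here a simple word-overlap scorer stands in so the example stays self-contained. The function and parameter names are made up for the sketch.

```python
# Sketch of "dynamic RAG": rank chunks by relevance to the prompt,
# then return only the best ones that fit a fixed context budget
# (e.g. 32K tokens), never the whole database.

def score(query: str, chunk: str) -> float:
    """Toy relevance score: fraction of query words present in the chunk.
    (A real system would use embedding similarity instead.)"""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / max(len(q), 1)

def retrieve(query: str, chunks: list[str], token_budget: int = 32_000) -> list[str]:
    """Return the highest-scoring chunks whose combined size fits the budget."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    selected, used = [], 0
    for ch in ranked:
        n = len(ch.split())  # crude token count for the sketch
        if used + n <= token_budget:
            selected.append(ch)
            used += n
    return selected

docs = [
    "llama 4 is a mixture of experts model",
    "bananas are rich in potassium",
    "reasoning models use long chains of thought",
]
# With a tiny budget, only the single most relevant chunk survives.
print(retrieve("llama 4 model details", docs, token_budget=10))
```

The point of the sketch is the budget loop: the store can hold 10 million tokens, but the model only ever sees the slice that scores highest for this prompt.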
1
1
55
u/The_Architect_032 ♾Hard Takeoff♾ 1d ago
All these "not that special" guys in the comments seem awfully suspicious... Why downplay a free open source model that beats every other model? Or, more likely, comes close to equal, because I don't trust benchmarks; but still, it's open source, multimodal, and beats DeepSeek.