Meta dropped Llama 3 on April 18th and the open source AI community responded immediately. Within 24 hours it was the top trending repository on GitHub. Within a week it was running on everything from MacBook Pros to enterprise GPU clusters. Here is why this one is different from previous open source model releases.

The capability jump

Llama 2 was a solid open source model but nobody was seriously claiming it competed with GPT-3.5 Turbo, let alone GPT-4. It had a ceiling that made it practical for specific use cases but not a genuine alternative for demanding tasks. Llama 3 70B crossed a threshold. Multiple independent evaluations put it above GPT-3.5 Turbo on reasoning and coding benchmarks. That matters because GPT-3.5 Turbo powered ChatGPT for most of its first year. The capability that amazed people in late 2022 is now open source and downloadable.

The training approach

Meta made significant investments in data quality for Llama 3. They filtered the training data to a higher standard of factual accuracy and removed low-quality web content more aggressively than in previous versions. The instruction tuning for the instruct variant also improved substantially. The models are better at following multi-step instructions and less likely to confabulate on common knowledge questions.

Meta also announced that a 400B+ parameter variant is still in training and will be released later in 2024. That is the scale that could genuinely challenge GPT-4-class models. Llama 3 as released is a step change; Llama 3 with the larger variant will be the real test of whether open source can match the frontier.

The ecosystem response

Ollama, llama.cpp, Hugging Face, Groq, Together AI, Replicate: every major inference platform had Llama 3 running within hours of the weights release. The ecosystem has matured to the point that a new model release translates immediately into deployment options across local, cloud, and API form factors.
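The local path illustrates how quickly the weights became usable. As a minimal sketch, assuming the Ollama route with its default setup (server on port 11434, the `llama3` tag already pulled via `ollama pull llama3`), querying the model from Python needs nothing beyond the standard library:

```python
import json
import urllib.request

# Sketch only: assumes a local Ollama server on its default port (11434)
# with the llama3 model pulled. Endpoint and payload shape follow
# Ollama's /api/generate API.

def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Construct the HTTP request for a single (non-streaming) completion."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def generate(prompt: str) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires the Ollama server to be running):
#   print(generate("Summarize Llama 3's release in one sentence."))
```

The same request shape works against any host running Ollama, which is part of why the local-to-cloud handoff was so fast: the tooling contract stays constant while the hardware underneath changes.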

Meta's decision to maintain open-weights releases creates a flywheel. The ecosystem investment in tooling, hosting, and fine-tuning for Llama models compounds with each release. That is a meaningful advantage over closed models, and it is accumulating.