view article Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand 3 days ago • 40
Running on CPU Upgrade Featured 2.53k The Smol Training Playbook 📚 2.53k The secrets to building world-class LLMs
Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1 • 304
gpt-oss Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7 • 389
view article Article OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models Jul 18 • 50