SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 7 days ago • 153
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 28 days ago • 54
most ducked models 🦆🦆🦆 Collection https://x.com/jeremyphoward/status/1881264223646576786 • 5 items • Updated 22 days ago • 3
Article 🐺🐦⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark By wolfram • Jan 2 • 39
Article Fine-tune a SmolLM on domain-specific synthetic data from a LLM By davidberenstein1957 • Jan 3 • 32
Article Bridging the Gap Between Physical Numerical Simulations and Machine Learning: Introducing The Well By rubenohana • Dec 2, 2024 • 17
Article Halo: Open Source Health Tracking with Wearables By cyrilzakka • Nov 19, 2024 • 106
Article Releasing Outlines-core 0.1.0: structured generation in Rust and Python Oct 22, 2024 • 44
Article Democratization of AI, Open Source, and AI Auditing: Thoughts from the DisinfoCon Panel in Berlin By frimelle • Oct 8, 2024 • 6
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated about 17 hours ago • 294
MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures Paper • 2406.06565 • Published Jun 3, 2024 • 9
🎭 Avatars Collection The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 70 items • Updated Dec 24, 2024 • 81
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 91