PixMo Collection • A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items
Qwen2.5-1M Collection • The long-context version of Qwen2.5, supporting context lengths of up to 1M tokens • 2 items
Article • Train 400x faster Static Embedding Models with Sentence Transformers
Article • Introducing smolagents: simple agents that write actions in code • Dec 31, 2024
SmolVLM Collection • State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated Dec 22, 2024
Qwen2.5-VL Collection • Vision-language model series based on Qwen2.5 • 3 items
MARS: Unleashing the Power of Variance Reduction for Training Large Models • Paper • arXiv:2411.10438 • Published Nov 15, 2024
Multimodal Autoregressive Pre-training of Large Vision Encoders • Paper • arXiv:2411.14402 • Published Nov 21, 2024
SmolLM2 Collection • State-of-the-art compact LLMs for on-device applications in 1.7B, 360M, and 135M parameter variants • 16 items
Granite 3.0 Language Models Collection • A series of language models trained by IBM and released under the Apache 2.0 license, including both base pretrained and instruct models • 8 items • Updated Dec 18, 2024
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases • Paper • arXiv:2402.14905 • Published Feb 22, 2024