Article From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages By Steveeeeeeen and 1 other • about 10 hours ago • 15
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated Dec 13, 2024 • 51
Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • 13 days ago • 24
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 4 days ago • 167
Qwen2.5 Collection The Qwen 2.5 models are a series of AI models trained on 18 trillion tokens, supporting 29 languages and offering advanced features such as instructio • 33 items • Updated Oct 12, 2024 • 7
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 16 days ago • 337
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15, 2024 • 73
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • 19 days ago • 62
InternVL2.5-MPO Collection Enhancing the Reasoning Ability of MLLMs via Mixed Preference Optimization • 16 items • Updated 13 days ago • 26
Article Yay! Organizations can now publish blog Articles By huggingface and 3 others • 22 days ago • 33