🔥 Video AI is taking over! Out of 17 papers dropped on Hugging Face today, 6 are video-focused - from Sliding Tile Attention to On-device Sora. The race for next-gen video tech is heating up! 🎬🚀
📢 SmolLM2 paper released! Learn how the 🤗 team built one of the best small language models: from data choices to training insights. Check out our findings and share your thoughts! 🤏💡
Xwen 🔥 a series of open models based on Qwen2.5 models, developed by a brilliant research team of PhD students from the Chinese community. shenzhi-wang/xwen-chat-679e30ab1f4b90cfa7dbc49e ✨ 7B/72B ✨ Apache 2.0 ✨ Xwen-72B-Chat outperformed DeepSeek V3 on Arena Hard Auto
Small but mighty: 82M parameters, runs locally, speaks multiple languages. The best part? It's Apache 2.0 licensed! This could unlock so many possibilities ✨
🚀 The open source community is unstoppable: 4M total downloads for DeepSeek models on Hugging Face, with 3.2M coming from the +600 models created by the community.
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:
- Original release: 8 models, 540K downloads. Just the beginning...
- The community turned those open-weight models into +550 NEW models on Hugging Face. Total downloads? 2.5M—nearly 5X the originals.
The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.
When you empower builders, innovation explodes. For everyone. 🚀
The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version — 1M downloads alone.
✨ Launched All-Scenario Reasoning Model (language, visual, and search reasoning capabilities) , with medical expertise as one of its key highlights. https://ying.baichuan-ai.com/chat
✨ Released Baichuan-M1-14B Medical LLM on the hub Available in both Base and Instruct versions, support English & Chinese.