Article What is test-time compute and how to scale it? By Kseniase and 1 other • 5 days ago • 16
Improving Transformer World Models for Data-Efficient RL Paper • 2502.01591 • Published 8 days ago • 9
Reasoning Datasets Collection Distilled synthetic Reasoning datasets • 7 items • Updated 10 days ago • 50
Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! 20 days ago • 124
Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments Paper • 2501.10893 • Published 24 days ago • 23
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 255
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published Jan 4 • 90
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 92
Article Fine-tune ModernBERT for text classification using synthetic data By davidberenstein1957 • Dec 30, 2024 • 32
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published Dec 24, 2024 • 37
DeTikZify Collection Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ • 11 items • Updated Dec 4, 2024 • 7
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 79
Article Rethinking Backpropagation: Thoughts on What's Wrong with Backpropagation By Jaward • Dec 2, 2024 • 5
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 130
Article 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs By wolfram • Dec 4, 2024 • 76
Cut Your Losses in Large-Vocabulary Language Models Paper • 2411.09009 • Published Nov 13, 2024 • 45