On the Emergence of Thinking in LLMs I: Searching for the Right Intuition • Paper • 2502.06773 • Published 1 day ago • 1
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper • 2501.17161 • Published 14 days ago • 102
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling • Paper • 2502.06703 • Published 1 day ago • 66
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning • Paper • 2501.12948 • Published 20 days ago • 315
UI-TARS: Pioneering Automated GUI Interaction with Native Agents • Paper • 2501.12326 • Published 21 days ago • 49
PaSa: An LLM Agent for Comprehensive Academic Paper Search • Paper • 2501.10120 • Published 26 days ago • 43
From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning • Paper • 2411.03817 • Published Nov 6, 2024 • 1
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning • Paper • 2406.11896 • Published Jun 14, 2024 • 20
FAST: Efficient Action Tokenization for Vision-Language-Action Models • Paper • 2501.09747 • Published 26 days ago • 23
Do generative video models learn physical principles from watching videos? • Paper • 2501.09038 • Published 28 days ago • 32
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models • Paper • 2501.09686 • Published 26 days ago • 36
Learnings from Scaling Visual Tokenizers for Reconstruction and Generation • Paper • 2501.09755 • Published 26 days ago • 34
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot • Paper • 2501.09012 • Published 27 days ago • 10
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents • Paper • 2501.08828 • Published 27 days ago • 30
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking • Paper • 2501.09751 • Published 26 days ago • 47