Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling • arXiv:2502.06703 • Published 1 day ago • 53 upvotes
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach • arXiv:2502.05171 • Published 4 days ago • 40 upvotes
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 • arXiv:2502.03544 • Published 6 days ago • 37 upvotes
Great Models Think Alike and this Undermines AI Oversight • arXiv:2502.04313 • Published 5 days ago • 24 upvotes
TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets • arXiv:2502.01506 • Published 8 days ago • 31 upvotes
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model • arXiv:2502.02737 • Published 7 days ago • 153 upvotes
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding • arXiv:2412.10302 • Published Dec 13, 2024 • 16 upvotes
Reward-Guided Speculative Decoding for Efficient LLM Reasoning • arXiv:2501.19324 • Published 11 days ago • 34 upvotes
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs • arXiv:2501.18585 • Published 12 days ago • 51 upvotes
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer • arXiv:2501.18427 • Published 12 days ago • 16 upvotes
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch • arXiv:2501.18512 • Published 12 days ago • 25 upvotes
Large Language Models Think Too Fast To Explore Effectively • arXiv:2501.18009 • Published 13 days ago • 22 upvotes
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training • arXiv:2501.18511 • Published 12 days ago • 17 upvotes
Optimizing Large Language Model Training Using FP4 Quantization • arXiv:2501.17116 • Published 14 days ago • 33 upvotes
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • arXiv:2501.17161 • Published 14 days ago • 101 upvotes