Victor Mustar PRO

victor · victormustar

AI & ML interests

Building the UX of this website

Recent Activity

liked a model about 7 hours ago

tomg-group-umd/huginn-0125

liked a Space about 8 hours ago

FunAudioLLM/InspireMusic

liked a model about 17 hours ago

ibm-granite/granite-vision-3.1-2b-preview

victor's activity

upvoted a paper 1 day ago

PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

Paper • 2502.01584 • Published 8 days ago • 9

upvoted an article 5 days ago

G2P Shrinks Speech Models

Article • 6 days ago • 23

upvoted a paper 5 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 7 days ago • 153

upvoted an article 5 days ago

Mastering Long Contexts in LLMs with KVPress

Article • 19 days ago • 62

upvoted 4 papers 7 days ago

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Paper • 2501.17703 • Published 13 days ago • 51

s1: Simple test-time scaling

Paper • 2501.19393 • Published 11 days ago • 98

OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Paper • 2502.01061 • Published 8 days ago • 168

The Differences Between Direct Alignment Algorithms are a Blur

Paper • 2502.01237 • Published 8 days ago • 109

upvoted an article 9 days ago

Open-R1: Update #1

Article • 10 days ago • 268

upvoted a paper 11 days ago

GuardReasoner: Towards Reasoning-based LLM Safeguards

Paper • 2501.18492 • Published 12 days ago • 80

upvoted 3 articles 11 days ago

How biased is Whisper ? Evaluating Whisper Models for Robustness to Diverse English Accents

Article • 13 days ago • 16

🅰️ℹ️ 1️⃣0️⃣1️⃣ The Keys to Prompt Optimization

Article • 13 days ago • 4

Anthropic CEO: is DeepSeek-R1 a revolution in AI?

Article • 12 days ago • 6

upvoted a collection 11 days ago

R1 Multilingual

5 items • Updated 11 days ago • 9

upvoted a collection 12 days ago

Tulu 3 Models

All models released with Tulu 3 -- state of the art open post-training recipes. • 10 items • Updated about 17 hours ago • 90

upvoted a paper 14 days ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 17 days ago • 54

upvoted 2 articles 14 days ago

Welcome to Inference Providers on the Hub 🔥

Article • 15 days ago • 319

Open-R1: a fully open reproduction of DeepSeek-R1

Article • 15 days ago • 705

upvoted a collection 15 days ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 3 items • Updated 15 days ago • 336

upvoted a paper 15 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 20 days ago • 314