Simeon Emanuilov PRO

s-emanuilov

AI & ML interests

Software Engineer & Ph.D. candidate | Specializing in ML/DL system development & applying AI to solve real-world business problems.

Recent Activity

replied to their post about 3 hours ago

Tutorial 💥 Training a non-English reasoning model with GRPO and Unsloth

I wanted to share my experiment with training reasoning models in languages other than English/Chinese. Using Llama 3.1 8B as base, GRPO trainer from trl, and Unsloth optimizations, I got a working prototype in Bulgarian after ~5 hours on an L40S GPU. The approach should work for any language where the base model has some pre-training coverage.

Full code and tutorial here: https://unfoldai.com/reasoning-in-a-non-english-language/

The model itself: https://huggingface.co/s-emanuilov/LLMBG-Llama-3.1-8B-BG-Reasoning-v0.1

I hope this helps anyone looking to build reasoning models in their language.
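To illustrate the kind of reward signal a GRPO setup like this might use, here is a minimal sketch, not the code from the linked tutorial. It assumes a hypothetical output layout with `<reasoning>` and `<answer>` tags, an arbitrary 0.5/0.5 weighting, and the `(completions, **kwargs)` calling convention that trl's GRPO trainer is understood to use for custom reward functions. The names `FORMAT_RE`, `cyrillic_ratio`, and `language_format_reward` are made up for this example.

```python
import re

# Hypothetical tag names; the linked tutorial may use different ones.
FORMAT_RE = re.compile(
    r"<reasoning>(.+?)</reasoning>\s*<answer>(.+?)</answer>", re.DOTALL
)


def cyrillic_ratio(text: str) -> float:
    """Fraction of alphabetic characters in the basic Cyrillic block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cyrillic = sum(1 for ch in letters if "\u0400" <= ch <= "\u04FF")
    return cyrillic / len(letters)


def language_format_reward(completions, **kwargs):
    """Return one score per completion.

    0.0 if the tag layout is missing, otherwise 0.5 for the layout plus
    up to 0.5 depending on how much of the reasoning is written in Cyrillic.
    The weights are an assumption chosen for illustration.
    """
    rewards = []
    for completion in completions:
        # Accept plain strings or chat-style [{"role": ..., "content": ...}] lists.
        text = completion if isinstance(completion, str) else completion[0]["content"]
        match = FORMAT_RE.search(text)
        if match is None:
            rewards.append(0.0)
            continue
        rewards.append(0.5 + 0.5 * cyrillic_ratio(match.group(1)))
    return rewards
```

A function of this shape could be passed to the trainer as one of its reward functions, so that completions reasoning in the target language and following the expected layout score higher within each sampled group. For another language, the Cyrillic range check would need to be swapped for a suitable script or language detector.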

replied to their post 2 days ago

s-emanuilov's activity

upvoted a paper 1 day ago

Fast Video Generation with Sliding Tile Attention

Paper • 2502.04507 • Published 5 days ago • 42

upvoted a paper 2 days ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 90

upvoted a paper 6 days ago

Executable Code Actions Elicit Better LLM Agents

Paper • 2402.01030 • Published Feb 1, 2024 • 62

upvoted an article 6 days ago

Article

Open-source DeepResearch – Freeing our search agents

8 days ago • 906

upvoted a collection 8 days ago

llama.vim

Collection

upvoted an article 9 days ago

Article

Finally, a Replacement for BERT: Introducing ModernBERT

Dec 19, 2024 • 531

upvoted an article 14 days ago

Article

Welcome to Inference Providers on the Hub 🔥

15 days ago • 319

upvoted a collection 16 days ago

Qwen2.5-1M

Collection

The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 16 days ago • 99

upvoted an article 19 days ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

20 days ago • 124

upvoted a paper 19 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 20 days ago • 314

upvoted 2 papers 20 days ago

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published 22 days ago • 90

Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement

Paper • 2501.12273 • Published 21 days ago • 14

upvoted an article 22 days ago

Article

Yay! Organizations can now publish blog Articles

and 3 others • 22 days ago • 33

upvoted a collection 22 days ago

DeepSeek R1 (All Versions)

Collection

DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 3 days ago • 167

upvoted a paper 22 days ago

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 26 days ago • 105

upvoted a collection 25 days ago

Jan 17 Releases ❄️

Collection

Models and datasets of the second week of Jan 2025. • 23 items • Updated 25 days ago • 10

upvoted 2 papers 25 days ago

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 26 days ago • 67

OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking

Paper • 2501.09751 • Published 26 days ago • 47

upvoted 2 papers 26 days ago

MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents

Paper • 2501.08828 • Published 27 days ago • 30

RepVideo: Rethinking Cross-Layer Representation for Video Generation

Paper • 2501.08994 • Published 27 days ago • 15