Flyfish Xu

flyfishxu

https://flyfishxu.com

AI & ML interests

PyTorch, MLX, Apple Silicon

Recent Activity

updated a model about 19 hours ago

mlx-community/DeepSeek-R1-2bit

liked a model 1 day ago

mlx-community/DeepSeek-R1-2bit

reacted to s-emanuilov's post with 🔥 1 day ago

Tutorial 💥 Training a non-English reasoning model with GRPO and Unsloth

I wanted to share my experiment with training reasoning models in languages other than English/Chinese.

Using Llama 3.1 8B as base, GRPO trainer from trl, and Unsloth optimizations, I got a working prototype in Bulgarian after ~5 hours on an L40S GPU. The approach should work for any language where the base model has some pre-training coverage.

Full code and tutorial here: https://unfoldai.com/reasoning-in-a-non-english-language/

The model itself: https://huggingface.co/s-emanuilov/LLMBG-Llama-3.1-8B-BG-Reasoning-v0.1

I hope this helps anyone looking to build reasoning models in their language.

Organizations

flyfishxu's activity

liked a model 1 day ago

mlx-community/DeepSeek-R1-2bit

Updated 1 day ago • 2

liked 2 models 3 months ago

google/gemma-2-27b-it

Text Generation • Updated Aug 27, 2024 • 215k downloads • 518 likes

meta-llama/Llama-3.1-8B-Instruct

Text Generation • Updated Sep 25, 2024 • 5.89M downloads • 3.6k likes