Hugging Face
Flyfish Xu
flyfishxu
0 followers · 5 following
https://flyfishxu.com
flyfishxu
AI & ML interests
PyTorch, MLX, Apple Silicon
Recent Activity
updated a model about 19 hours ago: mlx-community/DeepSeek-R1-2bit
liked a model 1 day ago: mlx-community/DeepSeek-R1-2bit
reacted to s-emanuilov's post with 🔥 1 day ago
Tutorial 💥 Training a non-English reasoning model with GRPO and Unsloth

I wanted to share my experiment with training reasoning models in languages other than English/Chinese. Using Llama 3.1 8B as base, GRPO trainer from trl, and Unsloth optimizations, I got a working prototype in Bulgarian after ~5 hours on an L40S GPU. The approach should work for any language where the base model has some pre-training coverage.

Full code and tutorial here: https://unfoldai.com/reasoning-in-a-non-english-language/
The model itself: https://huggingface.co/s-emanuilov/LLMBG-Llama-3.1-8B-BG-Reasoning-v0.1

I hope this helps anyone looking to build reasoning models in their language.
Organizations
flyfishxu's activity
liked a model 1 day ago
mlx-community/DeepSeek-R1-2bit • Updated 1 day ago • 2
liked 2 models 3 months ago
google/gemma-2-27b-it • Text Generation • Updated Aug 27, 2024 • 215k • 518
meta-llama/Llama-3.1-8B-Instruct • Text Generation • Updated Sep 25, 2024 • 5.89M • 3.6k