Gilbert Bands PRO
dbands
AI & ML interests
None yet
Recent Activity
replied to s-emanuilov's post 2 days ago
Tutorial 🔥 Training a non-English reasoning model with GRPO and Unsloth
I wanted to share my experiment with training reasoning models in languages other than English/Chinese.
Using Llama 3.1 8B as base, GRPO trainer from trl, and Unsloth optimizations, I got a working prototype in Bulgarian after ~5 hours on an L40S GPU. The approach should work for any language where the base model has some pre-training coverage.
Full code and tutorial here: https://unfoldai.com/reasoning-in-a-non-english-language/
The model itself: https://huggingface.co/s-emanuilov/LLMBG-Llama-3.1-8B-BG-Reasoning-v0.1
I hope this helps anyone looking to build reasoning models in their language.
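As a rough illustration of the kind of reward signal a GRPO run like this might use, here is a minimal sketch of two reward functions written as plain Python callables that take a list of completion strings and return one float per completion, which is the shape trl's GRPO trainer accepts for custom rewards. The tag names `<reasoning>` and `<answer>`, the function names, and the Cyrillic-share heuristic are assumptions made for illustration; they are not taken from the linked tutorial.

```python
import re

# Hypothetical output layout; the post does not state which tags the model uses.
FORMAT_RE = re.compile(
    r"^\s*<reasoning>.+?</reasoning>\s*<answer>.+?</answer>\s*$",
    re.DOTALL,
)
# Matches simple opening/closing tags so they can be excluded from language scoring.
TAG_RE = re.compile(r"</?\w+>")


def format_reward(completions, **kwargs):
    """Return 1.0 for each completion matching the assumed tag layout, else 0.0."""
    return [1.0 if FORMAT_RE.match(text) else 0.0 for text in completions]


def cyrillic_ratio_reward(completions, **kwargs):
    """Return the share of alphabetic characters that fall in the Cyrillic block.

    This is an assumed proxy for "the answer is in Bulgarian"; it cannot tell
    Bulgarian apart from other languages written in Cyrillic.
    """
    scores = []
    for text in completions:
        body = TAG_RE.sub("", text)  # ignore the Latin-letter tag names
        letters = [ch for ch in body if ch.isalpha()]
        cyrillic = [ch for ch in letters if "\u0400" <= ch <= "\u04FF"]
        scores.append(len(cyrillic) / len(letters) if letters else 0.0)
    return scores
```

Functions like these would typically be passed together as a list of reward functions when constructing the trainer, so that each sampled completion is scored on both structure and language.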
updated a model 2 days ago
dbands/Qwen2.5-Coder-14B-Instruct-reason-gguf
updated a model 2 days ago
dbands/Qwen2.5-Coder-14B-Instruct-reason
Organizations
dbands's activity
Balanced data set?
#2 opened 5 months ago by dbands
Thank you for making my model available.
#1 opened 6 months ago by dbands
Running model on Hugging Face Inference Endpoints
1
#3 opened 11 months ago by dbands
Disabled autocast
9
#109 opened 12 months ago by miguelcarv
Would you live in one of these if you decided to adopt the tiny house lifestyle?
2
#7070 opened over 2 years ago by facelesswoman
Futuristic Hell Post Apocalyptic City
1
#7008 opened over 2 years ago by SlaneCarsin