Gilbert Bands PRO

dbands

https://www.linkedin.com/in/deon-bands-business-architect/

Recent Activity

replied to s-emanuilov's post 2 days ago

Tutorial 💥 Training a non-English reasoning model with GRPO and Unsloth

I wanted to share my experiment with training reasoning models in languages other than English/Chinese. Using Llama 3.1 8B as base, GRPO trainer from trl, and Unsloth optimizations, I got a working prototype in Bulgarian after ~5 hours on an L40S GPU. The approach should work for any language where the base model has some pre-training coverage.

Full code and tutorial here: https://unfoldai.com/reasoning-in-a-non-english-language/

The model itself: https://huggingface.co/s-emanuilov/LLMBG-Llama-3.1-8B-BG-Reasoning-v0.1

I hope this helps anyone looking to build reasoning models in their language.

updated a model 2 days ago

dbands/Qwen2.5-Coder-14B-Instruct-reason-gguf

updated a model 2 days ago

dbands/Qwen2.5-Coder-14B-Instruct-reason

dbands's activity

New activity in Afterglow777/chemical_dpo_exp_dataset 5 months ago

Balanced data set?

#2 opened 5 months ago by

New activity in RichardErkhov/dbands_-_ChemWiz_16bit-gguf 6 months ago

Thank you for making my model available.

#1 opened 6 months ago by

New activity in rombodawg/Everyone-Coder-33b-Base 11 months ago

Running model on Hugging Face Inference Endpoints

#3 opened 11 months ago by

New activity in microsoft/phi-2 12 months ago

Disabled autocast

#109 opened 12 months ago by

New activity in dalle-mini/dalle-mini over 2 years ago

Would you live in one of these if you decided to adopt the tiny house lifestyle?

#7070 opened over 2 years ago by

Futuristic Hell Post Apocalyptic City

#7008 opened over 2 years ago by