Blurred-Thoughts Supervised-Finetuning 🙈
After hours of working with GitHub Copilot to organize the code, I'm excited to announce the release of Blurred-Thoughts Supervised-Finetuning (BT-SFT), a new method for fine-tuning LLMs to produce more diverse and creative responses.
BT-SFT introduces:
✅ A smart tokenization method that randomly masks tokens within <think> ... </think> tags, encouraging the model to generate diverse responses that align better with its own probability distribution instead of memorizing the thought process from distilled data (sketched below).
✅ A reward function that ensures responses are well-structured.
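For anyone curious how the thought-blurring idea can be expressed in code, here is a minimal Python sketch, not the repository's actual implementation: the token ids, masking probability, and function names are hypothetical. It drops a random fraction of label tokens inside <think> ... </think> from the SFT loss, and adds a toy structural check standing in for the reward function.

```python
import random
import re
from typing import List

IGNORE_INDEX = -100  # label value ignored by PyTorch cross-entropy loss


def blur_thought_labels(
    input_ids: List[int],
    labels: List[int],
    think_start_id: int,      # hypothetical token id for "<think>"
    think_end_id: int,        # hypothetical token id for "</think>"
    mask_prob: float = 0.3,   # hypothetical masking probability
) -> List[int]:
    """Randomly exclude a fraction of the tokens between <think> and
    </think> from the loss, so the model is not forced to reproduce the
    distilled thought process verbatim."""
    blurred = list(labels)
    inside_think = False
    for i, tok in enumerate(input_ids):
        if tok == think_start_id:
            inside_think = True
        elif tok == think_end_id:
            inside_think = False
        elif inside_think and random.random() < mask_prob:
            blurred[i] = IGNORE_INDEX  # drop this thought token from the loss
    return blurred


def structure_reward(response: str) -> float:
    """Toy structural check: reward 1.0 if the response contains a
    well-formed <think> ... </think> block followed by a final answer."""
    pattern = r"\s*<think>.*?</think>\s*\S.*"
    return 1.0 if re.fullmatch(pattern, response, flags=re.DOTALL) else 0.0
```

The key design choice in this sketch is that "blurring" happens on the labels only, so the model still sees the full thought as context but is graded on a random subset of it.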
Explore and contribute to the project in my GitHub repository:
https://github.com/mkurman/blurred-thoughts-SFT
Keep me updated on your experiments with BT-SFT! 🐐