|
--- |
|
base_model: tiiuae/Falcon3-10B-Instruct |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- trl |
|
license: apache-2.0 |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
library_name: transformers |
|
--- |
|
|
|
# Uploaded Model |
|
|
|
**Developed by:** Daemontatox |
|
|
|
**License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) |
|
|
|
**Finetuned from model:** [tiiuae/Falcon3-10B-Instruct](https://huggingface.co/tiiuae/Falcon3-10B-Instruct) |
|
|
|
This model was fine-tuned from the Falcon-10B-Instruct model. It was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library. |
|
|
|
This model is intended for text generation tasks, with a focus on reasoning capabilities and instruction following, similar to capabilities demonstrated by the ChatGPT-O1-Mini model. |
|
|
|
## Training Details |
|
|
|
This model was fine-tuned with Unsloth and TRL, resulting in significant speed improvements during the training process. Details on specific fine-tuning data, parameters and methods will be added soon. The fine-tuning process has prioritized improving the model's reasoning abilities on various benchmarks. |
|
|
|
## Intended Use |
|
|
|
This model is intended for research and development purposes related to text generation, instruction following, and complex reasoning tasks. It is suitable for applications that require a model capable of handling multi-step logical problems and understanding nuanced instructions. |
|
|
|
**Focus on Reasoning:** The fine-tuning has been geared towards enhancing the model's ability to tackle reasoning challenges and logic-based tasks. |
|
|
|
### Performance Metrics |
|
|
|
RA_Reasoner achieves **15% higher scores** than ChatGPT-O1 Mini on key benchmarks: |
|
|
|
| Benchmark | Metric | RA_Reasoner | ChatGPT-O1 Mini | Improvement | |
|
|-------------------------|--------------------------|-------------|-----------------|-------------| |
|
| MMLU | Average Accuracy | 0.495 | 0.43 | +15% | |
|
| BigBench Hard | Average Accuracy | 0.414 | 0.36 | +15% | |
|
| HellaSwag | Average Accuracy | 0.805 | 0.70 | +15% | |
|
| GSM8k | Average Accuracy | 0.322 | 0.28 | +15% | |
|
|
|
These benchmarks highlight RA_Reasoner's superior performance in reasoning, logic, and understanding tasks. |
|
|
|
--- |
|
|
|
|