Rauhan/llama-3.2-3B-GRPO-GSM325

Text Generation

reinforcement-learning

mathematical-reasoning

text-generation-inference

Inference Endpoints


2 contributors

History: 11 commits

Adding `safetensors` variant of this model (#2)

269edac verified about 10 hours ago

| File | Size | Storage | Last commit | Updated |
| --- | --- | --- | --- | --- |
| .gitattributes | 1.57 kB | | Upload tokenizer | about 14 hours ago |
| README.md | 4.04 kB | | Update README.md | about 10 hours ago |
| config.json | 991 Bytes | | Trained with Unsloth | about 14 hours ago |
| generation_config.json | 166 Bytes | | Trained with Unsloth | about 14 hours ago |
| model-00001-of-00002.safetensors | 4.97 GB | LFS | Adding `safetensors` variant of this model (#2) | about 10 hours ago |
| model-00002-of-00002.safetensors | 1.46 GB | LFS | Adding `safetensors` variant of this model (#2) | about 10 hours ago |
| model.safetensors.index.json | 21.9 kB | | Adding `safetensors` variant of this model (#2) | about 10 hours ago |
| pytorch_model-00001-of-00002.bin | 4.97 GB | LFS | Trained with Unsloth | about 14 hours ago |
| pytorch_model-00002-of-00002.bin | 1.46 GB | LFS | Trained with Unsloth | about 14 hours ago |
| pytorch_model.bin.index.json | 20.9 kB | | Trained with Unsloth | about 14 hours ago |
| special_tokens_map.json | 454 Bytes | | Upload tokenizer | about 14 hours ago |
| tokenizer.json | 17.2 MB | LFS | Upload tokenizer | about 14 hours ago |
| tokenizer_config.json | 54.7 kB | | Upload tokenizer | about 14 hours ago |

Detected Pickle imports (3) in each of pytorch_model-00001-of-00002.bin and pytorch_model-00002-of-00002.bin:
- "torch.HalfStorage",
- "torch._utils._rebuild_tensor_v2",
- "collections.OrderedDict"
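
The listing above carries the model weights twice: as pickle-based `pytorch_model-*.bin` shards (flagged with detected pickle imports) and as `*.safetensors` shards added in commit #2. Because unpickling can execute code on load while the safetensors format does not use pickle, a consumer may prefer to fetch only the safetensors variant. The following is a minimal sketch of that selection, assuming the file names shown in the listing; `REPO_FILES`, `PICKLE_PATTERNS`, and `select_safe_files` are hypothetical names introduced for illustration and are not part of the repository or of any library.

```python
from fnmatch import fnmatch

# File names copied from the repository listing above.
REPO_FILES = [
    ".gitattributes",
    "README.md",
    "config.json",
    "generation_config.json",
    "model-00001-of-00002.safetensors",
    "model-00002-of-00002.safetensors",
    "model.safetensors.index.json",
    "pytorch_model-00001-of-00002.bin",
    "pytorch_model-00002-of-00002.bin",
    "pytorch_model.bin.index.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]

# Assumption: these patterns cover the pickle-based checkpoint files in this repo.
PICKLE_PATTERNS = ("*.bin", "*.bin.index.json")


def select_safe_files(files):
    """Drop pickle-based checkpoint files when a safetensors variant is present."""
    has_safetensors = any(name.endswith(".safetensors") for name in files)
    if not has_safetensors:
        # No safetensors weights available, so keep the listing unchanged.
        return list(files)
    return [
        name
        for name in files
        if not any(fnmatch(name, pattern) for pattern in PICKLE_PATTERNS)
    ]


selected = select_safe_files(REPO_FILES)
for name in selected:
    print(name)
```

A filter like this could feed a download step, for example as the `allow_patterns` argument of `huggingface_hub.snapshot_download`; whether that fits depends on the tooling in use.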