Built with Axolotl

ae8dd91a-3a8d-4629-a678-b204a563ff34

This model is a fine-tuned version of katuni4ka/tiny-random-olmo-hf; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 10.2705

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.000207
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (8-bit, via bitsandbytes) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • training_steps: 500
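
For reproducibility, here is a minimal sketch of the same settings expressed as transformers.TrainingArguments; output_dir is an assumption (it is not stated in the card), everything else mirrors the list above:

```python
# Hypothetical reconstruction of the training setup above; only the
# values listed in the card are grounded, the rest are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",            # assumption: not given in the card
    learning_rate=0.000207,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,   # 4 per device x 2 steps = total train batch size of 8
    optim="adamw_bnb_8bit",          # OptimizerNames.ADAMW_BNB: 8-bit AdamW from bitsandbytes
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=500,
)
```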

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0006 | 1    | 10.8262         |
| 10.6287       | 0.0302 | 50   | 10.6009         |
| 10.4142       | 0.0603 | 100  | 10.4306         |
| 10.3647       | 0.0905 | 150  | 10.3617         |
| 10.2791       | 0.1206 | 200  | 10.3215         |
| 10.2734       | 0.1508 | 250  | 10.2990         |
| 10.2498       | 0.1809 | 300  | 10.2861         |
| 10.2523       | 0.2111 | 350  | 10.2775         |
| 10.2311       | 0.2413 | 400  | 10.2728         |
| 10.2307       | 0.2714 | 450  | 10.2708         |
| 10.2521       | 0.3016 | 500  | 10.2705         |
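
For context, these losses can be read as perplexities, assuming they are mean per-token cross-entropy (the transformers default for causal LMs). The initial validation loss of 10.8262 is close to ln(50k) ≈ 10.8, i.e. near-uniform predictions over a roughly 50k-token vocabulary, as expected for a randomly initialized tiny test model:

```python
# Back-of-the-envelope check, assuming the reported losses are mean
# per-token cross-entropy.
import math

print(math.exp(10.8262))  # ≈ 5.0e4: initial loss is near-uniform over a ~50k-token vocab
print(math.exp(10.2705))  # ≈ 2.9e4: perplexity implied by the final validation loss
```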

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.5.0+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
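
Since PEFT is listed above, this checkpoint is presumably a parameter-efficient adapter on top of the base model rather than a full set of weights. A minimal loading sketch, assuming standard transformers/peft usage (the adapter repo id is taken from this card):

```python
# Hedged sketch: load the base model, then apply this PEFT adapter on top.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("katuni4ka/tiny-random-olmo-hf")
model = PeftModel.from_pretrained(base, "lesso07/ae8dd91a-3a8d-4629-a678-b204a563ff34")
tokenizer = AutoTokenizer.from_pretrained("katuni4ka/tiny-random-olmo-hf")

inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that the base model appears to be a tiny, randomly initialized test checkpoint (per its name), so generations are not expected to be meaningful.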