lesso06 committed
Commit 1739587 · verified · 1 parent: da6a924

End of training

Files changed (1):
  1. README.md +0 -76
README.md DELETED
@@ -1,76 +0,0 @@
---
library_name: peft
license: apache-2.0
base_model: unsloth/Qwen2-1.5B-Instruct
tags:
- axolotl
- generated_from_trainer
model-index:
- name: 2633c5c4-0396-4f8a-a5ec-df27508e79f2
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<br>

# 2633c5c4-0396-4f8a-a5ec-df27508e79f2

This model is a fine-tuned version of [unsloth/Qwen2-1.5B-Instruct](https://huggingface.co/unsloth/Qwen2-1.5B-Instruct) on an unspecified dataset.
It achieves the following results on the evaluation set:
- - Loss: 1.0544
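
Since the card describes a PEFT (LoRA) adapter rather than standalone model weights, the adapter would be loaded on top of the base model. A minimal sketch, assuming the adapter is published under a repo id matching the model name above; the `adapter_id` below is hypothetical:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "unsloth/Qwen2-1.5B-Instruct"
adapter_id = "lesso06/2633c5c4-0396-4f8a-a5ec-df27508e79f2"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Apply the trained LoRA weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

messages = [{"role": "user", "content": "Hello!"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```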

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.000206
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: ADAMW_BNB (8-bit AdamW via bitsandbytes) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 50
- - training_steps: 500
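
For readers reproducing the run outside Axolotl, the settings above map onto the standard Transformers `TrainingArguments` roughly as follows. This is a sketch of the plain-Trainer equivalents, not the original Axolotl config, and `output_dir` is hypothetical:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",            # hypothetical
    learning_rate=0.000206,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,   # effective batch size: 4 * 2 = 8
    max_steps=500,                   # training_steps
    warmup_steps=50,
    lr_scheduler_type="cosine",
    optim="adamw_bnb_8bit",          # OptimizerNames.ADAMW_BNB
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```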

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log | 0.0008 | 1 | 1.1879 |
| 1.0481 | 0.0410 | 50 | 1.1445 |
| 1.0289 | 0.0821 | 100 | 1.1724 |
| 1.0706 | 0.1231 | 150 | 1.1269 |
| 0.9754 | 0.1641 | 200 | 1.1136 |
| 1.1411 | 0.2052 | 250 | 1.0916 |
| 1.0608 | 0.2462 | 300 | 1.0737 |
| 1.0404 | 0.2872 | 350 | 1.0615 |
| 0.9828 | 0.3283 | 400 | 1.0560 |
| 0.9821 | 0.3693 | 450 | 1.0554 |
| 1.0891 | 0.4103 | 500 | 1.0544 |
68
-
69
-
70
- ### Framework versions
71
-
72
- - PEFT 0.13.2
73
- - Transformers 4.46.0
74
- - Pytorch 2.5.0+cu124
75
- - Datasets 3.0.1
76
- - Tokenizers 0.20.1