Update README.md
README.md
CHANGED
@@ -27,7 +27,11 @@ quantized_by: Suparious
 - Model creator: [cognitivecomputations](https://huggingface.co/cognitivecomputations)
 - Original model: [dolphin-2.9.4-gemma2-2b](https://huggingface.co/cognitivecomputations/dolphin-2.9.4-gemma2-2b)
 
+<img src="https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png" width="600" />
 
+This one is special because I used [GrokAdamW](https://github.com/cognitivecomputations/grokadamw) and [Liger Kernel](https://github.com/linkedin/Liger-Kernel).
+
+GrokAdamW is intended to enable fast grokking and thereby improve generalization. (I am not certain this occurred, because this checkpoint is at 4 epochs, and it probably takes more epochs to achieve grokking.)
 
 ## How to use
 