Transformers
GGUF
Inference Endpoints
conversational
emnakamura committed
Commit 2222dbd · verified · 1 parent: 8bbe913

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Mahou-1.2b-mistral-7B-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Mahou-1.2b-mistral-7B-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Mahou-1.2b-mistral-7B-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+Mahou-1.2b-mistral-7B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
Mahou-1.2b-mistral-7B-Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:875c043a6f6d57feee1db755a123c609d758261bb3f78b5585ecce509d4b55c6
size 3518986240
Mahou-1.2b-mistral-7B-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:25b94e84852db5f9be9fbb1393d4e2451e8ea5744527f527da8ac0f12f86d4b5
size 4368439296
Mahou-1.2b-mistral-7B-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f7f3a16a22eb2a650b9f9daee80e38fcb92a2ac53a58c7ceed4eb56f1929992d
size 5131409408
Mahou-1.2b-mistral-7B-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b6a98f2f464645016848d3f923609fe428b9862899040fb1a2e1e23cbbd29f22
size 7695857664
README.md ADDED
@@ -0,0 +1,107 @@
---
library_name: transformers
license: apache-2.0
base_model:
- nbeerbower/Mahou-1.2a-mistral-7B
datasets:
- flammenai/MahouMix-v1
- flammenai/FlameMix-DPO-v1
---

![image/png](https://huggingface.co/flammenai/Mahou-1.0-mistral-7B/resolve/main/mahou1.png)

# Mahou-1.2b-mistral-7B

Mahou is designed to provide short messages in a conversational context. It is capable of casual conversation and character roleplay.

### Chat Format

This model has been trained to use the ChatML format.

```
<|im_start|>system
{{system}}<|im_end|>
<|im_start|>{{char}}
{{message}}<|im_end|>
<|im_start|>{{user}}
{{message}}<|im_end|>
```
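The template above can be assembled programmatically; a minimal sketch (the `build_chatml` helper and the example names are illustrative, not part of this repository):

```python
# Illustrative helper: build a ChatML prompt matching the template above.
def build_chatml(system: str, turns: list[tuple[str, str]]) -> str:
    """turns is a list of (speaker, message) pairs, e.g. character or user names."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    for speaker, message in turns:
        parts.append(f"<|im_start|>{speaker}\n{message}<|im_end|>")
    return "\n".join(parts)

prompt = build_chatml(
    "You are Mahou, a friendly companion.",
    [("user", "hi!"), ("Mahou", "hey! what's up?")],
)
print(prompt)
```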

### Roleplay Format

- Speech without quotes.
- Actions in `*asterisks*`.

```
*leans against wall coolly* so like, i just casted a super strong spell at magician academy today, not gonna lie, felt badass.
```
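A reply in this format can be split into actions and speech with a simple regex; a rough sketch (the `split_roleplay` helper is illustrative, not part of this repository):

```python
import re

# Illustrative helper: separate *actions* from plain speech in a roleplay reply.
def split_roleplay(reply: str):
    actions = re.findall(r"\*(.+?)\*", reply)      # text inside *asterisks*
    speech = re.sub(r"\*.+?\*", "", reply).strip()  # everything outside them
    return actions, speech

actions, speech = split_roleplay(
    "*leans against wall* so like, i just cast a super strong spell today."
)
```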

### SillyTavern Settings

1. Use ChatML for the Context Template.
2. Enable Instruct Mode.
3. Use the [Mahou preset](https://huggingface.co/datasets/flammenai/Mahou-ST-ChatML-Instruct/raw/main/Mahou.json).
4. *Recommended*: add the stopping strings `["\n", "<|", "</"]`.

### Method

DPO-finetuned using an A100 on Google Colab.

[Fine-tune a Mistral-7b model with Direct Preference Optimization](https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac) - [Maxime Labonne](https://huggingface.co/mlabonne)

### Configuration

LoRA, model, and training settings:
```python
# Imports (assumed from the linked tutorial; not shown in the original snippet)
import torch
from peft import LoraConfig
from transformers import AutoModelForCausalLM, TrainingArguments
from trl import DPOTrainer

# model_name, new_model, dataset, and tokenizer are defined earlier in the tutorial.

# LoRA configuration
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
)

# Model to fine-tune
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    load_in_4bit=True
)
model.config.use_cache = False

# Reference model
ref_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    load_in_4bit=True
)

# Training arguments
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=2000,
    save_strategy="no",
    logging_steps=1,
    output_dir=new_model,
    optim="paged_adamw_32bit",
    warmup_steps=100,
    bf16=True,
    report_to="wandb",
)

# Create DPO trainer
dpo_trainer = DPOTrainer(
    model,
    ref_model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,
    force_use_ref_model=True
)

# Fine-tune model with DPO
dpo_trainer.train()
```
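For intuition, the objective that `DPOTrainer` minimizes with `beta=0.1` can be sketched for a single preference pair; the log-probabilities below are toy numbers, not model outputs:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair:
    -log(sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))))."""
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))  # -log sigmoid(logits)

# Toy numbers: the policy prefers the chosen completion more strongly than the
# reference does, so the loss drops below the chance value -log(0.5) ~= 0.693.
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0, beta=0.1)
```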