---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---

Use ChatML or the Mistral Nemo instruct format.

Conclusion: these types of merge methods tend to work better when at least one model has a much higher weight than the rest.

After further testing, this is the best Nemo model I have ever used.

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: mistral-nemo-gutenberg-12B-v4
    parameters:
      weight: 0.2
  - model: Violet_Twilight-v0.2
    parameters:
      weight: 0.3
  - model: Lyra-Gutenberg-mistral-nemo-12B
    parameters:
      weight: 0.5
  - model: Grey-12b
    parameters:
      weight: 0.2
base_model: Mistral-Nemo-Base-2407
parameters:
  density: 0.5
  epsilon: 0.1
  lambda: 1.1
  normalize: false
  int8_mask: true
  rescale: true
merge_method: della_linear
tokenizer:
  source: union
dtype: bfloat16
```
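To reproduce the merge, the configuration above can be fed to mergekit. Below is a minimal sketch assuming mergekit's Python API (`MergeConfiguration`, `MergeOptions`, `run_merge`) and that the model names in the YAML resolve to local paths or Hugging Face repo IDs; the file paths are placeholders, and the current mergekit README should be checked for the exact signatures:

```python
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Placeholders: CONFIG_YML holds the YAML shown above,
# OUTPUT_PATH is where the merged weights will be written.
CONFIG_YML = "./merge-config.yml"
OUTPUT_PATH = "./merged-model"

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU if one is available
        copy_tokenizer=True,  # write the union tokenizer alongside the weights
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```

Note that with `normalize: false`, mergekit applies the weights as written (here they sum to 1.2) rather than rescaling them to sum to 1, which fits the conclusion above about letting one model carry a much higher weight than the rest.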
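For inference, a minimal sketch with `transformers`, using the ChatML format recommended above. The model path is a placeholder, and because the tokenizer is a union of the source models' tokenizers, the ChatML prompt is assembled by hand rather than relying on a bundled chat template:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "./merged-model"  # placeholder: local path or HF repo ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto"
)

# ChatML wraps each turn in <|im_start|>role ... <|im_end|> tags.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nWrite a short scene set in a lighthouse.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```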