|
---
base_model:
- FallenMerick/Chunky-Lemon-Cookie-11B
- Sao10K/Fimbulvetr-11B-v2.1-16K
- senseable/WestLake-7B-v2
base_model_relation: merge
library_name: transformers
tags:
- mergekit
- merge
- roleplay
- text-generation-inference
license: cc-by-4.0
---
|
|
|
![cute](https://huggingface.co./matchaaaaa/Honey-Yuzu-13B/resolve/main/honey-yuzu-cute.png) |
|
|
|
**Thank you [@Brooketh](https://huggingface.co./brooketh) for the [GGUFs](https://huggingface.co./backyardai/Honey-Yuzu-13B-GGUF)!!** |
|
|
|
# Honey-Yuzu-13B |
|
|
|
Meet Honey-Yuzu, a sweet lemony tea brewed by yours truly! A bit of [Chunky-Lemon-Cookie-11B](https://huggingface.co./FallenMerick/Chunky-Lemon-Cookie-11B) here for its great flavor, with a dash of [WestLake-7B-v2](https://huggingface.co./senseable/WestLake-7B-v2) there to add some depth. I'm really proud of how it turned out, and I hope you like it too! |
|
|
|
It's not as verbose as Chaifighter, but it still writes very well. It boasts fantastic coherence and character understanding (in my opinion) for a 13B, and it's been my daily driver for a little bit. It's a solid RP model that should generally play nice with just about anything. |
|
|
|
**Native Context Length: 8K (8192 tokens)** *(can be extended with RoPE scaling, possibly past 16K; see the sketch below)*
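
For what it's worth, here's a minimal sketch of linear RoPE scaling through `transformers`. The `rope_scaling` override and its exact schema vary by transformers version (and support for it on Mistral-architecture models landed later than on Llama), so treat this as an assumption to verify rather than a guaranteed API:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "matchaaaaa/Honey-Yuzu-13B"

tokenizer = AutoTokenizer.from_pretrained(repo)

# Linear RoPE scaling with factor 2.0 stretches the native 8K window toward 16K.
# NOTE: whether from_pretrained accepts this override for Mistral-architecture
# models depends on your transformers version -- verify before relying on it.
model = AutoModelForCausalLM.from_pretrained(
    repo,
    rope_scaling={"type": "linear", "factor": 2.0},
    device_map="auto",
    torch_dtype="auto",
)
```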
|
|
|
## Prompt Template: Alpaca |
|
|
|
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```
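
If you're calling the model from a script instead of a frontend, here's a quick sketch of wrapping a message in this template (the helper name is just for illustration, and frontends may differ on trailing whitespace):

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n### Response:\n"
)

def build_prompt(user_message: str) -> str:
    """Wrap a message in the Alpaca format shown above (illustrative helper)."""
    return ALPACA_TEMPLATE.format(prompt=user_message)

print(build_prompt("Write a short scene set in a cozy tea shop."))
```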
|
|
|
## Recommended Settings: Universal-Light |
|
|
|
Here are some sampler ranges that tend to work for me. They aren't strict values, and there's a bit of leeway in them, so feel free to experiment! A `transformers` sketch using these values follows the list.
|
|
|
* Temperature: **1.0** to **1.25**
* Min-P: **0.05** to **0.1**
* Repetition Penalty: **1.05** to **1.1** (high values aren't needed and usually degrade output)
* Rep. Penalty Range: **256** or **512**
* *(all other samplers disabled)*
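
Translated into `transformers` generation arguments, the middle of those ranges might look like the sketch below. Note that `min_p` needs a fairly recent transformers release, and plain `repetition_penalty` has no range parameter, so the penalty-range setting only applies in frontends (SillyTavern, KoboldCpp, text-generation-webui) that expose it:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "matchaaaaa/Honey-Yuzu-13B"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto", torch_dtype="auto")

# Alpaca-formatted prompt; see the template sketch above.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWrite a short scene set in a cozy tea shop.\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.1,          # middle of the 1.0-1.25 range
    min_p=0.075,              # middle of the 0.05-0.1 range
    repetition_penalty=1.05,  # keep this low; high values degrade output
    max_new_tokens=512,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```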
|
|
|
## The Deets |
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
### Merge Method |
|
|
|
This model was merged using the passthrough merge method for its final stage, with linear and DARE linear merges for the intermediate models (see the configuration below). Passthrough stacks layer slices from the source models directly rather than averaging their weights.
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge:
* [FallenMerick/Chunky-Lemon-Cookie-11B](https://huggingface.co./FallenMerick/Chunky-Lemon-Cookie-11B)
* [SanjiWatsuki/Kunoichi-7B](https://huggingface.co./SanjiWatsuki/Kunoichi-7B)
* [SanjiWatsuki/Silicon-Maid-7B](https://huggingface.co./SanjiWatsuki/Silicon-Maid-7B)
* [KatyTheCutie/LemonadeRP-4.5.3](https://huggingface.co./KatyTheCutie/LemonadeRP-4.5.3)
* [Sao10K/Fimbulvetr-11B-v2.1-16K](https://huggingface.co./Sao10K/Fimbulvetr-11B-v2.1-16K)
* [mistralai/Mistral-7B-v0.1](https://huggingface.co./mistralai/Mistral-7B-v0.1)
* [senseable/WestLake-7B-v2](https://huggingface.co./senseable/WestLake-7B-v2)
|
|
|
### The Special Sauce |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml
slices: # this is a quick float32 restack of BLC using the OG recipe
  - sources:
      - model: SanjiWatsuki/Kunoichi-7B
        layer_range: [0, 24]
  - sources:
      - model: SanjiWatsuki/Silicon-Maid-7B
        layer_range: [8, 24]
  - sources:
      - model: KatyTheCutie/LemonadeRP-4.5.3
        layer_range: [24, 32]
merge_method: passthrough
dtype: float32
name: Big-Lemon-Cookie-11B
---
models: # this is a remake of CLC with the newer Fimbul v2.1 version
  - model: Big-Lemon-Cookie-11B
    parameters:
      weight: 0.85
  - model: Sao10K/Fimbulvetr-11B-v2.1-16K
    parameters:
      weight: 0.15
merge_method: linear
dtype: float32
name: Chunky-Lemon-Cookie-11B
---
slices: # 8 layers of WL for the splice
  - sources:
      - model: senseable/WestLake-7B-v2
        layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: WL-splice
---
slices: # 8 layers of CLC for the splice
  - sources:
      - model: Chunky-Lemon-Cookie-11B
        layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: CLC-splice
---
models: # this is the splice, a gradient merge meant to gradually and smoothly interpolate between stacks of different models
  - model: WL-splice
    parameters:
      weight: [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0] # 0.125 / 0.875 values removed here - "math gets screwy"
  - model: CLC-splice
    parameters:
      weight: [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1] # 0.125 / 0.875 values removed here - "math gets screwy"
merge_method: dare_linear # according to some paper, "DARE is all you need"
base_model: WL-splice
dtype: float32
name: splice
---
slices: # putting it all together
  - sources:
      - model: senseable/WestLake-7B-v2
        layer_range: [0, 16]
  - sources:
      - model: splice
        layer_range: [0, 8]
  - sources:
      - model: Chunky-Lemon-Cookie-11B
        layer_range: [16, 48]
merge_method: passthrough
dtype: float32
name: Honey-Yuzu-13B
```
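
For reproducibility's sake: as far as I know, a multi-document config with named intermediate models like the one above is the format consumed by mergekit's `mergekit-mega` entry point, whereas plain `mergekit-yaml` takes a single document, so the stages would otherwise need to be run one at a time in order. Double-check against the mergekit docs for your version before relying on that.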
|
|
|
### The Thought Process |
|
|
|
This was meant to be a simple RP-focused merge. I chose two well-performing RP models - [Chunky-Lemon-Cookie-11B](https://huggingface.co./FallenMerick/Chunky-Lemon-Cookie-11B) by [FallenMerick](https://huggingface.co./FallenMerick) and [WestLake-7B-v2](https://huggingface.co./senseable/WestLake-7B-v2) by [senseable](https://huggingface.co./senseable) - and merged them using a more conventional configuration (okay, okay, a 56-layer, 12.5B-parameter Mistral stack isn't that conventional, but still) rather than trying something wild or crazy and pushing the limits. I was very pleased with the results, but I wanted to see what would happen if I remade CLC with [Fimbulvetr-11B-v2.1-16K](https://huggingface.co./Sao10K/Fimbulvetr-11B-v2.1-16K) by [Sao10K](https://huggingface.co./Sao10K). The result was equally nice (if not slightly better) output with a greatly improved native context length.
|
|
|
|
|
|
|
Have feedback? Comments? Questions? Don't hesitate to let me know! As always, have a wonderful day, and please be nice to yourself! :) |