---
base_model:
- Sao10K/Fimbulvetr-11B-v2
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
pipeline_tag: text-generation
---

![cute](https://huggingface.co./matchaaaaa/Chaifighter-20B-v2/resolve/main/chaifighter-cute.png)

**Thank you @brooketh for the [iMat + static GGUFs](https://huggingface.co./FaradayDotDev/Chaifighter-20B-v2-GGUF) on the Faraday model hub!**

**Thank you @mradermacher for also making [GGUFs](https://huggingface.co./mradermacher/Chaifighter-20B-v2-GGUF) and [iMat GGUFs](https://huggingface.co./mradermacher/Chaifighter-20B-v2-i1-GGUF)!**

# Chaifighter 20B v2 (aaaaand it's BASICALLY a 20B this time!)

Meet Chaifighter 20B v2, my flagship Mistral 20B frankenmerge model! Boasting creativity, coherence, and cognitive thinking, this model is a great pick for those awkwardly stuck between 13B's and 34B's. 

I also wanted to provide an alternative to Jeb Carter's [Psyonic Cetacean 20B](https://huggingface.co./jebcarter/psyonic-cetacean-20B), which is a fantastic model that you should check out if you haven't already! The issue with that model is that it's based on Llama 2, which is outdated now. The older architecture lacked many performance enhancements that were introduced by the Mistral architecture, and on my 16 GB RTX 4060 Ti, those performance enhancements were the difference between decently speedy and intolerably sluggish.

Chaifighter 20B is geared towards long-form roleplay chats rather than short-form IRC/Discord RP chats. It loves verbosity and detail, and its quality will depend on how much "ammunition" you can give it. While it sorta-kinda can do short-form with some swiping, it isn't really ideal. But for those essay-writing powerhouses that love typing up a storm in the character card, this one's for you.  

Chaifighter 20B natively supports a maximum context window of 4096 tokens.

## Prompt Template: Alpaca 

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```
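If you're scripting against the model directly rather than using a frontend, the template above can be applied with a simple formatter. This is just a minimal sketch; the `build_prompt` helper and the example instruction are illustrative, not anything shipped with the model:

```python
# Minimal sketch of wrapping an instruction in the Alpaca template above.
# `build_prompt` is a hypothetical helper, not part of the model's files.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "\n"
    "### Instruction:\n"
    "{prompt}\n"
    "\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the Alpaca format the model expects."""
    return ALPACA_TEMPLATE.format(prompt=instruction)

print(build_prompt("Write a short greeting."))
```

Whatever the model generates after `### Response:` is the reply.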

## Recommended Settings: Universal-Light

Here are some settings ranges that tend to work for me. They aren't strict values, and there's a bit of leeway in them. Feel free to experiment a bit!

* Temperature:        **1.0** *to* **1.25** (adjust to taste, but keep it low. Chaifighter is creative enough on its own)
* Min-P:              **0.1** (increasing might help if it goes cuckoo, but I suggest keeping it there)
* Repetition Penalty: **1.05** *to* **1.1** (high values aren't needed and usually degrade output)
* Rep. Penalty Range: **256** *or* **512**
* *(all other samplers disabled)*
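For intuition on the Min-P value: Min-P keeps only tokens whose probability is at least `min_p` times the top token's probability, so raising it trims the tail harder. Here's a rough pure-Python sketch of that filtering step (the idea only, not the actual implementation in llama.cpp or any other backend):

```python
# Rough sketch of Min-P sampling: keep tokens with p >= min_p * max(p),
# zero out the rest, and renormalize. Illustrative, not backend code.
def min_p_filter(probs: list[float], min_p: float = 0.1) -> list[float]:
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# With min_p = 0.1, tokens below 10% of the top probability are dropped.
print(min_p_filter([0.5, 0.3, 0.15, 0.04, 0.01], min_p=0.1))
```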

## Merge Details

### Mergekit

Chaifighter 20B is a frankenmerge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

### Merge Method

This model was merged using the passthrough merge method.

### Models Merged

The following models were included in the merge:
* [Gryphe/MythoMist-7b](https://huggingface.co./Gryphe/MythoMist-7b)
* [KatyTheCutie/LemonadeRP-4.5.3](https://huggingface.co./KatyTheCutie/LemonadeRP-4.5.3)
* [SanjiWatsuki/Kunoichi-7B](https://huggingface.co./SanjiWatsuki/Kunoichi-7B)
* [Sao10K/Fimbulvetr-11B-v2](https://huggingface.co./Sao10K/Fimbulvetr-11B-v2)

### The Sauceeeeeee

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
    - model: Sao10K/Fimbulvetr-11B-v2
      layer_range: [0, 40]
  - sources:
    - model: SanjiWatsuki/Kunoichi-7B
      layer_range: [8, 16]
  - sources:
    - model: Mytho-Lemon-11B # my own merge (see below).
      layer_range: [8, 48]
merge_method: passthrough
dtype: bfloat16
```
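For a sense of scale: passthrough merging simply stacks the listed slices end to end, so the merged model ends up with 40 + 8 + 40 = 88 layers, which is where the roughly-20B parameter count comes from. A quick sanity check of the slice arithmetic (layer ranges copied straight from the YAML above):

```python
# Sanity check on the passthrough slice arithmetic from the config above.
# Passthrough merging stacks the layer slices end to end.
slices = {
    "Sao10K/Fimbulvetr-11B-v2": (0, 40),
    "SanjiWatsuki/Kunoichi-7B": (8, 16),
    "Mytho-Lemon-11B": (8, 48),
}

total_layers = sum(end - start for start, end in slices.values())
print(total_layers)  # 40 + 8 + 40 = 88 layers in the merged stack
```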

And here's Mytho-Lemon-11B. Yep, named it backwards.

```yaml
slices:
  - sources:
    - model: KatyTheCutie/LemonadeRP-4.5.3
      layer_range: [0, 24]
  - sources:
    - model: Gryphe/MythoMist-7B # manually added tokenizer files
      layer_range: [8, 32]
merge_method: passthrough
dtype: bfloat16
```

It's a lot better than v1 :skull:

So, the idea was to start with Fimbulvetr-11B-v2, a super solid RP model that punches wayyy above its weight, especially in coherence, reasoning, and even spatial awareness. Keeping all of its layers intact is apparently somewhat unusual, but I wanted it closest to the input layers; I thought that would improve logic and open the door for more creativity later in the stack. I added Kunoichi next for its context and instruction-following skills, which worked very well in v1. Lastly, I used a frankenmerge of MythoMist and LemonadeRP for the last layers. These are pretty creative models with solid writing: MythoMist in theory gives the model flavor and verbosity, and LemonadeRP was recommended by a friend and complemented the rest of the mix quite nicely!

## Thanks and Other Stuff

I want to thank everyone who helped me make this model. [@brooketh](https://huggingface.co./brooketh), [@FallenMerick](https://huggingface.co./FallenMerick), [@jebcarter](https://huggingface.co./jebcarter), [@Qonsol](https://huggingface.co./Qonsol), [@PacmanIncarnate](https://huggingface.co./PacmanIncarnate), and many others: thank you so much. Without the help, feedback, and encouragement these people gave, Chaifighter v2 would not have happened. The flaws in v1 were numerous and tricky to solve, especially for someone still super new to this (me). I don't know what I'd do without these kindhearted and generous people! 

Yapping time. As far as the name is concerned, I'm going for a tea/coffee/hot drink motif for my models, and one of the names I was debating on using for this model was Chai-Latte. As I worked on this merge, I got the idea of naming it "Chaifighter" as a play on "Psyfighter2", one of the models making up Psyonic Cetacean, which is itself a play on "Tiefighter", the model it was derived from. Both are fantastic models, especially given their age, and both are worth checking out too if you haven't done so. "Chai" itself is a play on a certain AI chatting website (CAI) that got me into this lovely mess in the first place. So I guess it's fitting to name the first model of the series after it.

And lastly, of course, thank you for checking out my model! Remember that you're super amazing, and have a fantastic day! :)