
Chaifighter 20B v2.1 (now with native 8K context!)

Requested by @dazl1212!

Meet Chaifighter 20B v2.1, my flagship Mistral 20B frankenmerge model! Boasting creativity, coherence, and cognitive thinking, this model is a great pick for those awkwardly stuck between 13B's and 34B's.

I also wanted to provide an alternative to Jeb Carter's Psyonic Cetacean 20B, which is a fantastic model that you should check out if you haven't already! The issue with that model is that it's based on Llama 2, which is outdated now. The older architecture lacked many performance enhancements that were introduced by the Mistral architecture, and on my 16 GB RTX 4060 Ti, those performance enhancements were the difference between decently speedy and intolerably sluggish.

Chaifighter 20B is geared towards long-form roleplay chats rather than short-form IRC/Discord RP chats. It loves verbosity and detail, and its quality will depend on how much "ammunition" you can give it. While it sorta-kinda can do short-form with some swiping, it isn't really ideal. But for those essay-writing powerhouses that love typing up a storm in the character card, this one's for you.

Chaifighter 20B v2.1 now natively supports a context window of 8192 tokens. AWE YEAH!!!

Note: since v2 was broken, v2.1 was released as a quick fix so it doesn't suffer from the same issues.

Prompt Template: Alpaca

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:

Recommended Settings: Universal-Light

Here are some settings ranges that tend to work for me. They aren't strict values, and there's a bit of leeway in them. Feel free to experiment a bit!

  • Temperature: 1.0 to 1.25 (adjust to taste, but keep it low. Chaifighter is creative enough on its own)
  • Min-P: 0.1 (increasing might help if it goes cuckoo, but I suggest keeping it there)
  • Repetition Penalty: 1.05 to 1.1 (high values aren't needed and usually degrade output)
  • Rep. Penalty Range: 256 or 512
  • (all other samplers disabled)
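If you're curious what Min-P actually does: it drops every token whose probability falls below `min_p` times the probability of the most likely token, so a confident distribution gets trimmed hard while a flat one stays open. A rough illustrative sketch (the function name and shape are my own, not from any sampler library):

```python
import math

def min_p_filter(logits, min_p=0.1):
    """Keep only token indices whose probability is at least
    min_p * p(top token)."""
    # softmax over the raw logits (shifted by the max for stability)
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    threshold = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= threshold]

# A peaky distribution keeps only the dominant token:
print(min_p_filter([5.0, 1.0, 0.5, -2.0], min_p=0.1))  # [0]
# A flat distribution keeps everything:
print(min_p_filter([1.0, 1.0, 1.0, 1.0], min_p=0.1))  # [0, 1, 2, 3]
```

That adaptive behavior is why it pairs well with a slightly higher temperature: Min-P keeps the gibberish out while temperature keeps things creative.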

Merge Details

Chaifighter 20B is a frankenmerge of pre-trained language models created using mergekit.

Merge Method

This model was merged using the passthrough merge method.

Models Merged

The following models were included in the merge:

  • KatyTheCutie/LemonadeRP-4.5.3
  • Gryphe/MythoMist-7b
  • Sao10K/Fimbulvetr-11B-v2.1-16K
  • SanjiWatsuki/Kunoichi-7B

The Sauce

The following YAML configuration was used to produce this model:

slices:
  - sources:
    - model: KatyTheCutie/LemonadeRP-4.5.3
      layer_range: [0, 24]
  - sources:
    - model: Gryphe/MythoMist-7b # manually added tokenizer files
      layer_range: [8, 32]
merge_method: passthrough
dtype: float32
name: Mytho-Lemon-11B
---
slices:
  - sources:
    - model: Sao10K/Fimbulvetr-11B-v2.1-16K
      layer_range: [0, 40]
  - sources:
    - model: SanjiWatsuki/Kunoichi-7B
      layer_range: [8, 16]
  - sources:
    - model: Mytho-Lemon-11B
      layer_range: [8, 48]
merge_method: passthrough
dtype: float32
name: Chaifighter-20B-v2.1

It's just a quick tweak to the Chaifighter-20B-v2 recipe.
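If you want to sanity-check a passthrough recipe like this, you can count the stacked layers: each `layer_range: [a, b]` contributes b - a decoder layers, so the intermediate Mytho-Lemon-11B stack has 24 + 24 = 48 layers and the final model has 40 + 8 + 40 = 88. A quick sketch (plain Python, `stacked_layers` is my own helper, not mergekit API):

```python
def stacked_layers(slices):
    """Total decoder layers in a passthrough merge:
    the sum of (end - start) over every layer_range slice."""
    return sum(end - start for start, end in slices)

# Mytho-Lemon-11B: LemonadeRP [0, 24] + MythoMist [8, 32]
print(stacked_layers([(0, 24), (8, 32)]))            # 48
# Chaifighter-20B-v2.1: Fimbulvetr [0, 40] + Kunoichi [8, 16] + Mytho-Lemon [8, 48]
print(stacked_layers([(0, 40), (8, 16), (8, 48)]))   # 88
```

Note that the ranges overlap on purpose: repeating mid-stack layers is how frankenmerges upscale a 7B/11B base into a ~20B model without any training.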

Thanks and Other Stuff

I want to thank everyone who helped me make this model. @brooketh, @FallenMerick, @jebcarter, @Qonsol, @PacmanIncarnate, and many others: thank you so much. Without the help, feedback, and encouragement these people gave, Chaifighter v2 would not have happened. The flaws in v1 were numerous and tricky to solve, especially for someone still super new to this (me). I don't know what I'd do without these kindhearted and generous people!

Thanks again to @dazl1212 for the request!

Yapping time. As far as the name is concerned, I'm going for a tea/coffee/hot-drink motif for my models, and one of the names I debated using for this model was Chai-Latte. As I worked on this merge, I got the idea of naming it "Chaifighter" as a play on "Psyfighter2", one of the models making up Psyonic Cetacean, which was itself derived from a model called "Tiefighter". Both are fantastic models, especially given their age, and both are worth checking out if you haven't already. "Chai" itself is a nod to a certain AI chatting website (CAI) that got me into this lovely mess in the first place, so I guess it's fitting to name the first model of the series after it.

And lastly, of course, thank you for checking out my model! Remember that you're super amazing, and have a fantastic day! :)
