---
license: cc-by-nc-4.0
---

![image/png](https://huggingface.co/SanjiWatsuki/Kunoichi-7B/resolve/main/assets/kunoichi.png)

<!-- description start -->
## Description

This repository hosts **Kunoichi-DPO-7B**, a DPO finetune of Kunoichi-7B using Intel's Orca pairs with the Alpaca template. This model is targeted at general use. In my testing, it has stronger reasoning and instruction-following capabilities than Kunoichi-7B, but it may be worse for roleplaying due to the alignment from the Orca dataset.

This model is undergoing benchmark testing; I will update the model page with the finalized results.
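The DPO setup described above trains on preference pairs (a chosen and a rejected response per prompt) wrapped in the Alpaca template. A minimal sketch of that formatting step, assuming the `question`/`chosen`/`rejected` field layout of Intel's Orca pairs dataset (the actual training script is not published here):

```python
# Sketch: wrapping one preference pair in the Alpaca template for DPO.
# The record fields ("question", "chosen", "rejected") are assumed from the
# layout of Intel's Orca pairs; this is illustrative, not the training code.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{prompt}\n\n### Response:\n"
)

def to_dpo_example(record: dict) -> dict:
    """Format an Orca-style preference pair with the Alpaca prompt."""
    return {
        "prompt": ALPACA_TEMPLATE.format(prompt=record["question"]),
        "chosen": record["chosen"],
        "rejected": record["rejected"],
    }

example = to_dpo_example({
    "question": "Name the capital of France.",
    "chosen": "The capital of France is Paris.",
    "rejected": "France does not have a capital.",
})
```

The DPO objective then pushes the model's likelihood of `chosen` up relative to `rejected` under this shared prompt.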

| Model | MT Bench | EQ Bench | MMLU | Logic Test |
|------------------------|----------|-----------|----------|------------|
| GPT-4-Turbo | 9.32 | - | - | - |
| GPT-4 | 8.99 | 62.52 | 86.4 | 0.86 |
| **Kunoichi-DPO-7B** | **8.29** | **41.60** | - | **0.59** |
| **Kunoichi-7B** | **8.14** | **44.32** | **64.9** | **0.58** |
| Starling-7B | 8.09 | - | 63.9 | 0.51 |
| Claude-2 | 8.06 | 52.14 | 78.5 | - |
| Silicon-Maid-7B | 7.96 | 40.44 | 64.7 | 0.54 |
| Loyal-Macaroni-Maid-7B | 7.95 | 38.66 | 64.9 | 0.57 |
| GPT-3.5-Turbo | 7.94 | 50.28 | 70 | 0.57 |
| Claude-1 | 7.9 | - | 77 | - |
| Openchat-3.5 | 7.81 | 37.08 | 64.3 | 0.39 |
| Dolphin-2.6-DPO | 7.74 | 42.88 | 61.9 | 0.53 |
| Zephyr-7B-beta | 7.34 | 38.71 | 61.4 | 0.30 |
| Llama-2-70b-chat-hf | 6.86 | 51.56 | 63 | - |
| Neural-chat-7b-v3-1 | 6.84 | 43.61 | 62.4 | 0.30 |

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---:|---:|---:|---:|---:|
| **Kunoichi-DPO-7B** | - | - | - | - | - |
| [Kunoichi-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-7B) | 57.54 | 44.99 | 74.86 | 63.72 | 46.58 |
| [OpenPipe/mistral-ft-optimized-1218](https://huggingface.co/OpenPipe/mistral-ft-optimized-1218) | 56.85 | 44.74 | 75.6 | 59.89 | 47.17 |
| [Silicon-Maid-7B](https://huggingface.co/SanjiWatsuki/Silicon-Maid-7B) | 56.45 | 44.74 | 74.26 | 61.5 | 45.32 |
| [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B) | 53.51 | 43.67 | 73.24 | 55.37 | 41.76 |
| [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) | 52.42 | 42.75 | 72.99 | 52.99 | 40.94 |
| [openchat/openchat_3.5](https://huggingface.co/openchat/openchat_3.5) | 51.34 | 42.67 | 72.92 | 47.27 | 42.51 |
| [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha) | 51.16 | 42.06 | 72.72 | 47.33 | 42.53 |
| [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) | 50.99 | 37.33 | 71.83 | 55.1 | 39.7 |

The model is intended for use with up to an 8k context window. With an NTK RoPE alpha of 2.6, it can be used experimentally with up to a 16k context window.

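As a rough illustration of what the NTK alpha does, the commonly used NTK-aware scaling rule enlarges the RoPE frequency base by a factor of `alpha ** (dim / (dim - 2))` instead of interpolating positions. The head dimension (128) and base (10000) below are standard Mistral-7B values, assumed here rather than taken from this repo:

```python
# Sketch of NTK-aware RoPE scaling: the rotary frequency base is enlarged
# so that low-frequency dimensions stretch over a longer context without
# retraining. head_dim=128 and base=10000.0 are typical Mistral-7B values.

def ntk_scaled_base(base: float, alpha: float, head_dim: int) -> float:
    return base * alpha ** (head_dim / (head_dim - 2))

original = 10000.0
scaled = ntk_scaled_base(original, alpha=2.6, head_dim=128)
# alpha=2.6 roughly doubles the usable context (8k -> ~16k here) by
# scaling the base to about 2.6x its original value.
```

Most inference backends expose this as a single "NTK alpha" or "rope alpha" knob rather than requiring the base to be set by hand.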
<!-- description end -->
<!-- prompt-template start -->
## Prompt template: Custom format, or Alpaca

### Alpaca:
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{prompt}

### Response:
```

### SillyTavern format:
I found the best SillyTavern results from using the Noromaid template.

SillyTavern config files: [Context](https://files.catbox.moe/ifmhai.json), [Instruct](https://files.catbox.moe/ttw1l9.json).

Additionally, here is my highly recommended [Text Completion preset](https://huggingface.co/SanjiWatsuki/Loyal-Macaroni-Maid-7B/blob/main/Characters/MinP.json). You can tweak it by raising the temperature or lowering min-p to boost creativity, or by raising min-p to increase stability. You shouldn't need to touch anything else!
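For intuition on that min-p knob: min-p filtering keeps only tokens whose probability is at least `min_p` times the top token's probability, then renormalizes. A minimal sketch of the idea (not SillyTavern's actual implementation):

```python
# Minimal min-p filtering sketch: zero out tokens whose probability falls
# below min_p * max(probs), then renormalize the survivors. Raising min_p
# prunes more tokens (more stable); lowering it keeps more (more creative).

def min_p_filter(probs: list[float], min_p: float) -> list[float]:
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

probs = [0.5, 0.3, 0.15, 0.05]
filtered = min_p_filter(probs, min_p=0.4)  # threshold = 0.4 * 0.5 = 0.2
# the two tokens below 0.2 are dropped; the rest are renormalized
```

Because the threshold scales with the top token's probability, min-p adapts per step: confident distributions get aggressive pruning while flat ones keep more candidates, which is why it pairs well with a higher temperature.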