metadata

language:
  - en
license: cc-by-nc-4.0
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - mistral
  - trl
  - sft
  - Roleplay
  - roleplay
base_model: SanjiWatsuki/Kunoichi-DPO-v2-7B

This model is still experimental.

About this model

This is a remake of version 2 in safetensors format, saved with a safer and more stable method. Not much has changed (judging by the model hash). To be honest, though, in the previous version 2 I used an unsafe method to save the pretrained model, which could apply the LoRA layers to the model twice and give the model terrible performance. (Thanks to the Unsloth community for telling me about this :D )
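To show why applying a LoRA layer twice hurts, here is a toy sketch. It is not the actual training or saving code. It assumes the usual merge form W' = W + scale * (B @ A), and every name in it (`matmul`, `merge_lora`, the tiny matrices) is made up for illustration.

```python
# Toy illustration only: plain Python lists stand in for weight tensors.
# All helper names and values here are hypothetical, not from the real model.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def merge_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a) without modifying the input."""
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

base = [[1.0, 0.0], [0.0, 1.0]]   # 2x2 base weight
lora_b = [[1.0], [2.0]]           # 2x1 adapter matrix (rank 1)
lora_a = [[0.5, -0.5]]            # 1x2 adapter matrix (rank 1)

# Correct: the adapter delta is added once.
merged_once = merge_lora(base, lora_b, lora_a)

# Bug: merging into already-merged weights adds the delta a second time,
# so the result is base + 2 * delta instead of base + delta.
merged_twice = merge_lora(merged_once, lora_b, lora_a)

print(merged_once)
print(merged_twice)
```

The second result is shifted away from the base weights by twice the trained delta, which is not what the finetune was optimized for. That probably explains the bad outputs of the old save.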

Finetuned on a roughly translated dataset to improve accuracy on the TSF theme, which is not very popular. (lewd dataset)
Finetuned from model: SanjiWatsuki/Kunoichi-DPO-v2-7B. Many thanks to SanjiWatsuki :)

GGUF version? Here. Thank you, mradermacher!

V2 was trained for more epochs.

Dataset

Dataset (all novels):
30% skinsuit
30% possession
35% transformation (shapeshifting)
5% other

Thanks to Unsloth for a good finetuning tool. This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.