---
datasets:
  - PygmalionAI/PIPPA
  - ludis/geepeetee4
  - lemonilia/LimaRP
---

## Prompting

https://rentry.org/tsukasa13b - recommended prompts and gen settings

The current model version has been trained on prompts using three different roles, which are denoted by the following tokens: `<|system|>`, `<|user|>` and `<|model|>`.

The `<|system|>` prompt can be used to inject out-of-band information behind the scenes, while the `<|user|>` prompt should be used to indicate user input. The `<|model|>` token should then be used to indicate that the model should generate a response. These tokens can occur multiple times and be chained to form a conversation history.
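
As a rough illustration of chaining these tokens, here is a minimal sketch of a prompt builder. The `build_prompt` helper and its arguments are hypothetical, not part of any library, and the sketch assumes the tokens are concatenated directly with the text, with no extra whitespace or newlines; check the rentry page above for the recommended formatting.

```python
# Role tokens as named in this model card.
ROLE_TOKENS = {
    "system": "<|system|>",
    "user": "<|user|>",
    "model": "<|model|>",
}


def build_prompt(system, turns):
    """Assemble a prompt string from a system message and prior turns.

    `turns` is a list of (role, text) pairs where role is "user" or "model".
    Assumption: tokens and text are joined with no separator.
    """
    parts = [ROLE_TOKENS["system"] + system]
    for role, text in turns:
        if role not in ("user", "model"):
            raise ValueError(f"unexpected role: {role!r}")
        parts.append(ROLE_TOKENS[role] + text)
    # A trailing <|model|> token signals that the model should respond next.
    parts.append(ROLE_TOKENS["model"])
    return "".join(parts)


if __name__ == "__main__":
    history = [
        ("user", "Hello!"),
        ("model", "Hi there."),
        ("user", "How are you?"),
    ]
    print(build_prompt("Enter roleplay mode.", history))
```

The resulting string can be passed to whatever inference frontend you use; the generated text is then appended as the next `<|model|>` turn when building the following prompt.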

## Training

base model (llama-2-13b-hf)

tuned on koishi dataset (commit c83d922) for 1 epoch

then tuned on pippa dataset (commit 6412b0c) for 1 epoch

then tuned on geepeetee4 dataset (commit c83d922) for 1 epoch

then tuned on limarp (version 2023-09-14, without the ponyville, lolicit, and all the fallen subsets) for 2 epochs