big thanks to lore for the 8xH100 gpus

training

base model is meta llama 3 8b instruct, trained on pippa. i then trained that model on limarp. both runs were at 8k context for 2 epochs each

gen settings

i would start with every sampler off, temperature at 1 and min p at 0.05. i got good outputs from this, but you can also try the gen settings from shori, which are copy pasted below

  • Main choice (may have repetition issues)
    • Temperature: 1.0; Min-P: 0.05-0.10; Presence Penalty: 0.35-0.45
  • Alternative 1 (appears to solve repetition issues while being coherent, but responses might be less truthful)
    • Temperature: 2.40-2.50; Min-P: 0.40; Frequency penalty: 0.10-0.15; Temperature last.
  • Alternative 2
    • Mirostat type: 2, Mirostat Tau: 2.80-3.00; Mirostat Eta: 0.0175-0.0200; neutralize or disable all other samplers
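
to make the min p and "temperature last" settings above a bit more concrete, here is a minimal python sketch of how they are commonly understood to work: min p drops every token whose probability is below min_p times the top token's probability, and "temperature last" means temperature is applied after that filter instead of before it. the function names and the toy 4-token vocabulary are made up for illustration, and real backends may differ in the details, so treat this as a sketch rather than what any specific frontend does.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities at the given temperature."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp((x - peak) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_distribution(logits, temperature=1.0, min_p=0.05, temperature_last=False):
    """Hypothetical helper: final sampling distribution after min p + temperature."""
    # with temperature_last, the min p cutoff is decided on the untempered
    # probabilities; otherwise it is decided after temperature is applied
    filter_temp = 1.0 if temperature_last else temperature
    probs = softmax(logits, filter_temp)
    cutoff = min_p * max(probs)
    keep = [p >= cutoff for p in probs]

    # apply temperature, zero out the filtered tokens, renormalize the rest
    final = softmax(logits, temperature)
    masked = [p if k else 0.0 for p, k in zip(final, keep)]
    total = sum(masked)
    return [p / total for p in masked]

# toy vocabulary whose probabilities at temperature 1 are 0.6, 0.3, 0.09, 0.01
logits = [math.log(p) for p in (0.6, 0.3, 0.09, 0.01)]

simple = token_distribution(logits, temperature=1.0, min_p=0.05)
temp_first = token_distribution(logits, temperature=2.5, min_p=0.4)
temp_last = token_distribution(logits, temperature=2.5, min_p=0.4, temperature_last=True)

for name, dist in (("simple", simple), ("temp first", temp_first), ("temp last", temp_last)):
    print(name, [round(p, 3) for p in dist])
```

in this toy case the high temperature applied first flattens the distribution enough that three tokens pass the 0.40 cutoff, while with temperature last only two pass and the temperature then just rebalances those two. that is one way to read why the high-temperature preset is paired with "temperature last".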

prompting

use the llama 3 instruct format

<|eot_id|> as stopping sequence/string/token
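
if your frontend does not have a llama 3 preset, a rough python sketch of building the prompt by hand could look like this. it follows the standard llama 3 instruct layout (header tokens, two newlines, content, then <|eot_id|>) and cuts the generated text at the stop string. the helper names and the example messages are hypothetical, and the role names here are the stock system/user/assistant ones.

```python
STOP = "<|eot_id|>"

def build_llama3_prompt(messages, add_generation_prompt=True):
    """Hypothetical helper: format a list of {role, content} dicts as a llama 3 instruct prompt."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        header = f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
        parts.append(header + message["content"].strip() + STOP)
    if add_generation_prompt:
        # leave an open assistant turn for the model to complete
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

def cut_at_stop(text, stop=STOP):
    """Drop the stop string and anything generated after it."""
    return text.split(stop, 1)[0]

prompt = build_llama3_prompt([
    {"role": "system", "content": "Write the character's next reply in a fictional roleplay chat."},
    {"role": "user", "content": "hello there"},
])
print(prompt)
print(cut_at_stop("hi! nice to meet you<|eot_id|>leftover tokens"))
```

most backends let you pass the stop string directly, in which case the manual cut_at_stop step is not needed.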

ST jsons: instruct context

agnaistic prompt:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{#if system}}<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{system}}<|eot_id|>{{/if}}Write {{char}}'s next reply in a fictional roleplay chat between {{#each bot}}{{.name}}, {{/each}}{{char}} and {{user}}.

{{char}}'s Persona: {{personality}}

{{#if memory}}
Important details:
{{memory}}
{{/if}}

{{#if example_dialogue}}This is how {{char}} should talk:
{{example_dialogue}}{{/if}}

This scenario of the conversation: {{scenario}}

Then the roleplay chat between {{#each bot}}{{.name}}, {{/each}}{{char}} and {{user}} begins.<|eot_id|>

{{#each msg}}{{#if .isbot}}<|start_header_id|>response<|end_header_id|>{{/if}}{{#if .isuser}}<|start_header_id|>user<|end_header_id|>{{/if}}{{.name}}: {{.msg}}<|eot_id|>
{{/each}}
{{#if ujb}}<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{ujb}}<|eot_id|>{{/if}}
<|start_header_id|>response<|end_header_id|>{{post}}