Chatty-2x8B

Description

After some testing, fine-tuning, and multiple merges of Llama-3 models, here is something a little different.

This model is a MoE of two Llama-3 models, each trained on a different RP format.

This repo contains FP16 files of Chatty-2x8B.

The idea

I started with two separate Llama-3-Instruct-8B models, each fine-tuned for specific RP formats.

Here are two simple examples of how each expert was trained.

  • Expert 1: This model is trained to handle RP that requires actions and descriptions between asterisks. For example:
    *nods* Yes, I understand.
    
  • Expert 2: This model is fine-tuned for plain text RP where characters’ dialogues and actions are described straightforwardly. For example:
    Nods. "Yes, I understand."
    

My initial idea was to make an 11B or bigger Llama-3 model, or just make a 2x8B from existing models, but I ran into issues: they were not stable enough. Even after DPO and FFT on top of my frankenmerge/MoE of Llama-3, they were not working well enough to release.

So I tried the idea of training two different RP formats on two separate Llama-3-Instruct-8B models instead, and it worked pretty well!
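
For intuition, here is a minimal PyTorch sketch of what 2-expert routing does: a small learned gate scores each token and blends the two experts' outputs. This is an illustrative toy with assumed Llama-3-8B layer sizes, not the actual merge or training code for this model.

```python
# Illustrative toy of 2-expert routing, NOT the actual Chatty-2x8B code.
# Sizes are assumptions (Llama-3-8B defaults), and the FFN is simplified
# (real Llama-3 uses a gated SwiGLU FFN).
import torch
import torch.nn as nn

class TwoExpertMoE(nn.Module):
    def __init__(self, hidden: int = 4096, ff: int = 14336):
        super().__init__()
        # One FFN per expert: one for asterisk-style RP, one for plain-text RP.
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden, ff), nn.SiLU(), nn.Linear(ff, hidden))
             for _ in range(2)]
        )
        # The gate scores, per token, how much each expert should contribute.
        self.gate = nn.Linear(hidden, 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(x), dim=-1)             # (batch, seq, 2)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (batch, seq, hidden, 2)
        return (outs * weights.unsqueeze(-2)).sum(dim=-1)         # (batch, seq, hidden)

x = torch.randn(1, 8, 4096)
print(TwoExpertMoE()(x).shape)  # torch.Size([1, 8, 4096])
```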

The dataset

Based on the success of Lumimaid 8B OAS, I kept the same balance between RP and non-RP data in the dataset: at most 50% non-RP data on each side.

The RP data was different for each expert, with some exceptions; the non-RP data was exactly the same. Despite that, I couldn't get the model to produce repetition, so using the non-RP datasets twice didn't hurt the model in the end.
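
As a rough illustration of that balance, here is a minimal sketch that caps the non-RP share of a mix at 50%; the function and the row lists are hypothetical, not the actual dataset tooling.

```python
# Hypothetical sketch of the dataset balance described above: the non-RP
# share of the final mix is capped at 50%. rp_rows/non_rp_rows stand in
# for lists of training samples, not the actual datasets.
import random

def mix_dataset(rp_rows, non_rp_rows, max_non_rp_ratio=0.5, seed=42):
    rng = random.Random(seed)
    # non_rp / (rp + non_rp) <= r  =>  non_rp <= rp * r / (1 - r)
    budget = int(len(rp_rows) * max_non_rp_ratio / (1 - max_non_rp_ratio))
    sampled = rng.sample(non_rp_rows, min(budget, len(non_rp_rows)))
    mixed = rp_rows + sampled
    rng.shuffle(mixed)
    return mixed

# With the 50% cap, non-RP rows can at most match the RP row count:
mixed = mix_dataset(["rp"] * 100, ["chat"] * 500)
print(len(mixed))  # 200
```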

Prompt template: Llama3

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>
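
For reference, here is a minimal sketch that renders this template into a single prompt string for a text-completion backend; the helper name and example strings are placeholders, not part of the model card.

```python
# Minimal sketch: build one Llama3-format prompt string from the template
# above. Generation then continues after the final assistant header.
def llama3_prompt(system_prompt: str, user_input: str) -> str:
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(llama3_prompt("You are a roleplay character.", "*waves* Hi!"))
```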

Others

Undi: If you want to support us, you can do so here.

IkariDev: Visit my retro/neocities style website please kek

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| arc_challenge | 1 | none | 0 | acc | 0.5469 | ± 0.0145 |
| | | none | 0 | acc_norm | 0.5853 | ± 0.0144 |
| arc_easy | 1 | none | 0 | acc | 0.8308 | ± 0.0077 |
| | | none | 0 | acc_norm | 0.8258 | ± 0.0078 |
| gsm8k | 3 | strict-match | 5 | exact_match | 0.7149 | ± 0.0124 |
| | | flexible-extract | 5 | exact_match | 0.7096 | ± 0.0125 |
| hellaswag | 1 | none | 0 | acc | 0.5945 | ± 0.0049 |
| | | none | 0 | acc_norm | 0.7806 | ± 0.0041 |
| piqa | 1 | none | 0 | acc | 0.7943 | ± 0.0094 |
| | | none | 0 | acc_norm | 0.7998 | ± 0.0093 |
| truthfulqa_mc2 | 2 | none | 0 | acc | 0.5097 | ± 0.0150 |
| winogrande | 1 | none | 0 | acc | 0.7356 | ± 0.0124 |
Model size: 13.7B params (BF16, Safetensors)

Model tree for Undi95/Llama-3-Chatty-2x8B: 2 quantized models available.