serpdotai/sparsetral-16x7B-v2-SPIN_iter1

This model is sparsetral-16x7B-v2 further tuned with SPIN on OpenHermes-2.5, mixed with traditional DPO samples. This is iteration 1; further training runs are temporarily paused in favor of using DoRA over LoRA. We may also start from the beginning with v3 for proper chat token support, and we are debating adding function tokens and function calling. If there are tasks that Sparsetral has been weak at, feel free to send us some prompts/chats with desired completions and we will see about making sure your task is supported!

Kuru~ Kuru~

Training

8x A6000s
Base model is sparsetral-16x7B-v2-SPIN_iter0
Forked version of unsloth for efficient training
Sequence Length: 4096
Effective batch size: 64
Learning Rate: 5e-7 with linear decay (0.1 warmup ratio)
Epochs: 2
100k samples (50k new SPIN + 50k from iter_0)
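
To give a feel for what these numbers imply, here is a rough sketch of the step count and the learning-rate curve. It assumes the last partial batch of each epoch is kept and that warmup is linear from zero, neither of which is stated above; `lr_at` is a hypothetical helper written for this illustration and is not taken from the training code.

```python
import math

# Values taken from the training list above.
NUM_SAMPLES = 100_000
EFFECTIVE_BATCH_SIZE = 64
EPOCHS = 2
PEAK_LR = 5e-7
WARMUP_RATIO = 0.1

# Assumption: the final partial batch of each epoch is kept (rounded up).
steps_per_epoch = math.ceil(NUM_SAMPLES / EFFECTIVE_BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
# Assumption: warmup length is the warmup ratio times the total steps, rounded up.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * (step / warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * (remaining / (total_steps - warmup_steps))


if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"total steps:     {total_steps}")
    print(f"warmup steps:    {warmup_steps}")
    for step in (0, warmup_steps, total_steps // 2, total_steps):
        print(f"step {step:>5}: lr = {lr_at(step):.3e}")
```

Under these assumptions the run comes to 3126 optimizer steps, of which the first 313 are warmup.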

QLoRA:

256 r and 256 alpha

target_modules=[
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
    "gate_proj",
    "up_proj",
    "down_proj",
    "adapter_down",
    "adapter_up",
]
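
As a small illustration of what r = 256 and alpha = 256 mean in practice, the sketch below collects the settings in a plain dict and computes two standard LoRA quantities: the scaling factor alpha / r, and the number of extra trainable parameters a rank-r adapter adds to one linear layer. The helper names are hypothetical, and the 4096 x 4096 layer shape is an assumption used only for the example; it is not stated above.

```python
# QLoRA adapter settings as listed above, gathered into a plain dict.
LORA_SETTINGS = {
    "r": 256,
    "lora_alpha": 256,
    "target_modules": [
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "adapter_down",
        "adapter_up",
    ],
}


def lora_scaling(r: int, alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update (alpha / r)."""
    return alpha / r


def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added by one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    r = LORA_SETTINGS["r"]
    alpha = LORA_SETTINGS["lora_alpha"]
    print("scaling:", lora_scaling(r, alpha))
    # Assumption: a 4096 x 4096 projection, chosen only for illustration.
    print("extra params for one 4096x4096 layer:", lora_extra_params(4096, 4096, r))
```

With r equal to alpha, the scaling factor is 1, so the low-rank update is added to the frozen weights without rescaling.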

Prompt Format

<|im_start|>system\n{message}<|im_end|>\n<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load this checkpoint; trust_remote_code=True lets transformers run the model code shipped in the repo.
tokenizer = AutoTokenizer.from_pretrained("serpdotai/sparsetral-16x7B-v2-SPIN_iter1", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("serpdotai/sparsetral-16x7B-v2-SPIN_iter1", device_map="auto", trust_remote_code=True).eval()

system_str = "<|im_start|>system\n{message}<|im_end|>\n"
user_str = "<|im_start|>user\n{message}<|im_end|>\n"
assistant_str = "<|im_start|>assistant\n{message}<|im_end|>\n"

def construct_prompt(messages):
    prompt = ""
    for message in messages:
        if message["from"] in ["human", "user"]:
            prompt += user_str.format(
                message=message["value"]
            )
        elif message["from"] in ["gpt", "assistant"]:
            prompt += assistant_str.format(
                message=message["value"]
            )
        elif message["from"] in ["system", "instruction"]:
            prompt += system_str.format(
                message=message["value"]
            )
        else:
            raise ValueError(
                f"Unknown message type: {message['from']}"
            )
    return prompt + "<|im_start|>assistant\n"

system = "You are a helpful assistant who will help the user to the best of their ability. If you don't know something, say \"I don't know\""
user = "Are you sentient?"

messages = [
    {"from": "system", "value": system},
    {"from": "user", "value": user},
]

prompt = construct_prompt(messages)
inputs = tokenizer(prompt, return_tensors="pt")
inputs = inputs.to(model.device)
pred = model.generate(**inputs, max_length=4096, do_sample=True, top_k=50, top_p=0.99, temperature=0.9, num_return_sequences=1)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
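
The final print above decodes the whole sequence, so the prompt text appears together with the reply. For decoder-only models, `generate` returns the prompt tokens followed by the new tokens, so one common way to show only the reply is to slice off the prompt length before decoding. The sketch below demonstrates the slicing on plain lists so it runs without the model; `new_tokens_only` is a hypothetical helper and the token ids are made up.

```python
from typing import List


def new_tokens_only(output_ids: List[int], prompt_length: int) -> List[int]:
    """Drop the first prompt_length ids, keeping only the generated continuation."""
    return output_ids[prompt_length:]


if __name__ == "__main__":
    # Made-up ids: the first four stand in for the prompt, the rest for the reply.
    prompt_ids = [1, 32001, 1587, 13]
    generated = prompt_ids + [315, 837, 264, 2]
    print(new_tokens_only(generated, len(prompt_ids)))

    # With the objects from the usage snippet, the equivalent would be:
    # prompt_length = inputs["input_ids"].shape[1]
    # print(tokenizer.decode(pred[0][prompt_length:], skip_special_tokens=True))
```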

Other Information

Paper reference: Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks

Original Paper repo

Forked repo with mistral support (sparsetral)

If you are interested in faster inference, check out our fork of vLLM that adds sparsetral support
