Synthia-v3.0-11B

SynthIA-v3.0-11B (Synthetic Intelligent Agent) is a general-purpose Large Language Model (LLM). It was trained on the Synthia-v3.0 dataset, which contains the Generalized Tree-of-Thought prompt plus 10 new long-form system contexts.

This model was trained following the principles of the LIMA (Less Is More for Alignment) paper, using ~10K high-quality samples generated with GPT-4-Turbo. It has been fine-tuned for instruction following and for long-form conversations.



To evoke generalized Tree of Thought + Chain of Thought reasoning, you may use the following system message:

Elaborate on the topic using a Tree of Thoughts and backtrack when necessary to construct a clear, cohesive Chain of Thought reasoning. Always answer without hesitation.

Evaluation

We evaluated Synthia-v3.0-11B on a wide range of tasks using the Language Model Evaluation Harness from EleutherAI.

The table below lists the metrics used by the HuggingFaceH4 Open LLM Leaderboard. Results are to follow.

Task Metric Value
arc_challenge acc_norm
hellaswag acc_norm
mmlu acc_norm
truthfulqa_mc mc2
Total Average -
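
Once the per-task values are available, the Total Average row can be filled in from them. The sketch below assumes the total is the unweighted mean of the four rows above, which this card does not state explicitly. The names total_average and LEADERBOARD_TASKS are hypothetical helpers, not part of the evaluation harness.

```python
from statistics import mean
from typing import Dict

# Task names copied from the table above.
LEADERBOARD_TASKS = ("arc_challenge", "hellaswag", "mmlu", "truthfulqa_mc")


def total_average(scores: Dict[str, float]) -> float:
    """Return the unweighted mean of the four leaderboard metrics.

    Assumption: the "Total Average" row is a plain mean of the task scores.
    """
    missing = [task for task in LEADERBOARD_TASKS if task not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    return mean(scores[task] for task in LEADERBOARD_TASKS)
```

Raising on a missing task keeps a partially filled table from silently producing an average over fewer than four benchmarks.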

Example Usage

Here is the prompt format:

SYSTEM: Elaborate on the topic using a Tree of Thoughts and backtrack when necessary to construct a clear, cohesive Chain of Thought reasoning. Always answer without hesitation.
USER: What is the difference between an Orca, Dolphin and a Seal?
ASSISTANT:
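
For multi-turn use, the same format can be assembled programmatically. This is a minimal sketch: build_prompt and its parameters are hypothetical names, and the exact whitespace between turns (a single newline here) is an assumption based on the format shown above.

```python
from typing import List, Tuple

DEFAULT_SYSTEM = (
    "Elaborate on the topic using a Tree of Thoughts and backtrack when "
    "necessary to construct a clear, cohesive Chain of Thought reasoning. "
    "Always answer without hesitation."
)


def build_prompt(system: str, history: List[Tuple[str, str]], user_input: str) -> str:
    """Assemble a SYSTEM/USER/ASSISTANT prompt that ends with an open ASSISTANT turn.

    history holds earlier (user message, assistant reply) pairs, oldest first.
    """
    lines = [f"SYSTEM: {system}"]
    for user_turn, assistant_turn in history:
        lines.append(f"USER: {user_turn}")
        lines.append(f"ASSISTANT: {assistant_turn}")
    lines.append(f"USER: {user_input}")
    lines.append("ASSISTANT:")
    return "\n".join(lines)


if __name__ == "__main__":
    question = "What is the difference between an Orca, Dolphin and a Seal?"
    print(build_prompt(DEFAULT_SYSTEM, [], question))
```

Keeping the history as explicit pairs, rather than one growing string, makes it easier to drop old turns if the prompt approaches the model's context limit.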

The code example below shows how to use this model:

import torch, json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "migtissera/Synthia-v3.0-11B"
output_file_path = "./Synthia-v3.0-11B-conversations.jsonl"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
    load_in_8bit=False,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)


def generate_text(instruction):
    # Tokenize the prompt and move it to the device the model was loaded on
    tokens = tokenizer.encode(instruction)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to(model.device)

    instance = {
        "input_ids": tokens,
        "top_p": 1.0,
        "temperature": 0.75,
        "generate_len": 1024,
        "top_k": 50,
    }

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length + instance["generate_len"],
            use_cache=True,
            do_sample=True,
            top_p=instance["top_p"],
            temperature=instance["temperature"],
            top_k=instance["top_k"],
            num_return_sequences=1,
        )
    # Keep only the newly generated tokens and cut at the next "USER:" turn, if any
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    answer = string.split("USER:")[0].strip()
    return answer


conversation = "SYSTEM: Elaborate on the topic using a Tree of Thoughts and backtrack when necessary to construct a clear, cohesive Chain of Thought reasoning. Always answer without hesitation."


while True:
    user_input = input("You: ")
    llm_prompt = f"{conversation} \nUSER: {user_input} \nASSISTANT: "
    answer = generate_text(llm_prompt)
    print(answer)
    conversation = f"{llm_prompt}{answer}"
    json_data = {"prompt": user_input, "answer": answer}

    # Append this turn to the JSONL conversation log
    with open(output_file_path, "a") as output_file:
        output_file.write(json.dumps(json_data) + "\n")
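
The loop above appends one JSON object per turn to the output file. To read that log back, something like the following should work. It is a sketch only: load_conversations is a hypothetical helper, and it assumes each non-empty line holds one object with the "prompt" and "answer" keys written above.

```python
import json
from typing import Dict, List


def load_conversations(path: str) -> List[Dict[str, str]]:
    """Read a JSONL conversation log into a list of {"prompt", "answer"} dicts."""
    records = []
    with open(path, "r", encoding="utf-8") as log_file:
        for line in log_file:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

For example, load_conversations("./Synthia-v3.0-11B-conversations.jsonl") would return the saved turns in the order they were written.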

Limitations & Biases:

While this model aims for accuracy, it can occasionally produce inaccurate or misleading results.

Despite diligent efforts in refining the pretraining data, there remains a possibility for the generation of inappropriate, biased, or offensive content.

Exercise caution and cross-check information when necessary. This is an uncensored model.

