
Llama3-8B-Instruct-Slerp

Llama3-8B-Instruct-Slerp is a merge of the following models using LazyMergekit:

* yuvraj17/EvolCodeLlama-3.1-8B-Instruct
* yzhuang/Meta-Llama-3-8B-Instruct_fictional_gsm8k_English_v1

🧩 Configuration

slices:
  - sources:
      - model: yuvraj17/EvolCodeLlama-3.1-8B-Instruct
        layer_range: [0, 32]
      - model: yzhuang/Meta-Llama-3-8B-Instruct_fictional_gsm8k_English_v1
        layer_range: [0, 32]
merge_method: slerp
base_model: yuvraj17/EvolCodeLlama-3.1-8B-Instruct
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: float16
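
As a rough illustration of what the slerp merge method does: each pair of corresponding weight tensors is blended by spherical linear interpolation, where t=0 keeps the base model's weights and t=1 takes the other model's. The t lists above define an interpolation gradient across layer groups, with separate schedules for self-attention and MLP tensors and a default of 0.5 for everything else. Below is a minimal NumPy sketch of the core operation (simplified for intuition; not mergekit's actual implementation, which also handles dtypes, per-filter schedules, and edge cases):

import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two weight tensors.
    t=0 returns v0 (the base model), t=1 returns v1 (the other model)."""
    v0_flat, v1_flat = v0.ravel(), v1.ravel()
    # Angle between the two weight vectors.
    dot = np.dot(v0_flat, v1_flat) / (np.linalg.norm(v0_flat) * np.linalg.norm(v1_flat) + eps)
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    if theta < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    # Slerp weights interpolate along the arc between the two vectors,
    # preserving the angular geometry of the weights.
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1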

💻 Usage

!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "yuvraj17/Llama3-8B-Instruct-Slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=512, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])

Example output (truncated by the max_new_tokens limit):

A large language model is a computer program that can process and generate human language. It is typically trained on a large corpus of text data and uses machine learning algorithms to learn the patterns and structures of language. Large language models are able to generate coherent and context-specific text based on a given prompt or input. There are two main types of large language models: generative and discriminative. Generative models aim to generate new text based on the patterns they've learned from the training data. Discriminative models aim to classify or label the input text as belonging to a particular class or genre. Some of the key characteristics of large language models include:

  1. Scale: Large language models are trained on vast amounts of text data, often measured in gigabytes or terabytes. This large scale allows them to learn complex patterns and structures of language.
  2. Complexity: The internal workings of a large language model can be quite complex. They use advanced algorithms and mathematical techniques, such as transformers and recurrent neural networks, to process and generate text.
  3. Flexibility: Large language models can be fine-tuned for specific tasks or domains by adjusting their hyperparameters or adding small amounts of task-specific data. This allows them to adapt to new domains or tasks.
  4. Interpretability: While large language models are not always interpretable, some models provide insights into their decision-making process. This can be useful for tasks like question-answering or text summarization, where understanding the reasoning behind the model's output is important.
  5. Evaluation: The performance of a large language model is typically evaluated using metrics such as perplexity, BLEU, or ROUGE. These metrics measure how well the model can predict the next word or sentence in a sequence, or how well it can generate coherent and meaningful text.

Some examples of large language models include:

  1. BERT (Bidirectional Encoder Representations from Transformers): Developed by Google, BERT is a widely used pre-trained language model that can be fine-tuned for a variety of NLP tasks.
  2. RoBERTa (Robust BERT): Developed by Salesforce, RoBERTa is a variation of BERT that has been fine-tuned on a larger corpus of text data.
  3. XLNet (Extreme Large Neural Network): Developed by Microsoft Research, XLNet is a large transformer-based language model that can process long-range dependencies in text data.
  4. T5 (Text-to-Text Transfer Transformer): Developed by Google, T5 is a large encoder-decoder model that can perform a wide
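
If you prefer explicit control over tokenization and decoding instead of the pipeline helper, the same generation can be sketched with AutoModelForCausalLM directly (the sampling parameters mirror the pipeline call above):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yuvraj17/Llama3-8B-Instruct-Slerp"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

messages = [{"role": "user", "content": "What is a large language model?"}]
# Tokenize the chat-formatted prompt and move it to the model's device.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))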

🏆 Evaluation Scores

Nous

| Model | AGIEval | TruthfulQA | Bigbench |
|---|---:|---:|---:|
| yuvraj17/Llama3-8B-Instruct-Slerp | 33.28 | 49.78 | 35.38 |

AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---:|
| agieval_aqua_rat | 0 | acc | 23.62 | ± 2.67 |
| | | acc_norm | 22.05 | ± 2.61 |
| agieval_logiqa_en | 0 | acc | 27.50 | ± 1.75 |
| | | acc_norm | 31.80 | ± 1.83 |
| agieval_lsat_ar | 0 | acc | 21.30 | ± 2.71 |
| | | acc_norm | 20.87 | ± 2.69 |
| agieval_lsat_lr | 0 | acc | 35.29 | ± 2.12 |
| | | acc_norm | 37.65 | ± 2.15 |
| agieval_lsat_rc | 0 | acc | 42.01 | ± 3.01 |
| | | acc_norm | 39.78 | ± 2.99 |
| agieval_sat_en | 0 | acc | 55.83 | ± 3.47 |
| | | acc_norm | 50.49 | ± 3.49 |
| agieval_sat_en_without_passage | 0 | acc | 36.89 | ± 3.37 |
| | | acc_norm | 34.95 | ± 3.33 |
| agieval_sat_math | 0 | acc | 29.55 | ± 3.08 |
| | | acc_norm | 28.64 | ± 3.05 |

Average score: 33.28%
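
The reported average appears to be the unweighted mean of the acc_norm values above, which can be checked quickly:

# acc_norm values from the AGIEval table, in task order.
acc_norm = [22.05, 31.80, 20.87, 37.65, 39.78, 50.49, 34.95, 28.64]
print(round(sum(acc_norm) / len(acc_norm), 2))  # 33.28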

TruthfulQA

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---:|
| truthfulqa_mc | 1 | mc1 | 33.54 | ± 1.65 |
| | | mc2 | 49.78 | ± 1.53 |

Average score: 49.78%

BigBench

| Task | Version | Metric | Value | Stderr |
|---|---:|---|---:|---:|
| bigbench_causal_judgement | 0 | multiple_choice_grade | 47.89 | ± 3.63 |
| bigbench_date_understanding | 0 | multiple_choice_grade | 39.02 | ± 2.54 |
| bigbench_disambiguation_qa | 0 | multiple_choice_grade | 33.72 | ± 2.95 |
| bigbench_geometric_shapes | 0 | multiple_choice_grade | 20.61 | ± 2.14 |
| bigbench_logical_deduction_five_objects | 0 | multiple_choice_grade | 31.40 | ± 2.08 |
| bigbench_logical_deduction_seven_objects | 0 | multiple_choice_grade | 23.71 | ± 1.61 |
| bigbench_logical_deduction_three_objects | 0 | multiple_choice_grade | 47.00 | ± 2.89 |
| bigbench_movie_recommendation | 0 | multiple_choice_grade | 27.40 | ± 1.99 |
| bigbench_navigate | 0 | multiple_choice_grade | 50.10 | ± 1.58 |
| bigbench_reasoning_about_colored_objects | 0 | multiple_choice_grade | 38.40 | ± 1.09 |
| bigbench_ruin_names | 0 | multiple_choice_grade | 27.23 | ± 2.11 |
| bigbench_salient_translation_error_detection | 0 | multiple_choice_grade | 25.45 | ± 1.38 |
| bigbench_snarks | 0 | multiple_choice_grade | 46.41 | ± 3.72 |
| bigbench_sports_understanding | 0 | multiple_choice_grade | 50.30 | ± 1.59 |
| bigbench_temporal_sequences | 0 | multiple_choice_grade | 37.30 | ± 1.53 |
| bigbench_tracking_shuffled_objects_five_objects | 0 | multiple_choice_grade | 21.36 | ± 1.16 |
| bigbench_tracking_shuffled_objects_seven_objects | 0 | multiple_choice_grade | 17.14 | ± 0.90 |
| bigbench_tracking_shuffled_objects_three_objects | 0 | multiple_choice_grade | 47.00 | ± 2.89 |

Average score: 35.38%
