
## Uploaded model

- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
- **Training:** a single epoch on the Alpaca dataset (see the sketch below)

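The training script itself is not included in this card, but single-epoch supervised fine-tuning on Alpaca with Unsloth commonly follows the pattern sketched below. This is a minimal sketch, not the recipe used for this model: the `yahma/alpaca-cleaned` dataset, the LoRA settings, and the hyperparameters are all assumptions, and the `SFTTrainer` call follows the older trl API that accepts `tokenizer` and `dataset_text_field` directly.

```python
# Hypothetical single-epoch SFT recipe: the dataset, LoRA settings, and
# hyperparameters below are illustrative assumptions, not the values
# actually used to train this model.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",  # base model named in this card
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach LoRA adapters so only a small fraction of the weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
)

dataset = load_dataset("yahma/alpaca-cleaned", split = "train")  # assumed dataset

def to_text(examples):
    # Render each row with the Alpaca template; EOS marks the end of the response.
    texts = [alpaca_prompt.format(ins, inp, out) + tokenizer.eos_token
             for ins, inp, out in zip(examples["instruction"], examples["input"], examples["output"])]
    return {"text": texts}

dataset = dataset.map(to_text, batched = True)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        num_train_epochs = 1,  # the single epoch noted above
        learning_rate = 2e-4,
        output_dir = "outputs",
    ),
)
trainer.train()
```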

The model can be loaded and prompted with Unsloth as follows:

```python
import torch
from unsloth import FastLanguageModel

max_seq_length = 2048
dtype = None          # None lets Unsloth auto-detect (e.g. bfloat16 on Ampere+)
load_in_4bit = True   # load the 4-bit quantized weights

# Report the GPU and its total / currently reserved memory.
gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

# Alpaca prompt template: an instruction, an optional input, and the
# response slot the model is asked to complete.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "vincentoh/llama3-alpaca-dpo-instruct",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

input_question = "Why is the sky blue?"
# Fill the instruction slot; leave input and response empty so the model
# generates the response.
inputs = tokenizer([alpaca_prompt.format(input_question, "", "")], return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
print(tokenizer.batch_decode(outputs))
```
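
`batch_decode` returns the full prompt plus completion as one string. As an optional variation (not part of the original card), transformers' `TextStreamer` can be passed to `generate` to print tokens as they are produced:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated rather than after decoding.
text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 64, use_cache = True)
```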

|                                          Model                                          |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|-----------------------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[llama3-alpaca-dpo-instruct](https://huggingface.co/vincentoh/llama3-alpaca-dpo-instruct)|  30.67|  70.01|     46.56|   37.30|  46.14|
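
The Average column is the arithmetic mean of the four suite scores, as a quick check confirms:

```python
# Mean of the four suite averages reported above.
scores = {"AGIEval": 30.67, "GPT4All": 70.01, "TruthfulQA": 46.56, "Bigbench": 37.30}
print(sum(scores.values()) / len(scores))  # 46.135, reported as 46.14
```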

### AGIEval
|             Task             |Version| Metric |Value|   |Stderr|
|------------------------------|------:|--------|----:|---|-----:|
|agieval_aqua_rat              |      0|acc     |19.69|±  |  2.50|
|                              |       |acc_norm|22.83|±  |  2.64|
|agieval_logiqa_en             |      0|acc     |29.34|±  |  1.79|
|                              |       |acc_norm|32.72|±  |  1.84|
|agieval_lsat_ar               |      0|acc     |20.87|±  |  2.69|
|                              |       |acc_norm|20.43|±  |  2.66|
|agieval_lsat_lr               |      0|acc     |34.31|±  |  2.10|
|                              |       |acc_norm|31.57|±  |  2.06|
|agieval_lsat_rc               |      0|acc     |43.87|±  |  3.03|
|                              |       |acc_norm|35.32|±  |  2.92|
|agieval_sat_en                |      0|acc     |51.46|±  |  3.49|
|                              |       |acc_norm|39.81|±  |  3.42|
|agieval_sat_en_without_passage|      0|acc     |37.38|±  |  3.38|
|                              |       |acc_norm|28.16|±  |  3.14|
|agieval_sat_math              |      0|acc     |37.73|±  |  3.28|
|                              |       |acc_norm|34.55|±  |  3.21|

Average: 30.67%

### GPT4All
|    Task     |Version| Metric |Value|   |Stderr|
|-------------|------:|--------|----:|---|-----:|
|arc_challenge|      0|acc     |50.60|±  |  1.46|
|             |       |acc_norm|53.41|±  |  1.46|
|arc_easy     |      0|acc     |79.84|±  |  0.82|
|             |       |acc_norm|78.28|±  |  0.85|
|boolq        |      1|acc     |80.18|±  |  0.70|
|hellaswag    |      0|acc     |59.39|±  |  0.49|
|             |       |acc_norm|78.33|±  |  0.41|
|openbookqa   |      0|acc     |34.40|±  |  2.13|
|             |       |acc_norm|45.20|±  |  2.23|
|piqa         |      0|acc     |79.54|±  |  0.94|
|             |       |acc_norm|80.85|±  |  0.92|
|winogrande   |      0|acc     |73.80|±  |  1.24|

Average: 70.01%

### TruthfulQA
|    Task     |Version|Metric|Value|   |Stderr|
|-------------|------:|------|----:|---|-----:|
|truthfulqa_mc|      1|mc1   |30.23|±  |  1.61|
|             |       |mc2   |46.56|±  |  1.40|

Average: 46.56%

### Bigbench
|                      Task                      |Version|       Metric        |Value|   |Stderr|
|------------------------------------------------|------:|---------------------|----:|---|-----:|
|bigbench_causal_judgement                       |      0|multiple_choice_grade|54.74|±  |  3.62|
|bigbench_date_understanding                     |      0|multiple_choice_grade|67.75|±  |  2.44|
|bigbench_disambiguation_qa                      |      0|multiple_choice_grade|29.07|±  |  2.83|
|bigbench_geometric_shapes                       |      0|multiple_choice_grade|27.86|±  |  2.37|
|                                                |       |exact_str_match      | 0.00|±  |  0.00|
|bigbench_logical_deduction_five_objects         |      0|multiple_choice_grade|24.80|±  |  1.93|
|bigbench_logical_deduction_seven_objects        |      0|multiple_choice_grade|17.00|±  |  1.42|
|bigbench_logical_deduction_three_objects        |      0|multiple_choice_grade|42.33|±  |  2.86|
|bigbench_movie_recommendation                   |      0|multiple_choice_grade|30.80|±  |  2.07|
|bigbench_navigate                               |      0|multiple_choice_grade|55.60|±  |  1.57|
|bigbench_reasoning_about_colored_objects        |      0|multiple_choice_grade|54.65|±  |  1.11|
|bigbench_ruin_names                             |      0|multiple_choice_grade|32.37|±  |  2.21|
|bigbench_salient_translation_error_detection    |      0|multiple_choice_grade|28.66|±  |  1.43|
|bigbench_snarks                                 |      0|multiple_choice_grade|46.41|±  |  3.72|
|bigbench_sports_understanding                   |      0|multiple_choice_grade|55.48|±  |  1.58|
|bigbench_temporal_sequences                     |      0|multiple_choice_grade|25.30|±  |  1.38|
|bigbench_tracking_shuffled_objects_five_objects |      0|multiple_choice_grade|21.36|±  |  1.16|
|bigbench_tracking_shuffled_objects_seven_objects|      0|multiple_choice_grade|14.91|±  |  0.85|
|bigbench_tracking_shuffled_objects_three_objects|      0|multiple_choice_grade|42.33|±  |  2.86|

Average: 37.30%

Average score: 46.14%