license: apache-2.0
tags:
  - code

Cerebras-GPT-2.7B-Alpaca-SP

Cerebras-GPT-2.7B-Alpaca-SP is cerebras/Cerebras-GPT-2.7B fine-tuned on a modified Alpaca dataset with a shorter prompt structure (Human:...\n\nAssistant:). This repository contains the float16 base model merged with the lxe/lora-cerebras-gpt2.7b-alpaca-shortprompt LoRA.
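A merge like the one described above could be reproduced roughly as follows. This is only a sketch: it assumes the adapter is a standard PEFT LoRA and that the `peft` library is used for merging, neither of which this card states. The function name `merge_lora_into_base` is hypothetical.

```python
BASE_MODEL = "cerebras/Cerebras-GPT-2.7B"
LORA_ADAPTER = "lxe/lora-cerebras-gpt2.7b-alpaca-shortprompt"


def merge_lora_into_base(base_name: str = BASE_MODEL, adapter_name: str = LORA_ADAPTER):
    """Load the base model in float16, apply the LoRA adapter, and fold it into the weights.

    Assumption: the adapter is in PEFT format; the card does not say how the
    published merge was actually produced.
    """
    # Imported lazily so this file can be loaded without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype=torch.float16)
    with_adapter = PeftModel.from_pretrained(base, adapter_name)
    # merge_and_unload() adds the LoRA deltas to the base weights and
    # returns a plain transformers model without the adapter wrappers.
    return with_adapter.merge_and_unload()
```

Calling `merge_lora_into_base()` downloads both checkpoints; the returned model can then be written to disk with `save_pretrained`.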

Limitations:

Although its output is mostly coherent, the model tends to hallucinate significantly, often producing erroneous information.

Basic Usage:

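import torch
import transformers

# Loading in 8-bit (below) requires the bitsandbytes and accelerate packages and a CUDA GPU.
tokenizer = transformers.AutoTokenizer.from_pretrained('lxe/Cerebras-GPT-2.7B-Alpaca-SP')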

model = transformers.AutoModelForCausalLM.from_pretrained(
    'lxe/Cerebras-GPT-2.7B-Alpaca-SP',
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)

prompt = "Human: how old is the sun?\n\nAssistant:"
input_ids = tokenizer.encode(prompt, return_tensors="pt").cuda()

with torch.no_grad():
    output = model.generate(
        input_ids=input_ids,
        max_length=100,
        early_stopping=True,
    )
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
Human: how old is the sun?

Assistant: The sun is approximately 4.5 billion years old.
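Because the decoded text includes the prompt itself, it can help to wrap the short prompt format in two small helpers. The sketch below is illustrative only: `build_prompt` and `extract_answer` are hypothetical names, not part of this repository, and cutting the text at a following "Human:" turn is an assumption about how the model may continue generating.

```python
PROMPT_TEMPLATE = "Human: {instruction}\n\nAssistant:"


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the short Human/Assistant prompt format."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_answer(generated_text: str) -> str:
    """Return only the assistant's reply from the decoded generation."""
    # Keep everything after the first "Assistant:" marker.
    _, _, answer = generated_text.partition("Assistant:")
    # Drop any further "Human:" turn the model may have invented (assumption).
    answer = answer.split("\n\nHuman:", 1)[0]
    return answer.strip()
```

For example, `extract_answer(generated_text)` applied to the output above yields just the sentence about the sun's age.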

Example Outputs:

Human: how old is the sun?
Assistant: The sun is approximately 4.5 billion years old.

Human: what are the highest grossing films of the 2000-2010 decade?
Assistant: The highest grossing film in 2010 was Avatar, which earned $2.5 billion worldwide and is currently ranked as one of the top 10 movies ever made by Box Office Mojo. Other notable high-grossing films include Titanic ($1.8 billion), Jurassic World ($1.6 billion) and Harry Potter And The Deathly Hallows Part 2 ($1.4 billion).

Human: how is cheese made?
Assistant: Cheese making involves the process of curdling milk, separating it into a liquid and solid phase, then heating that mixture to create a semi-solid mass. The resulting product can be either hard or soft depending on its composition. Hard cheeses are typically produced from cow's milk while softer varieties come from sheep's milk.

Human: print all letters of the alphabet in reverse
Assistant: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Human: write a javascript function that sorts an array using quicksort algorithm
Assistant:

function quicksort(array) {
    if (array.length < 2) {
        return array;
    }
    var pivot = array[0];
    var left = quicksort(array, 0, array.length - 1);
    var right = quicksort(array, array.length - 1, 0);
    if (left < pivot) {
        left = pivot;
    }
    if (right > pivot) {
        right = pivot;
    }
    if (left < right) {
        quicksort(array, left, right);
    }

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 29.4 |
| ARC (25-shot) | 30.8 |
| HellaSwag (10-shot) | 48.88 |
| MMLU (5-shot) | 25.12 |
| TruthfulQA (0-shot) | 40.24 |
| Winogrande (5-shot) | 55.41 |
| GSM8K (5-shot) | 0.53 |
| DROP (3-shot) | 4.78 |
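As a quick sanity check, the reported average can be recomputed from the per-benchmark scores. This sketch assumes "Avg." is the unweighted mean of the seven benchmarks listed, which matches the numbers but is not stated explicitly here.

```python
# Scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 30.8,
    "HellaSwag (10-shot)": 48.88,
    "MMLU (5-shot)": 25.12,
    "TruthfulQA (0-shot)": 40.24,
    "Winogrande (5-shot)": 55.41,
    "GSM8K (5-shot)": 0.53,
    "DROP (3-shot)": 4.78,
}

# Unweighted mean across all benchmarks (assumed definition of "Avg.").
average = sum(scores.values()) / len(scores)
print(round(average, 1))
```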