Writer/palmyra-20b-chat


Usage


import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer

model_name = "Writer/palmyra-20b-chat"

tokenizer = AutoTokenizer.from_pretrained(model_name)

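# Load the weights in half precision. device_map="auto" requires the
# accelerate package and places the layers on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)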

prompt = "What is the meaning of life?"

input_text = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: {prompt} "
    "ASSISTANT:"
)

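# Move the inputs to the device that holds the model's first parameters
# instead of hard-coding "cuda".
model_inputs = tokenizer(input_text.format(prompt=prompt), return_tensors="pt").to(
    model.device
)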

gen_conf = {
    "top_k": 20,
    "max_new_tokens": 2048,
    "temperature": 0.6,
    "do_sample": True,
    "eos_token_id": tokenizer.eos_token_id,
}

streamer = TextStreamer(tokenizer)
if "token_type_ids" in model_inputs:
    del model_inputs["token_type_ids"]

all_inputs = {**model_inputs, **gen_conf}
output = model.generate(**all_inputs, streamer=streamer)

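print("-" * 20)
# generate() returns token IDs; decode them to get readable text.
print(tokenizer.decode(output[0], skip_special_tokens=True))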
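
The prompt template used above can be wrapped in small helpers so that it is applied the same way on every call. The following is a minimal sketch and not part of the original model card: `build_prompt` and `extract_reply` are hypothetical names, and splitting on the `ASSISTANT:` marker assumes the decoded output echoes the prompt before the model's answer.

```python
# System text and role markers copied from the usage snippet.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
ASSISTANT_MARKER = "ASSISTANT:"


def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt with the template from the usage snippet."""
    return f"{SYSTEM_PROMPT} USER: {user_message} {ASSISTANT_MARKER}"


def extract_reply(decoded_text: str) -> str:
    """Return the text after the first ASSISTANT: marker (hypothetical helper).

    If the marker is missing, the whole text is returned unchanged apart
    from surrounding whitespace.
    """
    _, marker, reply = decoded_text.partition(ASSISTANT_MARKER)
    return reply.strip() if marker else decoded_text.strip()


if __name__ == "__main__":
    prompt = build_prompt("What is the meaning of life?")
    print(prompt)
    # Simulate a decoded generation that echoes the prompt first.
    print(extract_reply(prompt + " That depends on who you ask."))
```

With these helpers, `build_prompt(prompt)` can stand in for `input_text.format(prompt=prompt)`, and `extract_reply` can be applied to the decoded output to drop the echoed prompt.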

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                 Value
Avg.                   38.97
ARC (25-shot)          43.52
HellaSwag (10-shot)    72.83
MMLU (5-shot)          35.18
TruthfulQA (0-shot)    43.17
Winogrande (5-shot)    66.46
GSM8K (5-shot)         3.94
DROP (3-shot)          7.7
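
The reported average appears to be the unweighted mean of the seven benchmark scores. This is an inference from the numbers in the table, not something the model card states. A short check:

```python
# Scores copied from the table above.
scores = {
    "ARC (25-shot)": 43.52,
    "HellaSwag (10-shot)": 72.83,
    "MMLU (5-shot)": 35.18,
    "TruthfulQA (0-shot)": 43.17,
    "Winogrande (5-shot)": 66.46,
    "GSM8K (5-shot)": 3.94,
    "DROP (3-shot)": 7.7,
}

# Unweighted mean across all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 38.97
```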