metadata
pipeline_tag: text-generation
inference: true
widget:
  - text: 'def factorial(n):'
    example_title: Factorial
    group: Python
  - text: 'def recur_fibo(n):'
    example_title: Recursive Fibonacci
    group: Python
license: llama2
library_name: transformers
tags:
  - text-generation
  - code
language:
  - en

lemur-70b-v1


📄Paper: https://arxiv.org/abs/2310.06830

👩‍💻Code: https://github.com/OpenLemur/Lemur

Use

Setup

First, install the libraries listed in requirements.txt in the GitHub repository:

pip install -r requirements.txt

Intended use

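Because this model was not trained on an instruction-following corpus, it does not respond well to questions such as "What is the Python code to do quick sort?". Prompt it with the beginning of a text or a function to continue instead, as in the examples under Generation.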
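One way to turn a question into something a base model can continue is to phrase it as the start of a function. The sketch below is only an illustration: `as_completion_prompt` is a hypothetical helper, not part of the Lemur repository or of transformers, and the signature-plus-docstring layout is an assumption about what works well.

```python
def as_completion_prompt(function_name, args, description):
    """Build a code prefix for a base model to continue (hypothetical helper)."""
    # A signature followed by a docstring gives the model a concrete prefix
    # to complete, rather than a question it was not trained to answer.
    signature = f"def {function_name}({', '.join(args)}):"
    docstring = f'    """{description}"""'
    return f"{signature}\n{docstring}\n"


# Instead of asking "What is the Python code to do quick sort?":
prompt = as_completion_prompt("quick_sort", ["items"], "Sort a list with quick sort.")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the prompts shown under Generation.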

Generation

from transformers import AutoTokenizer, AutoModelForCausalLM

# 8-bit loading requires the bitsandbytes and accelerate packages.
tokenizer = AutoTokenizer.from_pretrained("OpenLemur/lemur-70b-v1")
model = AutoModelForCausalLM.from_pretrained("OpenLemur/lemur-70b-v1", device_map="auto", load_in_8bit=True)

# Text Generation Example
prompt = "The world is "
# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# max_length counts the prompt tokens plus the generated tokens.
output = model.generate(**inputs, max_length=50, num_return_sequences=1)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

# Code Generation Example
prompt = """
def factorial(n):
    if n == 0:
        return 1
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_length=200, num_return_sequences=1)
generated_code = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_code)
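For a causal language model, the sequence returned by `generate` starts with the prompt tokens, and `max_length` bounds the prompt and the continuation together. If only the newly generated part is wanted, the prompt tokens can be sliced off. The sketch below shows the idea on plain Python lists; `split_continuation` is a hypothetical helper and the token ids are made up for illustration.

```python
def split_continuation(prompt_ids, output_ids):
    """Return only the tokens that follow the prompt in a generated sequence."""
    # Guard against sequences that do not begin with the prompt.
    if output_ids[:len(prompt_ids)] != prompt_ids:
        raise ValueError("output does not start with the prompt tokens")
    return output_ids[len(prompt_ids):]


# Made-up token ids, standing in for a tokenized prompt and a generated sequence.
prompt_ids = [1, 822, 7329]
output_ids = [1, 822, 7329, 313, 29876, 1125]

continuation = split_continuation(prompt_ids, output_ids)
print(continuation)
```

The same slice can be taken on the tensor returned by `generate`, using the number of tokens in the tokenized prompt as the offset, before decoding.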

License

The model is licensed under the Llama-2 community license agreement.

Acknowledgements

The Lemur project is an open collaborative research effort between XLang Lab and Salesforce Research. We thank Salesforce, Google Research and Amazon AWS for their gift support.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
| --- | --- |
| Avg. | 54.03 |
| ARC (25-shot) | 64.33 |
| HellaSwag (10-shot) | 85.72 |
| MMLU (5-shot) | 65.85 |
| TruthfulQA (0-shot) | 44.78 |
| Winogrande (5-shot) | 83.03 |
| GSM8K (5-shot) | 28.73 |
| DROP (3-shot) | 5.74 |
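The reported average appears to be the unweighted mean of the seven benchmark scores. That is an assumption about how the leaderboard aggregates, but it can be checked against the values listed above:

```python
# Benchmark scores copied from the results above.
scores = {
    "ARC (25-shot)": 64.33,
    "HellaSwag (10-shot)": 85.72,
    "MMLU (5-shot)": 65.85,
    "TruthfulQA (0-shot)": 44.78,
    "Winogrande (5-shot)": 83.03,
    "GSM8K (5-shot)": 28.73,
    "DROP (3-shot)": 5.74,
}

# Unweighted mean over all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")
```

The low DROP score pulls the mean down noticeably, so the average alone understates the results on the other six benchmarks.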