	---
	pipeline_tag: text-generation
	inference: true
	widget:
	- text: 'def factorial(n):'
	  example_title: Factorial
	  group: Python
	- text: 'def recur_fibo(n):'
	  example_title: Recursive Fibonacci
	  group: Python
	license: llama2
	library_name: transformers
	tags:
	- text-generation
	- code
	language:
	- en
	---

	# lemur-70b-v1

	<p align="center">
	<img src="https://huggingface.co/datasets/OpenLemur/assets/resolve/main/lemur_icon.png" width="300" height="300" alt="Lemur">
	</p>


	<div align="center">
	<img src="https://huggingface.co/datasets/OpenLemur/assets/resolve/main/lemur_base_radar.png">
	</div>

	📄Paper: https://arxiv.org/abs/2310.06830

	👩‍💻Code: https://github.com/OpenLemur/Lemur

	## Use

	### Setup

	First, install the libraries listed in `requirements.txt` from the [GitHub repository](https://github.com/OpenLemur/lemur-v1):

	```bash
	pip install -r requirements.txt
	```

	### Intended use

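	Because the model was not trained on an instruction-following corpus, it does not respond well to direct questions such as "What is the Python code to do quick sort?". Prompt it instead with the beginning of the text or code you want it to continue.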
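
As a rough illustration of completion-style prompting, the sketch below turns a request into a function stub that the model can continue. The helper `completion_prompt` and its parameters are hypothetical and not part of the Lemur codebase; it only shows one way to phrase a task as code to be completed.

```python
def completion_prompt(function_name: str, params: list[str], description: str) -> str:
    """Build a Python function stub (signature + docstring) for a completion model.

    Hypothetical helper for illustration; not part of Lemur or transformers.
    """
    signature = f"def {function_name}({', '.join(params)}):"
    docstring = f'    """{description}"""'
    return f"{signature}\n{docstring}\n"


# Instead of asking "What is the Python code to do quick sort?",
# give the model the start of the function it should write.
prompt = completion_prompt(
    "quick_sort",
    ["items"],
    "Sort a list of numbers with quick sort and return a new list.",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a question, so the model sees code it is expected to continue.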

	### Generation

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	tokenizer = AutoTokenizer.from_pretrained("OpenLemur/lemur-70b-v1")
	# load_in_8bit=True requires the bitsandbytes package; device_map="auto" places layers on the available devices
	model = AutoModelForCausalLM.from_pretrained("OpenLemur/lemur-70b-v1", device_map="auto", load_in_8bit=True)

	# Text Generation Example
	prompt = "The world is "
	# Move the tokenized prompt to the same device as the model
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	# max_length counts the prompt tokens as well as the generated ones
	output = model.generate(**inputs, max_length=50, num_return_sequences=1)
	generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
	print(generated_text)

	# Code Generation Example
	prompt = """
	def factorial(n):
	    if n == 0:
	        return 1
	"""
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	output = model.generate(**inputs, max_length=200, num_return_sequences=1)
	generated_code = tokenizer.decode(output[0], skip_special_tokens=True)
	print(generated_code)
	```
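
The decoded output above contains the prompt followed by the continuation, and a completion model may keep writing unrelated code until `max_length` is reached. A minimal post-processing sketch is shown below; the helper `extract_completion` and its default stop sequences are assumptions chosen for illustration, not something the Lemur repository provides. It also assumes the decoded text starts with the exact prompt string, which may not hold if the tokenizer normalizes whitespace.

```python
def extract_completion(generated, prompt, stop_sequences=("\ndef ", "\nclass ", "\nif __name__")):
    """Return only the newly generated text, cut at the earliest stop sequence.

    Hypothetical helper for illustration; the stop sequences are an assumption.
    """
    # Drop the echoed prompt when the decoded text starts with it.
    completion = generated[len(prompt):] if generated.startswith(prompt) else generated
    # Cut at the earliest stop sequence that occurs, if any.
    cut = len(completion)
    for stop in stop_sequences:
        index = completion.find(stop)
        if index != -1:
            cut = min(cut, index)
    return completion[:cut]


# Stand-in strings; in practice `generated` would be the decoded model output.
prompt = "def factorial(n):\n    if n == 0:\n        return 1\n"
generated = prompt + "    return n * factorial(n - 1)\n\ndef unrelated():\n    pass\n"
body = extract_completion(generated, prompt)
print(prompt + body)
```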

	# License
	The model is licensed under the Llama-2 community license agreement.

	# Acknowledgements
	The Lemur project is an open collaborative research effort between [XLang Lab](https://www.xlang.ai/) and Salesforce Research. We thank Salesforce, Google Research and Amazon AWS for their gift support.

	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_OpenLemur__lemur-70b-v1).

	| Metric                | Value |
	|-----------------------|-------|
	| Avg.                  | 54.03 |
	| ARC (25-shot)         | 64.33 |
	| HellaSwag (10-shot)   | 85.72 |
	| MMLU (5-shot)         | 65.85 |
	| TruthfulQA (0-shot)   | 44.78 |
	| Winogrande (5-shot)   | 83.03 |
	| GSM8K (5-shot)        | 28.73 |
	| DROP (3-shot)         | 5.74  |
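
The `Avg.` row is consistent with an unweighted mean of the seven benchmark scores. The short check below recomputes it from the values in the table; treating the average as a plain mean is an assumption about how the leaderboard aggregates, which the numbers happen to match.

```python
# Scores copied from the table above.
scores = {
    "ARC (25-shot)": 64.33,
    "HellaSwag (10-shot)": 85.72,
    "MMLU (5-shot)": 65.85,
    "TruthfulQA (0-shot)": 44.78,
    "Winogrande (5-shot)": 83.03,
    "GSM8K (5-shot)": 28.73,
    "DROP (3-shot)": 5.74,
}

# Unweighted mean over all benchmarks, rounded to two decimals like the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)
```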