Edit model card

Athene-Llama3-70B

Warning. This repo is a copy of the initial release of Athene-70B. Please refer to the link here for the latest version.

We introduce Athene-Llama3-70B, an open-weights LLM trained through RLHF based off Llama-3-70B-Instruct. Athene-70B achieves a high score on Arena-Hard-Auto, a proxy benchmark for Chatbot Arena.

  • Developed by: The Nexusflow Team (Evan Frick*, Peter Jin*, Tianle Li*, Karthik Ganesan, Jian Zhang, Jiantao Jiao and Banghua Zhu).
  • Model type: Chat Model
  • Finetuned from model: Llama-3-70B-Instruct.

Blog: https://nexusflow.ai/blogs/athene

Model Arena-Hard
Claude-3.5-Sonnet (Proprietary) 79.3%
GPT-4o (Proprietary) 79.2%
Athene-70B (Open) 77.8%
Gemini-Pro-1.5 (Proprietary) 72.0%
Gemma-2-27B (Open) 57.0%
Llama-3-70B (Open) 46.6%

Usage

Athene-70B uses the same chat template as Llama-3-70B-Instruct. Below is an example simple usage using the Transformers library.

import transformers
import torch

model_id = "Nexusflow/Athene-70B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an Athene Noctura, you can only speak with owl sounds. Whoooo whooo."},
    {"role": "user", "content": "Whooo are you?"},
]

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>")
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][-1])

Acknowledgment

We would like to thank the LMSYS Organization for their support of testing the model. We would like to thank Meta AI and the open source community for their efforts in providing the datasets and base models.

Citation

@misc{Athene2024,
    title = {Athene-70B: Redefining the Boundaries of Post-Training for Open Models},
    url = {https://nexusflow.ai/blogs/athene},
    author = {Frick, Evan and Jin, Peter and Li, Tianle and Ganesan, Karthik and Zhang, Jian and Jiao, Jiantao and Zhu, Banghua},    
    month = {July},
    year = {2024}
}
Downloads last month
16
Safetensors
Model size
70.6B params
Tensor type
BF16
·
Inference API
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Collection including Nexusflow/Athene-70B-Preview