Dracarys2-Llama-3.1-70B-Instruct

Built with Meta Llama 3

Introduction

We introduce the latest in the Smaug series, the Dracarys family of finetunes targeting coding performance improvements across a variety of base models.

This variant is a finetune of meta-llama/Meta-Llama-3.1-70B-Instruct.

Compared to meta-llama/Meta-Llama-3.1-70B-Instruct, Dracarys scores higher on LiveCodeBench code generation, test output prediction, and chain-of-thought code execution (see evaluation results below).

Model Description

How to use

The prompt format is unchanged from Llama 3 70B Instruct (see the evaluations for the prompt details used with LiveCodeBench).
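As a rough illustration of that prompt format, the sketch below renders a list of chat messages in the basic Llama 3 Instruct layout by hand. The helper name `build_llama3_prompt` is hypothetical and not part of any library, and the chat template bundled with the tokenizer may add extra system text, so `apply_chat_template` remains the reliable way to build prompts in practice.

```python
def build_llama3_prompt(messages):
    """Render chat messages in the basic Llama 3 Instruct layout.

    Hypothetical helper for illustration only; the tokenizer's own
    chat template is the authoritative source for the exact format.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # Open an assistant turn so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "Reverse a string in Python."},
]
example_prompt = build_llama3_prompt(example_messages)
print(example_prompt)
```

The `<|eot_id|>` marker that closes each turn is the same token the usage snippet passes as a terminator, which is why generation stops at the end of the assistant's reply.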

Use with transformers

See the snippet below for usage with Transformers:

import transformers
import torch

model_id = "abacusai/Dracarys2-Llama-3.1-70B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a data science coding assistant that generates Python code using Pandas and Numpy."},
    {"role": "user", "content": "Write code to select rows from the dataframe `df` having the maximum `temp` for each `city`"},
]

prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])

Evaluation Results

LiveCodeBench

| Model | Code Generation | Code Execution | Test Output Prediction |
| --- | --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 33.44 | 48.26 | 52.10 |
| Meta-Llama-3.1-70B-Instruct | 32.23 | 48.768 | 41.40 |
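
To make the comparison in the table above easier to read, the short sketch below computes the per-task difference between the two models. The scores are copied from the table; the helper name `score_deltas` is hypothetical.

```python
# Scores copied from the LiveCodeBench table above.
scores = {
    "Dracarys2-Llama-3.1-70B-Instruct": {
        "Code Generation": 33.44,
        "Code Execution": 48.26,
        "Test Output Prediction": 52.10,
    },
    "Meta-Llama-3.1-70B-Instruct": {
        "Code Generation": 32.23,
        "Code Execution": 48.768,
        "Test Output Prediction": 41.40,
    },
}


def score_deltas(finetune, base):
    """Return finetune minus base for each task, rounded to 3 decimals."""
    return {task: round(finetune[task] - base[task], 3) for task in finetune}


deltas = score_deltas(
    scores["Dracarys2-Llama-3.1-70B-Instruct"],
    scores["Meta-Llama-3.1-70B-Instruct"],
)
for task, delta in deltas.items():
    # A positive value means the finetune scores higher than the base model.
    print(f"{task}: {delta:+.3f}")
```

On these numbers the largest gain is in test output prediction, while the headline code execution score is slightly lower than the base model's.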

Breakdown of LiveCodeBench CodeGeneration

| Model | Easy | Medium | Hard |
| --- | --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 71.29 | 18.48 | 3.57 |
| Meta-Llama-3.1-70B-Instruct | 68.4 | 17.99 | 3.57 |

Breakdown of LiveCodeBench CodeExecution

| Model | COT | Non-COT |
| --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 75.55 | 48.26 |
| Meta-Llama-3.1-70B-Instruct | 70.14 | 48.768 |

Breakdown of LiveCodeBench TestOutputPrediction

| Model | Easy | Medium | Hard |
| --- | --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 63.53 | 47.30 | 43.61 |
| Meta-Llama-3.1-70B-Instruct | 51.22 | 35.91 | 34.30 |

LiveBench (Aug update)

| Model | Global Average | Coding Average | Reasoning Average | Mathematics Average | Data Analysis Average | Language Average | IF Average |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 47.8 | 36.3 | 47.3 | 38.9 | 46.1 | 41.5 | 76.6 |
| Meta-Llama-3.1-70B-Instruct | 45.1 | 30.7 | 35.3 | 37.0 | 48.4 | 42.1 | 77.2 |