anthracite-org/magnum-v2-72b

This is the seventh (Lucky!) in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. This model is fine-tuned on top of Qwen-2 72B Instruct.

Prompting

The model has been instruct-tuned with ChatML formatting. A typical input looks like this:

"""<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
"""
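As a minimal sketch of how the prompt above can be assembled, the following Python builds the same ChatML string from a list of messages. The function name `build_chatml_prompt` and the message dictionary layout are illustrative assumptions, not part of the model's tooling.

```python
from typing import Dict, List


def build_chatml_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Format messages as ChatML (hypothetical helper, not the authors' code)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

If the tokenizer shipped with the model includes a chat template, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` from the `transformers` library is the usual route instead; its output may differ slightly, for example if the template inserts a default system prompt.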

Credits

This model has been a team effort, and credit goes to all members of Anthracite.

Training

The training was done for 2 epochs. We used 8x AMD Instinct™ MI300X Accelerators for the full-parameter fine-tuning of the model.

We also trained with a weight decay of 0.01 to help stabilize the loss trajectory and mitigate catastrophic forgetting, and used a peak learning rate of 4e-6 to keep the second-epoch loss from dropping too sharply, since a steep drop there is a strong indicator of overfitting.

Sample packing was done to a sequence length of 16k tokens rather than the 8k tokens used in our previous runs.
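To illustrate what sample packing means here, the sketch below greedily concatenates tokenized samples into sequences of at most 16k tokens. The card does not say which packing algorithm or training framework was used, so the greedy strategy, the name `pack_samples`, and reading "16k" as 16,384 are all assumptions.

```python
from typing import List

MAX_LEN = 16_384  # assumption: "16k tokens" taken to mean 16,384


def pack_samples(samples: List[List[int]],
                 max_len: int = MAX_LEN) -> List[List[int]]:
    """Greedily pack token lists into sequences no longer than max_len.

    Illustrative only; real trainers also handle attention masks and
    separator tokens between packed samples.
    """
    packed: List[List[int]] = []
    current: List[int] = []
    for tokens in samples:
        tokens = tokens[:max_len]  # truncate samples longer than one sequence
        if current and len(current) + len(tokens) > max_len:
            # The next sample does not fit, so close the current sequence.
            packed.append(current)
            current = []
        current = current + tokens
    if current:
        packed.append(current)
    return packed


# Dummy token ids; only the lengths matter for this demonstration.
samples = [[0] * 10_000, [1] * 5_000, [2] * 3_000, [3] * 20_000]
packed = pack_samples(samples)
print([len(sequence) for sequence in packed])
```

Doubling the packed length from 8k to 16k lets longer conversations stay in one training sequence instead of being truncated or split.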

Safety

...

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	41.15
IFEval (0-Shot)	75.60
BBH (3-Shot)	57.85
MATH Lvl 5 (4-Shot)	31.65
GPQA (0-shot)	18.12
MuSR (0-shot)	14.18
MMLU-PRO (5-shot)	49.51
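As a quick arithmetic check, the reported average matches the unweighted mean of the six benchmark scores in the table. That the leaderboard computes "Avg." this way is an assumption; the card does not state it, but the numbers agree.

```python
# Scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 75.60,
    "BBH (3-Shot)": 57.85,
    "MATH Lvl 5 (4-Shot)": 31.65,
    "GPQA (0-shot)": 18.12,
    "MuSR (0-shot)": 14.18,
    "MMLU-PRO (5-shot)": 49.51,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 41.15
```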

Safetensors

Model size: 72.7B params
Tensor type: BF16

Model tree for anthracite-org/magnum-v2-72b

Base model: Qwen/Qwen2-72B
Finetuned: Qwen/Qwen2-72B-Instruct
Finetuned: this model

Evaluation results

Benchmark	Metric	Value
IFEval (0-Shot)	strict accuracy	75.600
BBH (3-Shot)	normalized accuracy	57.850
MATH Lvl 5 (4-Shot)	exact match	31.650
GPQA (0-shot)	acc_norm	18.120
MuSR (0-shot)	acc_norm	14.180
MMLU-PRO (5-shot)	accuracy (test set)	49.510

Source: Open LLM Leaderboard
