


Scoring #1 among 34B models on 04-February-2024 with 77.41, outperforming its original base model Smaug-34B-v0.1 😎 Oh, btw.. this one went through SFT, so the abacus inside Smaug is back to normal.. so you can further train/DPO him.. RESET!

UNA: applied UNA only on the attention layers, not on the MLPs.

  • Based on Smaug
  • Trained on the SimpleMath dataset
  • Trained with Axolotl


The goal here is to understand the impact of SimpleMath applied at the attention layers during an SFT session, and how it affects the neural network overall.
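A minimal sketch of what "UNA on the attention only" implies in practice: restricting the trainable parameter set to attention projections while the MLP weights stay frozen. The parameter names below follow the common LLaMA-style naming convention and are illustrative assumptions, not the model's actual internals:

```python
# Hedged sketch: select only attention-projection parameters for training,
# leaving MLP parameters frozen. Names assume LLaMA-style module naming.
def trainable_attention_params(param_names):
    attn_keys = ("q_proj", "k_proj", "v_proj", "o_proj")
    return [n for n in param_names if any(k in n for k in attn_keys)]

names = [
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.self_attn.k_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.mlp.down_proj.weight",
]
# Only the self_attn projections are kept; the mlp weights are filtered out.
print(trainable_attention_params(names))
```

In a real training setup you would set `requires_grad = False` on everything this filter excludes before handing the parameters to the optimizer.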

Results: improved mathematical and reasoning capabilities without degrading or erasing previous training sessions.


Full evaluation is pending, but so far:

|    Task     |Version| Metric |Value            |
|-------------|-------|--------|-----------------|
|arc_challenge|     HF|acc_norm| 0.7457337883959 |
|gsm8k        |     HF|acc     | 0.7247915087187 |
|mmlu         |     HF|acc     | 0.7649553475572 |
|mmlu         |     HF|acc_norm| 0.7681713551647 |
|hellaswag    |     HF|acc_norm| 0.8673571001792 |
|truthfulqa   |     HF|mc2     | 0.7016557407771 |
|winogrande   |     HF|acc     | 0.8382004735595 |

Gains on GSM8K, MMLU, ARC, and Winogrande.
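As a quick sanity check, the headline score can be approximated from the table above as the mean of the six leaderboard tasks. Which metric the leaderboard uses per task (acc vs. acc_norm for MMLU, for instance) is an assumption here, so the result matches 77.41 only up to rounding and metric choice:

```python
# Scores copied from the evaluation table above.
scores = {
    "arc_challenge": 0.7457337883959,   # acc_norm
    "hellaswag": 0.8673571001792,       # acc_norm
    "mmlu": 0.7649553475572,            # acc (assumption; acc_norm gives ~77.43)
    "truthfulqa": 0.7016557407771,      # mc2
    "winogrande": 0.8382004735595,      # acc
    "gsm8k": 0.7247915087187,           # acc
}

# Leaderboard-style average over the six tasks, in percent.
average = 100 * sum(scores.values()) / len(scores)
print(round(average, 1))  # ~77.4, consistent with the 77.41 headline
```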


Thanks to abacusai for making Smaug-34B, the Bagel, and all the magic behind the base model.

If you use the model, please provide a citation, even for merges or derivatives. And enjoy our ModelSimilarities detector tool: https://github.com/fblgit/model-similarity

Model: fblgit/UNA-SimpleSmaug-34b-v1beta (34.4B params)