---
license: afl-3.0
library_name: transformers
tags:
- UNA
- juanako
datasets:
- jondurbin/py-dpo-v0.1
- Replete-AI/code_bagel_hermes-2.5
- mlabonne/orpo-dpo-mix-40k
model-index:
- name: UNA-ThePitbull-21.4B-v2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 77.73
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 91.79
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 68.25
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 78.24
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 87.37
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 63.53
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=fblgit/UNA-ThePitbull-21.4B-v2
      name: Open LLM Leaderboard
---
# UNA-ThePitbull 21.4B v2
Introducing the best LLM in the industry: nearly as good as a 70B, at just 21.4B parameters. Based on `saltlux/luxia-21.4b-alignment-v1.0`.
![UNA - ThePitbull 21.4B v2](https://huggingface.co/fblgit/UNA-ThePitbull-21.4B-v2/resolve/main/DE-UNA-ThePitbull-21.4B-v2.png)
This model has not been poisoned to score high while being useless. We release him because he is the real deal: EQ and IQ together in a crazy powerful, smart, and conversational model.
Quant versions are available at [bartowski/UNA-ThePitbull-21.4B-v2-GGUF](https://huggingface.co/bartowski/UNA-ThePitbull-21.4B-v2-GGUF)
## Difference V1 vs V2
In V2 we implemented a different UNA strategy and partially covered the MLPs and attention layers.
We also performed further SFT and further DPO over V1, and we'll release some of those soon as well.
### Changes
1. SFT over V1 with `Replete-AI/code_bagel_hermes-2.5`, learning rate from 1.0e-4 down to 5.0e-5, for 1 epoch
2. DPO, learning rate from 1.0e-4 down to min_lr 5.0e-5, for 1 epoch, with:
   * `mlabonne/orpo-dpo-mix-40k`
   * `jondurbin/py-dpo-v0.1`
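
The card gives only the endpoints of the learning-rate schedule (1.0e-4 down to 5.0e-5 over one epoch), not its shape. As a rough illustration, the sketch below assumes a cosine decay with a floor. The cosine shape, the `lr_at` helper, and the step counts are all assumptions made for this example, not the actual training configuration.

```python
import math

PEAK_LR = 1.0e-4  # starting learning rate stated in the card
MIN_LR = 5.0e-5   # final learning rate (min_lr) stated in the card


def lr_at(step: int, total_steps: int,
          peak: float = PEAK_LR, floor: float = MIN_LR) -> float:
    """Hypothetical cosine decay from `peak` to `floor` over `total_steps`.

    The decay shape is an assumption; the card only gives the two endpoints.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps past the end stay at the floor.
    progress = min(max(step / total_steps, 0.0), 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return floor + (peak - floor) * cosine


if __name__ == "__main__":
    total = 1000  # illustrative number of optimizer steps in one epoch
    for step in (0, 250, 500, 750, 1000):
        print(f"step {step:4d}: lr = {lr_at(step, total):.2e}")
```

Under this assumption the rate starts at the peak, passes through the midpoint of the two values halfway through the epoch, and ends at the floor.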
# Evaluations
## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_fblgit__UNA-ThePitbull-21.4B-v2)
| Metric |Value|
|---------------------------------|----:|
|Avg. |77.82|
|AI2 Reasoning Challenge (25-Shot)|77.73|
|HellaSwag (10-Shot) |91.79|
|MMLU (5-Shot) |68.25|
|TruthfulQA (0-shot) |78.24|
|Winogrande (5-shot) |87.37|
|GSM8k (5-shot) |63.53|
It can only be compared with its non-UNA base model, the original luxia-21.4b, and with ThePitbull-v1.
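
The `Avg.` row is the unweighted mean of the six benchmark scores in the table. A minimal sketch that reproduces it, with the scores copied from the table and illustrative variable names:

```python
# Scores copied from the Open LLM Leaderboard table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 77.73,
    "HellaSwag (10-Shot)": 91.79,
    "MMLU (5-Shot)": 68.25,
    "TruthfulQA (0-shot)": 78.24,
    "Winogrande (5-shot)": 87.37,
    "GSM8k (5-shot)": 63.53,
}

# Unweighted mean, rounded to two decimals as in the table.
average = round(sum(scores.values()) / len(scores), 2)
print(f"Avg.: {average:.2f}")
```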
## UNA v2 (VLLM) Evaluations
```
vllm (pretrained=/data/tools/mergekit/una-thepitbull-v5,dtype=bfloat16,gpu_memory_utilization=0.8,max_model_len=2048,data_parallel_size=2,tensor_parallel_size=4), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
| Tasks |Version| Filter |n-shot| Metric |Value | |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k | 3|strict-match | 5|exact_match|0.7695|± |0.0116|+
| | |flexible-extract| 5|exact_match|0.7695|± |0.0116|+
|hellaswag | 1|none | 10|acc |0.8110|± |0.0039|
| | |none | 10|acc_norm |0.9169|± |0.0028|+
|winogrande | 1|none | 5|acc |0.8777|± |0.0092|+
|mmlu |N/A |none | 0|acc |0.6427|± |0.0038|-
|arc_challenge | 1|none | 25|acc |0.7713|± |0.0123|
| | |none | 25|acc_norm |0.7875|± |0.0120|+
|truthfulqa_mc2| 2|none | 0|acc |0.7824|± |0.0135|-
|mathqa | 1|none | 0|acc |0.4037|± | 0.009|
| | |none | 0|acc_norm |0.4034|± | 0.009|+
|pubmedqa | 1|none | 0|acc |0.7260|± | 0.020|+
|boolq | 2|none | 0|acc |0.8602|± |0.0061|+
```
## UNA v1 (VLLM) Evaluations
```
| Tasks |Version| Filter |n-shot| Metric |Value | |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k | 3|strict-match | 5|exact_match|0.7566|± |0.0118|
| | |flexible-extract| 5|exact_match|0.7582|± |0.0118|
|hellaswag | 1|none | 10|acc |0.8168|± |0.0039|
| | |none | 10|acc_norm |0.9188|± |0.0027|
|winogrande | 1|none | 5|acc |0.8635|± |0.0097|
|mmlu | N/A|none | 0|acc |0.6444|± |0.0038|
|arc_challenge | 1|none | 25|acc |0.7747|± |0.0122|
| | |none | 25|acc_norm |0.7850|± |0.0120|
|truthfulqa_mc2| 2|none | 0|acc |0.7902|± |0.0134|
|mathqa | 1|none | 0|acc |0.4030|± | 0.009|
| | |none | 0|acc_norm |0.4034|± | 0.009|
|pubmedqa | 1|none | 0|acc |0.6860|± |0.0208|
|boolq | 2|none | 0|acc |0.8401|± |0.0064|
```
## Original (VLLM) Evaluations
```
| Tasks |Version| Filter |n-shot| Metric |Value | |Stderr|
|--------------|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k | 3|strict-match | 5|exact_match|0.7528|± |0.0119|
| | |flexible-extract| 5|exact_match|0.7521|± |0.0119|
|hellaswag | 1|none | 10|acc |0.8117|± |0.0039|
| | |none | 10|acc_norm |0.9167|± |0.0028|
|winogrande | 1|none | 5|acc |0.8682|± |0.0095|
|mmlu | N/A|none | 0|acc |0.6448|± |0.0038|
|arc_challenge | 1|none | 25|acc |0.7688|± |0.0123|
| | |none | 25|acc_norm |0.7730|± |0.0122|
|truthfulqa_mc2| 2|none | 0|acc |0.7895|± |0.0133|
|mathqa | 1|none | 0|acc |0.4000|± | 0.009|
| | |none | 0|acc_norm |0.4003|± | 0.009|
|pubmedqa | 1|none | 0|acc |0.6680|± |0.0211|
|boolq | 2|none | 0|acc |0.8346|± |0.0065|
```
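
The trailing `+` and `-` markers in the UNA v2 table are not explained in the card. They appear consistent with the sign of the change relative to the Original run, which the sketch below checks for the marked rows. The values are copied from the two tables; the dictionary keys and the `marker` helper are illustrative names.

```python
# Marked rows from the UNA v2 table and the matching rows from the Original table.
v2 = {
    "gsm8k/strict-match": 0.7695,
    "hellaswag/acc_norm": 0.9169,
    "winogrande/acc": 0.8777,
    "mmlu/acc": 0.6427,
    "arc_challenge/acc_norm": 0.7875,
    "truthfulqa_mc2/acc": 0.7824,
    "mathqa/acc_norm": 0.4034,
    "pubmedqa/acc": 0.7260,
    "boolq/acc": 0.8602,
}
original = {
    "gsm8k/strict-match": 0.7528,
    "hellaswag/acc_norm": 0.9167,
    "winogrande/acc": 0.8682,
    "mmlu/acc": 0.6448,
    "arc_challenge/acc_norm": 0.7730,
    "truthfulqa_mc2/acc": 0.7895,
    "mathqa/acc_norm": 0.4003,
    "pubmedqa/acc": 0.6680,
    "boolq/acc": 0.8346,
}


def marker(delta: float, eps: float = 1e-9) -> str:
    """Return '+' for a gain, '-' for a loss, '=' for no change."""
    if delta > eps:
        return "+"
    if delta < -eps:
        return "-"
    return "="


# Round to four decimals to match the precision reported in the tables.
deltas = {task: round(v2[task] - original[task], 4) for task in v2}
markers = {task: marker(delta) for task, delta in deltas.items()}

for task, delta in deltas.items():
    print(f"{task:24s} {delta:+.4f} {markers[task]}")
```

Several of these differences are smaller than the reported standard errors, so the markers describe the direction of the change only, not its statistical significance.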
## Citations
* mlabonne
* jondurbin & Replete-AI
* bartowski
* saltlux
If you use UNA models, don't forget to cite:
```
@misc{unathepitbull21b,
title={ThePitbull: Uniform Neural Alignment},
author={Xavier Murias},
year={2024},
publisher = {Juanako.AI},
journal = {HuggingFace repository},
howpublished = {\url{https://huggingface.co/fblgit/UNA-ThePitbull-21.4-v1}},
}
```