Quantization made by Richard Erkhov.

UNA-TheBeagle-7b-v1 - GGUF

Model creator: https://huggingface.co/fblgit/
Original model: https://huggingface.co/fblgit/UNA-TheBeagle-7b-v1/

Name	Quant method	Size
UNA-TheBeagle-7b-v1.Q2_K.gguf	Q2_K	2.53GB
UNA-TheBeagle-7b-v1.IQ3_XS.gguf	IQ3_XS	2.81GB
UNA-TheBeagle-7b-v1.IQ3_S.gguf	IQ3_S	2.96GB
UNA-TheBeagle-7b-v1.Q3_K_S.gguf	Q3_K_S	2.95GB
UNA-TheBeagle-7b-v1.IQ3_M.gguf	IQ3_M	3.06GB
UNA-TheBeagle-7b-v1.Q3_K.gguf	Q3_K	3.28GB
UNA-TheBeagle-7b-v1.Q3_K_M.gguf	Q3_K_M	3.28GB
UNA-TheBeagle-7b-v1.Q3_K_L.gguf	Q3_K_L	3.56GB
UNA-TheBeagle-7b-v1.IQ4_XS.gguf	IQ4_XS	3.67GB
UNA-TheBeagle-7b-v1.Q4_0.gguf	Q4_0	3.83GB
UNA-TheBeagle-7b-v1.IQ4_NL.gguf	IQ4_NL	3.87GB
UNA-TheBeagle-7b-v1.Q4_K_S.gguf	Q4_K_S	3.86GB
UNA-TheBeagle-7b-v1.Q4_K.gguf	Q4_K	4.07GB
UNA-TheBeagle-7b-v1.Q4_K_M.gguf	Q4_K_M	4.07GB
UNA-TheBeagle-7b-v1.Q4_1.gguf	Q4_1	4.24GB
UNA-TheBeagle-7b-v1.Q5_0.gguf	Q5_0	4.65GB
UNA-TheBeagle-7b-v1.Q5_K_S.gguf	Q5_K_S	4.65GB
UNA-TheBeagle-7b-v1.Q5_K.gguf	Q5_K	4.78GB
UNA-TheBeagle-7b-v1.Q5_K_M.gguf	Q5_K_M	4.78GB
UNA-TheBeagle-7b-v1.Q5_1.gguf	Q5_1	5.07GB
UNA-TheBeagle-7b-v1.Q6_K.gguf	Q6_K	5.53GB
UNA-TheBeagle-7b-v1.Q8_0.gguf	Q8_0	7.17GB

Original model description:

license: cc-by-nc-nd-4.0 tags: - generated_from_trainer model-index: - name: UNA-TheBeagle-7b-v1 results: [] datasets: - jondurbin/bagel-v0.3 library_name: transformers

 -- In the Love Memory of my "LoLa" --

UNA-TheBeagle-7b-v1

TheBeagle, a model of 7B parameters trained on The Bagel dataset. DPO & UNA applied over a set of curated DPO Pairs.

Scored #1 on the HF Leaderboard, dramatic scores!!! 73 ARC, and very well balanced!

The dataset was generated using the original bagel code, including the decontamination step.

As base model, we used the latest Intel's neural-chat model.

It performs very good in many tasks, but its always better that you play with it by yourself.

Evaluations

Ran with VLLM so expect them to dont be exactly as the one's shown in the board, but not too far :)

vllm (pretrained=fblgit/UNA-TheBeagle-7b-v1,dtype=auto,tensor_parallel_size=1,gpu_memory_utilization=0.8,data_parallel_size=8,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 32
|    Tasks     |Version|  Filter  |n-shot|  Metric   |Value |   |Stderr|
|--------------|-------|----------|-----:|-----------|-----:|---|-----:|
|arc_challenge |Yaml   |none      |    25|acc        |0.7090|±  |0.0133|
|              |       |none      |    25|acc_norm   |0.7329|±  |0.0129|
|gsm8k         |Yaml   |get-answer|     5|exact_match|0.7210|±  |0.0124|
|hellaswag     |Yaml   |none      |    10|acc        |0.7202|±  |0.0045|
|              |       |none      |    10|acc_norm   |0.8792|±  |0.0033|
|truthfulqa_mc2|Yaml   |none      |     0|acc        |0.7062|±  |0.0151|
|winogrande    |Yaml   |none      |     5|acc        |0.8366|±  |0.0104|

UNA Details

For this release, we only applied UNA thru the perceptrons. It was done at a 3.5e-7 speed, and the training loop code is also the original one of the bagel and transformers-4.35.2-UNA

Prompt

Im not entirely sure of it, as we used the vanilla version of the bagel training code. But a good model should be able to generalize with different prompt formats, so feel free to give it a shot.

Citations

Remember if you use UNA's models, cite it in your model card.

Limitations

Not for commercial use, and only for academic & research purposes.