metadata
base_model: appvoid/palmer-x-002
datasets:
- appvoid/no-prompt-15k
inference: false
language:
- en
license: apache-2.0
model_creator: appvoid
model_name: palmer-x-002
pipeline_tag: text-generation
quantized_by: afrideva
tags:
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
appvoid/palmer-x-002-GGUF
Quantized GGUF model files for palmer-x-002 from appvoid
Name | Quant method | Size |
---|---|---|
palmer-x-002.fp16.gguf | fp16 | 2.20 GB |
palmer-x-002.q2_k.gguf | q2_k | 483.12 MB |
palmer-x-002.q3_k_m.gguf | q3_k_m | 550.82 MB |
palmer-x-002.q4_k_m.gguf | q4_k_m | 668.79 MB |
palmer-x-002.q5_k_m.gguf | q5_k_m | 783.02 MB |
palmer-x-002.q6_k.gguf | q6_k | 904.39 MB |
palmer-x-002.q8_0.gguf | q8_0 | 1.17 GB |
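As a rough aid for choosing among the files above, the sketch below picks the largest file that fits a given size budget. The helper names `pick_quant` and `gguf_filename` are hypothetical, the sizes are copied from the table, and GB values are converted assuming 1 GB = 1024 MB. File size is only a proxy: actual memory use at inference time will be higher.

```python
from typing import Optional

# File sizes from the table above, in MB.
# Assumption: the GB entries use binary units (1 GB = 1024 MB).
QUANT_SIZES_MB = {
    "q2_k": 483.12,
    "q3_k_m": 550.82,
    "q4_k_m": 668.79,
    "q5_k_m": 783.02,
    "q6_k": 904.39,
    "q8_0": 1.17 * 1024,
    "fp16": 2.20 * 1024,
}


def pick_quant(budget_mb: float) -> Optional[str]:
    """Return the quant method of the largest listed file that fits budget_mb.

    Returns None when even the smallest file exceeds the budget.
    """
    fitting = {q: size for q, size in QUANT_SIZES_MB.items() if size <= budget_mb}
    if not fitting:
        return None
    # Larger files generally keep more precision, so prefer the largest fit.
    return max(fitting, key=fitting.get)


def gguf_filename(quant: str) -> str:
    """Build the file name following the naming pattern used in the table."""
    return f"palmer-x-002.{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(700)
    if choice is not None:
        print(gguf_filename(choice))
```

With a 700 MB budget this selects the q4_k_m file, since the q5_k_m file (783.02 MB) no longer fits.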
Original Model Card:
x-002
This is an incremental update to palmer-002, trained with DPO (direct preference optimization).
The "x" denotes a DPO+SFT spinoff.
evaluation
Model | ARC_C | HellaSwag | PIQA | Winogrande |
---|---|---|---|---|
tinyllama-2t | 0.2807 | 0.5463 | 0.7067 | 0.5683 |
palmer-001 | 0.2807 | 0.5524 | 0.7106 | 0.5896 |
tinyllama-2.5t | 0.3191 | 0.5896 | 0.7307 | 0.5872 |
palmer-002 | 0.3242 | 0.5956 | 0.7345 | 0.5888 |
palmer-x-002 | 0.3224 | 0.5941 | 0.7383 | 0.5912 |
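To make the comparison with palmer-002 easier to read, the sketch below computes per-benchmark differences and an unweighted mean from the two relevant rows of the table. The helper names `deltas` and `mean_score` are hypothetical, and the unweighted mean is only an illustrative summary, not a metric reported by the author.

```python
# Scores copied from the evaluation table above.
SCORES = {
    "palmer-002": {
        "ARC_C": 0.3242, "HellaSwag": 0.5956, "PIQA": 0.7345, "Winogrande": 0.5888,
    },
    "palmer-x-002": {
        "ARC_C": 0.3224, "HellaSwag": 0.5941, "PIQA": 0.7383, "Winogrande": 0.5912,
    },
}


def deltas(new: str, base: str) -> dict:
    """Per-benchmark score difference (new minus base), rounded to 4 decimals."""
    return {
        task: round(SCORES[new][task] - SCORES[base][task], 4)
        for task in SCORES[new]
    }


def mean_score(model: str) -> float:
    """Unweighted mean over the four benchmarks (an illustrative summary only)."""
    values = SCORES[model].values()
    return round(sum(values) / len(values), 4)


if __name__ == "__main__":
    for task, diff in deltas("palmer-x-002", "palmer-002").items():
        print(f"{task}: {diff:+.4f}")
    print("mean palmer-002:  ", mean_score("palmer-002"))
    print("mean palmer-x-002:", mean_score("palmer-x-002"))
```

The differences are small in both directions: ARC_C and HellaSwag drop slightly, while PIQA and Winogrande rise slightly.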
training
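Trained on ~500 DPO samples as experimental data to check for improvements. The data appears to help on some benchmarks while hurting on others: relative to palmer-002, PIQA and Winogrande improve (0.7345 to 0.7383 and 0.5888 to 0.5912), while ARC_C and HellaSwag drop slightly (0.3242 to 0.3224 and 0.5956 to 0.5941).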
prompt
no prompt
As you can see, when given no prompt the model by default completes with the questions most likely to be asked, which is useful because most people will use it as a chatbot.