---
license: apache-2.0
datasets:
- argilla/distilabel-intel-orca-dpo-pairs
base_model: sethuiyer/Chikuma_10.7B
library_name: transformers
pipeline_tag: text-generation
tags:
- dpo
---
# Chikuma_10.7B - V2 (Enhanced with DPO) [For Experiments]
<p align="center">
<img src="https://huggingface.co/sethuiyer/distilabled_Chikuma_10.7B/resolve/main/chikuma_v2.webp" height="256px" alt="Chikuma">
</p>
This model is the **DPO fine-tuned version** of [Chikuma_10.7B](https://huggingface.co/sethuiyer/Chikuma_10.7B), which was a depth-upscaled merge of:
* [sethuiyer/SynthIQ-7b](https://huggingface.co/sethuiyer/SynthIQ-7b)
* [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)
The name "Chikuma" is inspired by the [Chikuma River](https://en.wikipedia.org/wiki/Shinano_River), the longest in Japan, known for its continuous flow and meandering path.
This metaphorically represents the model's depth, fluidity, and adaptability in processing and understanding language.
## Dataset Used for Fine-Tuning
Dataset: [`argilla/distilabel-intel-orca-dpo-pairs`](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs)
The dataset comes to roughly 3,000 samples, but they are high quality (according to `chosen_score`).
The following filters were applied to the original dataset:
```python
from datasets import load_dataset

# Load the preference pairs and keep only high-confidence, non-tied pairs
# that are not part of the GSM8K train split.
dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")
dataset = dataset.filter(
    lambda r:
        r["status"] != "tie" and
        r["chosen_score"] >= 8 and
        not r["in_gsm8k_train"]
)
```
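For DPO, these rows then have to be arranged into the prompt/chosen/rejected triples the trainer expects. The sketch below is only an assumed illustration of that step (the `system`, `input`, `chosen`, and `rejected` columns come from the Argilla dataset; the prompt formatting mirrors the chat template described in the next section, and the exact preprocessing used for this model lives in the training notebook linked under Training Environment):

```python
# Assumed preprocessing sketch: wrap each preference pair into the
# prompt/chosen/rejected format a DPO trainer consumes.
def to_dpo_format(row):
    prompt = (
        f"<|im_start|>GPT4 Correct system:\n{row['system']}<|im_end|>\n"
        f"<|im_start|>GPT4 Correct user:\n{row['input']}<|im_end|>\n"
        "<|im_start|>GPT4 Correct Assistant:\n"
    )
    return {
        "prompt": prompt,
        "chosen": row["chosen"] + "<|im_end|>",
        "rejected": row["rejected"] + "<|im_end|>",
    }

dataset = dataset.map(to_dpo_format)
```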
## Chat Template
The chat template for Chikuma_10.7B - V2 is a modified version of ChatML, optimized for improved interaction and engagement:
```
<|im_start|>GPT4 Correct system:
{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>
<|im_start|>GPT4 Correct user:
{user}<|im_end|>
<|im_start|>GPT4 Correct Assistant:
{assistant}<|im_end|>
```
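If you want `tokenizer.apply_chat_template` to emit exactly this layout, the template can also be registered as a Jinja string. This is a minimal sketch only; the tokenizer shipped with the model may already define an equivalent `chat_template`, and the role handling below is an assumption based on the format shown above:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sethuiyer/Chikuma_10.7B_v2")

# Approximation of the modified ChatML layout above as a Jinja chat template.
tokenizer.chat_template = (
    "{% for message in messages %}"
    "{{ '<|im_start|>GPT4 Correct ' + message['role'] + ':\n' + message['content'] + '<|im_end|>\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<|im_start|>GPT4 Correct Assistant:\n' }}{% endif %}"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot. Always use <|end_of_turn|> when you want to end the answer."},
    {"role": "user", "content": "Who invented LLMs?"},
]
print(tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False))
```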
## Nous Benchmark Evaluation
| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|-------------------------------|---------|---------|------------|----------|---------|
| SynthIQ-7b | 42.67 | 73.71 | 56.51 | 44.59 | 54.37 |
| openchat/openchat-3.5-0106 | **44.17** | 73.72 | 52.53 | 44.4 | 53.71 |
| Chikuma_10.7B | 42.41 | 73.41 | 56.69 | 43.5 | 54.00 |
| **Chikuma_10.7B_v2** | 42.77 | **73.81** | **58.83** | **44.83** | **55.06** |
## OpenLLM Leaderboard
| Benchmark Name | Performance |
|----------------|-------------|
| ARC | 66.38 |
| HellaSwag | 85 |
| MMLU | 65.27 |
| TruthfulQA | 58.83 |
| Winogrande | 78.77 |
| GSM8K | 63.68 |
| **Average** | **69.65** |
## Training Environment
- Hardware: a single A100 80GB GPU on RunPod, used for approximately 1.5 hours.
- Training Script: Accessible via this [Google Colab Notebook](https://colab.research.google.com/drive/15iFBr1xWgztXvhrj5I9fBv20c7CFOPBE?usp=sharing). Special thanks to [mlabonne](https://huggingface.co/mlabonne) for providing the template; a rough sketch of such a run is given below.
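The outline below is purely illustrative: a sketch of what a single-GPU LoRA + DPO run along these lines can look like with `trl` (assuming an older release whose `DPOTrainer` still accepts `beta` and `tokenizer` directly). None of the hyperparameters are the ones actually used; see the notebook above for those.

```python
# Illustrative only: rough outline of a LoRA + DPO run with trl.
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_model = "sethuiyer/Chikuma_10.7B"  # the base model being fine-tuned
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model)

peft_config = LoraConfig(  # assumed LoRA settings, not the exact ones used
    r=16, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="chikuma_dpo",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    num_train_epochs=1,
    bf16=True,
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,            # trl builds the frozen reference model when omitted
    args=training_args,
    beta=0.1,                  # strength of the KL penalty against the reference model
    train_dataset=dataset,     # the filtered prompt/chosen/rejected dataset prepared above
    tokenizer=tokenizer,
    peft_config=peft_config,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```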
## Usage
```python
import transformers
from transformers import AutoTokenizer

model_name = "sethuiyer/Chikuma_10.7B_v2"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Create a text-generation pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer,
    device="cuda"
)

# Format the prompt with the chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "Who invented LLMs?"}
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# Generate text
sequences = pipeline(
    prompt,
    max_new_tokens=512
)
print(sequences[0]['generated_text'])
```
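Because the template asks the model to close its answers with `<|end_of_turn|>`, you may also want generation to stop at that marker. A small optional addition (assuming the token is present in the tokenizer vocabulary, as it is for the OpenChat base):

```python
# Continuing from the snippet above: also stop generation at the end-of-turn marker.
end_of_turn_id = tokenizer.convert_tokens_to_ids("<|end_of_turn|>")
sequences = pipeline(
    prompt,
    max_new_tokens=512,
    eos_token_id=[tokenizer.eos_token_id, end_of_turn_id],
)
print(sequences[0]["generated_text"])
```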
## Acknowledgements
A heartfelt appreciation goes to the vibrant open-source community, particularly:
* The Intel team, for publishing a great open dataset and showing how well it worked in the first place.
* Teknium and NousResearch for their awesome work and models.
* Maxime for sharing such great resources.
* Argilla, for publishing [argilla/distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs).