|
---
license: apache-2.0
datasets:
- argilla/distilabel-intel-orca-dpo-pairs
base_model: sethuiyer/Chikuma_10.7B
library_name: transformers
pipeline_tag: text-generation
tags:
- dpo
---
|
|
|
# Chikuma_10.7B - V2 (Enhanced with DPO) [For Experiments] |
|
|
|
<p align="center"> |
|
<img src="https://huggingface.co/sethuiyer/distilabled_Chikuma_10.7B/resolve/main/chikuma_v2.webp" height="256px" alt="Chikuma"> |
|
</p> |
|
|
|
|
|
This model is the **DPO fine-tuned version** of [Chikuma_10.7B](https://huggingface.co/sethuiyer/Chikuma_10.7B), which was a depth-upscaled merge of:
|
* [sethuiyer/SynthIQ-7b](https://huggingface.co/sethuiyer/SynthIQ-7b) |
|
* [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106) |
|
|
|
The name "Chikuma" is inspired by the [Chikuma River](https://en.wikipedia.org/wiki/Shinano_River), the longest in Japan, known for its continuous flow and meandering path. |
|
This metaphorically represents the model's depth, fluidity, and adaptability in processing and understanding language. |
|
|
|
|
|
## Dataset Used for Fine-Tuning

Dataset: [`argilla/distilabel-intel-orca-dpo-pairs`](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs)
|
|
|
Only roughly 3,000 samples were used, but they were of high quality according to the `chosen_score` field.
|
|
|
The following filters were applied to the original dataset: |
|
```python
# Keep only non-tied pairs with a high chosen_score that are not in the GSM8K train split
dataset = dataset.filter(
    lambda r: r["status"] != "tie"
    and r["chosen_score"] >= 8
    and not r["in_gsm8k_train"]
)
```
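
For context, here is a minimal end-to-end sketch of loading the dataset from the Hub and applying the same filter, assuming the Hugging Face `datasets` library (the column names follow the published dataset):

```python
from datasets import load_dataset

# Load the DPO preference pairs published by Argilla
dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")

# Apply the same quality filter as above
dataset = dataset.filter(
    lambda r: r["status"] != "tie"
    and r["chosen_score"] >= 8
    and not r["in_gsm8k_train"]
)

print(len(dataset))  # roughly 3,000 examples remain
```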
|
|
|
## Chat Template

The chat template for Chikuma_10.7B - V2 is a modified version of ChatML, combining ChatML's `<|im_start|>`/`<|im_end|>` markers with OpenChat-style "GPT4 Correct" role names:
|
|
|
```
<|im_start|>GPT4 Correct system:
{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>
<|im_start|>GPT4 Correct user:
{user}<|im_end|>
<|im_start|>GPT4 Correct Assistant:
{assistant}<|im_end|>
```
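
As a rough illustration of how this template expands, the sketch below renders a single-turn prompt by hand; the `format_prompt` helper is hypothetical and not part of the model's tokenizer (in practice, `tokenizer.apply_chat_template`, shown in the Usage section below, does this for you):

```python
def format_prompt(system: str, user: str) -> str:
    """Render a single-turn prompt in the Chikuma_10.7B - V2 chat template."""
    return (
        f"<|im_start|>GPT4 Correct system:\n"
        f"{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>\n"
        f"<|im_start|>GPT4 Correct user:\n"
        f"{user}<|im_end|>\n"
        f"<|im_start|>GPT4 Correct Assistant:\n"
    )

print(format_prompt("You are a helpful assistant chatbot.", "Who invented LLMs?"))
```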
|
|
|
## Nous Benchmark Evaluation |
|
| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|-------------------------------|-----------|-----------|------------|-----------|-----------|
| SynthIQ-7b | 42.67 | 73.71 | 56.51 | 44.59 | 54.37 |
| openchat/openchat-3.5-0106 | **44.17** | 73.72 | 52.53 | 44.40 | 53.71 |
| Chikuma_10.7B | 42.41 | 73.41 | 56.69 | 43.50 | 54.00 |
| **Chikuma_10.7B_v2** | 42.77 | **73.81** | **58.83** | **44.83** | **55.06** |
|
|
|
## OpenLLM Leaderboard

| Benchmark Name | Performance |
|----------------|-------------|
| ARC | 66.38 |
| HellaSwag | 85 |
| MMLU | 65.27 |
| TruthfulQA | 58.83 |
| Winogrande | 78.77 |
| GSM8K | 63.68 |
| **Average** | **69.65** |
|
|
|
|
|
## Training Environment

- Hardware: A single A100 80GB GPU on RunPod, used for approximately 1.5 hours.
- Training Script: Accessible via [Google Colab Notebook](https://colab.research.google.com/drive/15iFBr1xWgztXvhrj5I9fBv20c7CFOPBE?usp=sharing); a rough sketch of this kind of DPO setup is shown below. Special thanks to [mlabonne](https://huggingface.co/mlabonne) for providing the template.
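
The exact settings are in the linked notebook; the snippet below is only an illustrative sketch of this kind of DPO run, assuming TRL ~0.7 and placeholder hyperparameters rather than the values actually used for this model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_model = "sethuiyer/Chikuma_10.7B"  # the model that was DPO fine-tuned
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# `dpo_dataset` is assumed to be the filtered preference data from above, already
# mapped to the "prompt" / "chosen" / "rejected" columns that DPOTrainer expects.
training_args = TrainingArguments(
    output_dir="./chikuma-dpo",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=5e-6,          # placeholder value
    num_train_epochs=1,
    bf16=True,
)

trainer = DPOTrainer(
    model,
    ref_model=None,              # TRL creates a frozen reference copy when None
    args=training_args,
    beta=0.1,                    # strength of the implicit KL penalty in the DPO loss
    train_dataset=dpo_dataset,
    tokenizer=tokenizer,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```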
|
|
|
|
|
## Usage |
|
|
|
```python
import transformers
from transformers import AutoTokenizer

model_name = "sethuiyer/distilabled_Chikuma_10.7B"

# Load the tokenizer, which carries the chat template shown above
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Create a text-generation pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer,
    device="cuda",
)

# Build the prompt from a chat-style message list
messages = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "Who invented LLMs?"},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

# Generate text
sequences = pipeline(prompt, max_new_tokens=512)
print(sequences[0]["generated_text"])
```
|
|
|
## Acknowledgements |
|
|
|
A heartfelt appreciation goes to the vibrant open-source community, particularly: |
|
|
|
* The Intel team for publishing a great open dataset and showing how well it works.
|
* Teknium and NousResearch for their awesome work and models. |
|
* Maxime for sharing such great resources. |
|
* Argilla for publishing [argilla/distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs).