sethuiyer/Chikuma_10.7B_v2

sethuiyer committed on Jan 13

Commit 0ccd03f • 1 Parent(s): f6480b6

Create README.md

Files changed (1)

README.md +93 -0

README.md ADDED

	@@ -0,0 +1,93 @@

+---
+license: apache-2.0
+datasets:
+- argilla/distilabel-intel-orca-dpo-pairs
+library_name: transformers
+pipeline_tag: text-generation
+---
+# Chikuma_10.7B - V2
+This model is a DPO fine-tune of [Chikuma_10.7B](https://huggingface.co/sethuiyer/Chikuma_10.7B) using [argilla/distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs).
+## Dataset
+Dataset: `argilla/distilabel-intel-orca-dpo-pairs`
+The dataset had roughly 3,000 samples, but they were high quality (according to `chosen_score`).
+The following filters were applied to the original dataset:
+```python
+dataset = dataset.filter(
+    lambda r:
+        r["status"] != "tie" and
+        r["chosen_score"] >= 8 and
+        not r["in_gsm8k_train"]
+)
+```
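To make the effect of each condition concrete, here is a minimal sketch that applies the same predicate to a few plain dictionaries. The records and their values are made up for illustration; only the three field names come from the filter above, and `keep` is a hypothetical helper name.

```python
# Made-up records for illustration; only the field names come from the filter above.
records = [
    {"status": "unchanged", "chosen_score": 9.0, "in_gsm8k_train": False},   # kept
    {"status": "tie", "chosen_score": 9.0, "in_gsm8k_train": False},         # dropped: tie
    {"status": "unchanged", "chosen_score": 7.0, "in_gsm8k_train": False},   # dropped: score below 8
    {"status": "unchanged", "chosen_score": 10.0, "in_gsm8k_train": True},   # dropped: overlaps GSM8K train
]


def keep(r):
    """Same predicate as the dataset filter: no ties, score >= 8, not in GSM8K train."""
    return (
        r["status"] != "tie"
        and r["chosen_score"] >= 8
        and not r["in_gsm8k_train"]
    )


kept = [r for r in records if keep(r)]
print(len(kept), "of", len(records), "records kept")
```

A record has to pass all three conditions, so a high score alone is not enough to keep a tie or a GSM8K-train example.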
+## Chat Template
+I decided to go with a slight modification of ChatML.
+```
+<|im_start|>GPT4 Correct system:
+{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>
+<|im_start|>GPT4 Correct user:
+{user}<|im_end|>
+<|im_start|>GPT4 Correct Assistant:
+{assistant}<|im_end|>
+```
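As a rough sketch, the template above can be filled in by hand with plain string formatting. `format_prompt` and `SYSTEM_SUFFIX` are hypothetical names, and the exact whitespace is an assumption read off the template as printed; the chat template shipped with the tokenizer may differ, so prefer `tokenizer.apply_chat_template` when it is available.

```python
# Suffix that the template above appends to every system message.
SYSTEM_SUFFIX = " Always use <|end_of_turn|> when you want to end the answer. "


def format_prompt(system: str, user: str) -> str:
    """Fill in the template and stop at the open assistant turn,
    so the model writes the assistant's answer."""
    return (
        "<|im_start|>GPT4 Correct system:\n"
        f"{system}{SYSTEM_SUFFIX}<|im_end|>\n"
        "<|im_start|>GPT4 Correct user:\n"
        f"{user}<|im_end|>\n"
        "<|im_start|>GPT4 Correct Assistant:\n"
    )


prompt = format_prompt(
    "You are a helpful assistant chatbot.",
    "What is a large language model?",
)
print(prompt)
```

The assistant turn is left open on purpose: for generation, the prompt ends right after the assistant header and the model supplies the rest.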
+## Training Hardware
+I used 1 x A100 80GB on RunPod for about 1.5 hours.
+## Usage
+```python
+import transformers
+from transformers import AutoTokenizer
+# Hub repository ID of this model (as shown in the page header)
+new_model = "sethuiyer/Chikuma_10.7B_v2"
+tokenizer = AutoTokenizer.from_pretrained(new_model)
+# Create pipeline
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=new_model,
+    tokenizer=tokenizer,
+    device="cuda"
+)
+# Format prompt with the tokenizer's chat template
+message = [
+    {"role": "system", "content": "You are a helpful assistant chatbot. Always use <|end_of_turn|> when you want to end the answer."},
+    {"role": "user", "content": "What is a large language model?"}
+]
+prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)
+# Generate text
+sequences = pipeline(
+    prompt,
+    do_sample=True,
+    temperature=0.7,
+    top_p=0.9,
+    num_return_sequences=1,
+    max_length=512,
+)
+print(sequences[0]['generated_text'])
+```
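By default the text-generation pipeline returns the prompt followed by the completion, so `generated_text` above includes the prompt. The sketch below shows one way to keep only the reply. `extract_reply` and `END_MARKERS` are hypothetical names, and it is an assumption that the end markers show up as literal text in the decoded output; if they do not, the function simply strips the prompt.

```python
# End-of-answer markers taken from the chat template and system prompt.
END_MARKERS = ("<|end_of_turn|>", "<|im_end|>")


def extract_reply(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt, then cut the text at the earliest end marker."""
    if generated_text.startswith(prompt):
        reply = generated_text[len(prompt):]
    else:
        reply = generated_text
    cut = len(reply)
    for marker in END_MARKERS:
        idx = reply.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return reply[:cut].strip()


# Stand-in strings instead of real model output.
demo_prompt = "<|im_start|>GPT4 Correct Assistant:\n"
demo_output = demo_prompt + "A language model trained on text.<|end_of_turn|>leftover"
print(extract_reply(demo_output, demo_prompt))
```

With the usage snippet above, the call would be `extract_reply(sequences[0]['generated_text'], prompt)`.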
+## Things in the Pipeline
+1. Manual testing and evaluation against GPT-4 in text-generation-webui across 45 complex sample prompts.
+2. Nous Benchmark
+3. GGUF Format
+4. Ollama Model (if model benchmarks are good)
+## Acknowledgements
+I'd like to thank the amazing open community and in particular:
+* The Intel team for publishing a great open dataset and showing how well it worked in the first place.
+* Teknium and NousResearch for their awesome work and models.
+* Maxime for sharing such great resources.
+* Argilla for publishing argilla/distilabel-intel-orca-dpo-pairs.