Commit 0ccd03f by sethuiyer (1 parent: f6480b6): Create README.md

Files changed (1): README.md (+93, -0)
---
license: apache-2.0
datasets:
- argilla/distilabel-intel-orca-dpo-pairs
library_name: transformers
pipeline_tag: text-generation
---

# Chikuma_10.7B - V2

This model is a DPO fine-tune of [Chikuma_10.7B](https://huggingface.co/sethuiyer/Chikuma_10.7B) on [argilla/distilabel-intel-orca-dpo-pairs](https://huggingface.co/datasets/argilla/distilabel-intel-orca-dpo-pairs).

# Dataset
Dataset: `argilla/distilabel-intel-orca-dpo-pairs`

The dataset comes to roughly 3,000 samples after filtering, but they are high quality (according to `chosen_score`).
The following filters were applied to the original dataset:
```python
dataset = dataset.filter(
    lambda r:
        r["status"] != "tie" and
        r["chosen_score"] >= 8 and
        not r["in_gsm8k_train"]
)
```
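
For context, here is a minimal sketch of the full filtering flow, assuming the dataset is pulled straight from the Hub with 🤗 `datasets` (the `train` split name and the count check are assumptions, not from the original card):

```python
from datasets import load_dataset

# Load the preference pairs from the Hub (split name assumed to be "train")
dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")

# Keep only decisive, high-quality pairs that are not in the GSM8K train set
dataset = dataset.filter(
    lambda r:
        r["status"] != "tie" and
        r["chosen_score"] >= 8 and
        not r["in_gsm8k_train"]
)
print(len(dataset))  # roughly 3,000 rows after filtering
```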

# Chat Template
I decided to go with a slight modification of ChatML.

```
<|im_start|>GPT4 Correct system:
{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>
<|im_start|>GPT4 Correct user:
{user}<|im_end|>
<|im_start|>GPT4 Correct Assistant:
{assistant}<|im_end|>
```
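
As a sketch, this is how a prompt could be assembled by hand under the template above (the helper name and example strings are illustrative, not part of the model card):

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a prompt in the modified-ChatML format above, ending with
    the assistant header so the model continues from there."""
    return (
        f"<|im_start|>GPT4 Correct system:\n"
        f"{system} Always use <|end_of_turn|> when you want to end the answer. <|im_end|>\n"
        f"<|im_start|>GPT4 Correct user:\n"
        f"{user}<|im_end|>\n"
        f"<|im_start|>GPT4 Correct Assistant:\n"
    )

print(build_prompt("You are a helpful assistant chatbot.", "What is a large language model?"))
```

In practice, the Usage section below relies on `tokenizer.apply_chat_template`, which should produce the same layout.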

### Training Hardware

I used 1 x A100 80GB on RunPod for about 1.5 hours.
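
The card doesn't spell out the training recipe, so the following is only a hypothetical sketch of a DPO run using `trl`'s `DPOTrainer` (older-style signature); every hyperparameter, the output path, and the data preparation are assumptions, not the actual configuration used:

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "sethuiyer/Chikuma_10.7B"  # the base model being aligned
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Filtered preference pairs (see the Dataset section above); DPOTrainer
# expects "prompt", "chosen" and "rejected" text columns, so in practice
# the rows would first be rendered with the chat template (elided here)
dataset = load_dataset("argilla/distilabel-intel-orca-dpo-pairs", split="train")

args = TrainingArguments(
    output_dir="chikuma-dpo",
    per_device_train_batch_size=2,   # assumed
    gradient_accumulation_steps=8,   # assumed
    learning_rate=5e-6,              # assumed
    num_train_epochs=1,              # assumed
    bf16=True,                       # assumed; fits the A100 used above
)

trainer = DPOTrainer(
    model=model,
    ref_model=None,    # trl falls back to a frozen copy of the policy
    args=args,
    beta=0.1,          # assumed DPO temperature
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```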

## Usage

```python
import transformers
from transformers import AutoTokenizer

# Set this to the model's local path or Hub id
new_model = "path/to/this/model"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(new_model)

# Create pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=new_model,
    tokenizer=tokenizer,
    device="cuda"
)

# Generate text
message = [
    {"role": "system", "content": "You are a helpful assistant chatbot. Always use <|end_of_turn|> when you want to end the answer."},
    {"role": "user", "content": "What is a large language model?"}
]

prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_length=512,
)
print(sequences[0]['generated_text'])
```
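
Since the system prompt asks the model to end answers with `<|end_of_turn|>`, it may help to stop generation on that token. A small sketch, assuming `<|end_of_turn|>` exists as a single token in this tokenizer's vocabulary:

```python
# Resolve <|end_of_turn|> to its token id (assumes it is one vocabulary entry)
eot_id = tokenizer.convert_tokens_to_ids("<|end_of_turn|>")

sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    max_length=512,
    eos_token_id=eot_id,  # halt as soon as the model emits <|end_of_turn|>
)
print(sequences[0]['generated_text'])
```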

## Things in the Pipeline:
1. Manual testing and evaluation against GPT-4 on text-generation-webui across 45 complex sample prompts.
2. Nous benchmark suite
3. GGUF format
4. Ollama model (if the benchmarks are good)

## Acknowledgements

I'd like to thank the amazing open community, and in particular:

* The Intel team, for publishing a great open dataset and showing how well it worked in the first place.
* Teknium and NousResearch, for their awesome work and models.
* Maxime, for sharing such great resources.
* Argilla, for publishing argilla/distilabel-intel-orca-dpo-pairs.