KaeriJenti committed on
Commit
e28a7b2
1 Parent(s): d880245

Update README.md

Files changed (1)
  1. README.md +21 -3
README.md CHANGED
@@ -4,6 +4,7 @@ license: llama2
4
 
5
  <h1>Kaori-34b-v2 Model Card</h1>
6
 
 
7
 
8
  <h3>Datasets</h3>
9
 
@@ -11,9 +12,26 @@ license: llama2
11
  - Dolphin
12
  - OpenOrca
13
 
14
- We removed data similar to the evaluation dataset from the train dataset by filtering.
15
-
16
- And this Model was Finetuned By Kaeri and Jenti.
17
 
18
 
19
  <h3>Framework:</h3>
 
4
 
5
  <h1>Kaori-34b-v2 Model Card</h1>
6
 
7
+ This model was fine-tuned by Kaeri and Jenti.
8
 
9
  <h3>Datasets</h3>
10
 
 
12
  - Dolphin
13
  - OpenOrca
14
 
15
+ We trained the model with supervised fine-tuning (SFT) on <b>100%</b> of the Open-Platypus data, <b>5%</b> of the Dolphin data, and <b>10%</b> of the OpenOrca data.
16
+
17
+ We did not use GSM8k samples when generating data.
18
+ We also guarded against data contamination by similarity-filtering
19
+ the training data, removing samples that correspond to any task in the following list.
20
+
21
+ <pre>
22
+ filtering_tasks = [
23
+ 'cot_gsm8k',
24
+ 'cot_gsm8k_ii',
25
+ 'drop:2.0.0',
26
+ 'winogrande:1.1.0',
27
+ 'task228_arc_answer_generation_easy',
28
+ 'ai2_arc/ARC-Challenge:1.0.0',
29
+ 'ai2_arc/ARC-Easy:1.0.0',
30
+ 'task229_arc_answer_generation_hard',
31
+ 'hellaswag:1.1.0',
32
+ 'task1389_hellaswag_completion'
33
+ ]
34
+ </pre>
35
 
36
 
37
  <h3>Framework:</h3>
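
The dataset mixture and task filtering described in the README change above could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline. The README does not say how the similarity filtering was done, so this sketch only drops records by exact task name. The `task_name` field and the helpers `drop_filtered_tasks` and `subsample` are hypothetical names introduced for the example.

```python
import random

# Task names taken from the README's filtering list.
FILTERING_TASKS = {
    "cot_gsm8k",
    "cot_gsm8k_ii",
    "drop:2.0.0",
    "winogrande:1.1.0",
    "task228_arc_answer_generation_easy",
    "ai2_arc/ARC-Challenge:1.0.0",
    "ai2_arc/ARC-Easy:1.0.0",
    "task229_arc_answer_generation_hard",
    "hellaswag:1.1.0",
    "task1389_hellaswag_completion",
}


def drop_filtered_tasks(records, task_key="task_name"):
    """Keep only records whose task label is not in FILTERING_TASKS.

    `task_key` is an assumed field name; records without it are kept.
    """
    return [r for r in records if r.get(task_key) not in FILTERING_TASKS]


def subsample(records, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of the records."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the subset is repeatable
    k = round(len(records) * fraction)
    return rng.sample(records, k)
```

Under these assumptions, the mixture in the README would correspond to calling `subsample` with `1.0` for Open-Platypus, `0.05` for Dolphin, and `0.10` for OpenOrca, after passing each dataset through `drop_filtered_tasks`.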