KaeriJenti/Kaori-34b-v2

KaeriJenti committed on Dec 21, 2023

Commit e28a7b2 • 1 Parent(s): d880245

Update README.md

Files changed (1)

README.md +21 -3

@@ -4,6 +4,7 @@ license: llama2
 <h1>Kaori-34b-v2  Model Card</h1>
 <h3>Datasets</h3>
@@ -11,9 +12,26 @@ license: llama2
  - Dolphin
  - OpenOrca
-We removed data similar to the evaluation dataset from the train dataset by filtering.
-And this Model was Finetuned By Kaeri and Jenti.
 <h3>Framework:</h3>

 <h1>Kaori-34b-v2  Model Card</h1>
+This model was fine-tuned by Kaeri and Jenti.
 <h3>Datasets</h3>
  - Dolphin
  - OpenOrca
+We trained the model on <b>100%</b> of the Open-Platypus data, <b>5%</b> of the Dolphin data, and <b>10%</b> of the OpenOrca data, using supervised fine-tuning (SFT).
+We did not use GSM8k samples when generating data.
+To guard against data contamination, we also applied similarity filtering
+to the training data, removing samples that correspond to any task in the following list.
+<pre>
+filtering_tasks = [
+    'cot_gsm8k',
+    'cot_gsm8k_ii',
+    'drop:2.0.0',
+    'winogrande:1.1.0',
+    'task228_arc_answer_generation_easy',
+    'ai2_arc/ARC-Challenge:1.0.0',
+    'ai2_arc/ARC-Easy:1.0.0',
+    'task229_arc_answer_generation_hard',
+    'hellaswag:1.1.0',
+    'task1389_hellaswag_completion'
+]
+</pre>
 <h3>Framework:</h3>
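
The README does not say how the filtering or the per-dataset sampling was implemented, so the following is only a minimal sketch under stated assumptions. It assumes each training record is a dict carrying its source task in a hypothetical `task_name` field, drops records whose task appears in the list above, and then takes a random fraction of what remains (for example 0.05 for Dolphin or 0.10 for OpenOrca). The helper names `drop_filtered_tasks` and `subsample` are illustrative, not part of the authors' code, and the exact-match test here stands in for whatever similarity filtering the authors actually used.

```python
import random

# Task identifiers taken from the filtering_tasks list in the README diff.
FILTERING_TASKS = {
    "cot_gsm8k",
    "cot_gsm8k_ii",
    "drop:2.0.0",
    "winogrande:1.1.0",
    "task228_arc_answer_generation_easy",
    "ai2_arc/ARC-Challenge:1.0.0",
    "ai2_arc/ARC-Easy:1.0.0",
    "task229_arc_answer_generation_hard",
    "hellaswag:1.1.0",
    "task1389_hellaswag_completion",
}


def drop_filtered_tasks(records, task_key="task_name"):
    """Keep only records whose task identifier is not in FILTERING_TASKS.

    `task_key` is a hypothetical field name; records without it are kept.
    """
    return [r for r in records if r.get(task_key) not in FILTERING_TASKS]


def subsample(records, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of the records."""
    rng = random.Random(seed)
    k = round(len(records) * fraction)
    return rng.sample(records, k)
```

A possible usage would be `subsample(drop_filtered_tasks(orca_records), 0.10)`, where `orca_records` is a hypothetical list of OpenOrca records; filtering before sampling keeps the sampled fraction free of the excluded tasks.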