kyujinpy
/

KoT-platypus2-13B


kyujinpy committed on Oct 5, 2023

Commit

d6eb85d

•

1 Parent(s): 6d1d9f4

Upload README.md

Files changed (1)

README.md +83 -0

README.md CHANGED

@@ -1,3 +1,86 @@
 ---
 license: cc-by-nc-4.0
 ---

 ---
+language:
+- ko
+datasets:
+- kyujinpy/KoCoT_2000
+library_name: transformers
+pipeline_tag: text-generation
 license: cc-by-nc-4.0
 ---
+# **KoT-platypus2**
+![img](./KoT-platypus2.png)
+**CoT + KO-platypus2 = KoT-platypus2**
+## Model Details
+**Model Developers** Kyujin Han (kyujinpy)
+**Input** Models input text only.
+**Output** Models generate text only.
+**Model Architecture**
+KoT-platypus2-13B is an auto-regressive language model based on the LLaMA2 transformer architecture.
+**Repo Link**
+GitHub: [KoT-platypus2](https://github.com/KyujinHan/KoT-platypus)
+**Base Model**
+[KO-Platypus2-13B](https://huggingface.co/kyujinpy/KO-Platypus2-13B)
+More details (GitHub): [CoT-llama2](https://github.com/Marker-Inc-Korea/CoT-llama2)
+More details (GitHub): [KO-Platypus2](https://github.com/Marker-Inc-Korea/KO-Platypus)
+**Training Dataset**
+I used [KoCoT_2000](https://huggingface.co/datasets/kyujinpy/KoCoT_2000).
+It was created by translating [kaist-CoT](https://huggingface.co/datasets/kaist-ai/CoT-Collection) with DeepL.
+I trained on an A100 40GB GPU using Colab.
+**Training Hyperparameters**
+| Hyperparameters | Value |
+| --- | --- |
+| batch_size | `64` |
+| micro_batch_size | `1` |
+| Epochs | `15` |
+| learning_rate | `1e-5` |
+| cutoff_len | `4096` |
+| lr_scheduler | `linear` |
+| base_model | `kyujinpy/KO-Platypus2-13B` |
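The table lists `batch_size` 64 with `micro_batch_size` 1. In many LoRA-style fine-tuning scripts this pair implies gradient accumulation, with the number of accumulation steps derived from the two values. The sketch below assumes that convention; the model card does not confirm it, and the function name and the `num_devices` parameter are hypothetical.

```python
def gradient_accumulation_steps(batch_size: int, micro_batch_size: int, num_devices: int = 1) -> int:
    """Steps to accumulate so the effective batch equals `batch_size`.

    Assumes the common convention: effective batch =
    micro_batch_size * num_devices * accumulation steps.
    """
    per_step = micro_batch_size * num_devices
    if batch_size % per_step != 0:
        raise ValueError("batch_size must be divisible by micro_batch_size * num_devices")
    return batch_size // per_step


# Values from the hyperparameter table; a single device is an assumption.
steps = gradient_accumulation_steps(batch_size=64, micro_batch_size=1)
print(steps)
```

Under these assumptions, each optimizer update would aggregate gradients from 64 forward/backward passes of one sequence each, which keeps memory use low for the 4096-token `cutoff_len`.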
+# **Model Benchmark**
+## KO-LLM leaderboard
+- Results follow the [Open KO-LLM LeaderBoard](https://huggingface.co/spaces/upstage/open-ko-llm-leaderboard).
+![img](./leaderboard.png)
+| Model | Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
+| --- | --- | --- | --- | --- | --- | --- |
+| KoT-Platypus2-13B(ours) | NaN | NaN | NaN | NaN | NaN | NaN |
+| [hyunseoki/ko-en-llama2-13b](https://huggingface.co/hyunseoki/ko-en-llama2-13b) | 46.68 | 42.15 | 54.23 | 38.90 | 40.74 | 57.39 |
+| [CoTy-platypus-ko-12.8b](https://huggingface.co/MarkrAI/kyujin-CoTy-platypus-ko-12.8b) | 46.44 | 34.98 | 49.11 | 25.68 | 37.59 | 84.86 |
+| [momo/polyglot-ko-12.8b-Chat-QLoRA-Merge](https://huggingface.co/momo/polyglot-ko-12.8b-Chat-QLoRA-Merge) | 45.71 | 35.49 | 49.93 | 25.97 | 39.43 | 77.70 |
+| [KoT-platypus2-7B](https://huggingface.co/kyujinpy/KoT-platypus2-7B) | 45.62 | 38.05 | 49.63 | 34.68 | 37.69 | 68.08 |
+> Compared with the top 4 SOTA models. (updated: 10/05)
+# Implementation Code
+```python
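+### KoT-platypus2
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+repo = "kyujinpy/KoT-platypus2-13B"
+# Load the weights in half precision; device_map='auto' places them on the available devices.
+# Note: Python identifiers cannot contain hyphens, so the variables use underscores.
+CoT_llama = AutoModelForCausalLM.from_pretrained(
+        repo,
+        return_dict=True,
+        torch_dtype=torch.float16,
+        device_map='auto'
+)
+# Load the matching tokenizer from the same repo.
+CoT_llama_tokenizer = AutoTokenizer.from_pretrained(repo)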
+```
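The snippet above only loads the model and tokenizer. As a minimal sketch of how they might then be used for generation, the helper below wraps the standard `transformers` tokenize, `generate`, decode sequence. The function name `generate_text` and the `max_new_tokens` default are illustrative assumptions, and the model card does not specify a prompt template, so none is applied here.

```python
def generate_text(model, tokenizer, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for `prompt` (hypothetical helper, not from the model card)."""
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # max_new_tokens=128 is an arbitrary illustrative default.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence; it includes the prompt tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

To use it, pass the model and tokenizer objects loaded above together with a Korean prompt string. Sampling settings such as temperature are left at the library defaults.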
+> Readme format: [beomi/llama-2-ko-7b](https://huggingface.co/beomi/llama-2-ko-7b)
+---