--- license: cc-by-nc-4.0 datasets: - kyujinpy/KOR-OpenOrca-Platypus-v3 - beomi/KoAlpaca-v1.1a - maywell/ko_wikidata_QA language: - ko base_model: beomi/Yi-Ko-6B --- # Yi-Ko-6B-Instruct-v1.0 ## Model Details ### Base Model [beomi/Yi-Ko-6B](https://huggingface.co/beomi/Yi-Ko-6B) ### Training Dataset 1. [kyujinpy/KOR-OpenOrca-Platypus-v3](https://huggingface.co/datasets/kyujinpy/KOR-OpenOrca-Platypus-v3) 🙇 2. [beomi/KoAlpaca-v1.1a](https://huggingface.co/datasets/beomi/KoAlpaca-v1.1a) 🙇 3. [maywell/ko_wikidata_QA](https://huggingface.co/datasets/maywell/ko_wikidata_QA) 🙇 4. AIHub MRC 데이터 선별 후 Instruction Format 맞게 변경 후 사용 ## Benchmark Results ### AI-Harness Evaluation https://github.com/Beomi/ko-lm-evaluation-harness | Model | kobest_boolq | kobest_copa | kobest_hellaswag | kobest_sentineg | korunsmile | pawsx_ko | | --- | --- | --- | --- | --- | --- | --- | | | *Zero-shot* |||||| | Yi-Ko-6B-Instruct-v1.0 | 0.6619 | 0.7794 | 0.4858 | 0.4589 | 0.3520 | 0.5545 | | Yi-Ko-6B | 0.7070 | 0.7696 | 0.5009 | 0.4044 | 0.3828 | 0.5145 | ## Instruction Format ```python ### User: {instruction} ### Assistant: {response} ``` ## Loading the Model ```python import torch from transformers import AutoModelForCausalLM, AutoTokenizer tokenizer = AutoTokenizer.from_pretrained("wkshin89/Yi-Ko-6B-Instruct-v1.0") model = AutoModelForCausalLM.from_pretrained( "wkshin89/Yi-Ko-6B-Instruct-v1.0", device_map="auto", torch_dtype=torch.bfloat16, ) ```