---
library_name: transformers
license: apache-2.0
base_model: beomi/llama-2-koen-13b
datasets:
- Saxo/total_ko_train_set_small_basic
- beomi/KoAlpaca-v1.1a
- kyujinpy/KOR-OpenOrca-Platypus-v2
- nlpai-lab/databricks-dolly-15k-ko
language:
- ko
- en
pipeline_tag: text-generation
---

# Model Card for Model ID

An instruction model trained by Dr. Yunsung Ji (Saxo), a data scientist at Linkbricks, a company specializing in AI and big data analytics, by SFT-tuning the beomi/llama-2-koen-13b base model on 4x A100-40G GPUs on GCC for about 4 hours with a 2048-token context length.
The Accelerate and DeepSpeed ZeRO-3 libraries were used, and Flash Attention was disabled.

www.linkbricks.com, www.linkbricks.vc
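
A minimal text-generation sketch with 🤗 Transformers follows. The repository ID is a placeholder (this card does not state the final model ID), and the prompt layout assumes the Alpaca format described below.

```python
# Minimal inference sketch. Assumption: replace the placeholder repo ID
# with this model's actual Hugging Face repository ID.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Saxo/<this-model-id>"  # placeholder, not the real repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Alpaca-style prompt (see "Dataset Format" below)
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nWhat is the capital of Korea?\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```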

## Training Configuration (including BitsAndBytes)
```
learning_rate = 2e-4
num_epochs = 5
batch_size = 4
block_size = 2048
trainer = "sft"
warmup_ratio = 0.1
weight_decay = 0.01
gradient_accumulation = 4
mixed_precision = "fp16"
peft = True
quantization = "int4"
lora_r = 64
lora_alpha = 64
lora_dropout = 0.1
model_max_length = 2048
```
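
These settings correspond to 4-bit quantized LoRA fine-tuning (QLoRA-style). The sketch below shows how the quantization and LoRA values above might map onto the 🤗 `transformers` and `peft` APIs; the actual training harness is not stated in this card, so treat this as illustrative rather than the exact script used.

```python
# Hedged sketch: 4-bit base model + LoRA adapters matching the values above.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# quantization = "int4", mixed_precision = "fp16"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "beomi/llama-2-koen-13b",
    quantization_config=bnb_config,
    device_map="auto",
)

# lora_r = 64, lora_alpha = 64, lora_dropout = 0.1
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```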
## Dataset Format
Alpaca-format prompt text.
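
For reference, a sketch of the conventional Alpaca prompt template is shown below; the exact wording used during training is not included in this card, so the template and the `build_prompt` helper are assumptions for illustration.

```python
# Conventional Alpaca-style prompt builder (assumption: the training data
# follows the standard Alpaca template; the exact template is not given here).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

def build_prompt(instruction: str, input_text: str = "", output: str = "") -> str:
    """Format one example in the Alpaca layout."""
    return ALPACA_TEMPLATE.format(instruction=instruction, input=input_text, output=output)

print(build_prompt("Briefly describe Seoul."))
```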