
Model Card for Saxo/yunsung-llama-2-koen-13b-linkbricks-sft-basic-v1

Dr. Yunsung Ji (Saxo), a data scientist at Linkbricks, a company specializing in AI and big data analytics, built this instruction model by supervised fine-tuning (SFT) of the beomi/llama-2-koen-13b base model on four A100-40G GPUs on GCP for 4 hours, with a 2048-token sequence length. Training used the Accelerate and DeepSpeed ZeRO-3 libraries, with Flash Attention disabled.

www.linkbricks.com, www.linkbricks.vc

Configuration including BitsandBytes


learning_rate = 2e-4
num_epochs = 5
batch_size = 4
block_size = 2048
trainer = "sft"
warmup_ratio = 0.1
weight_decay = 0.01
gradient_accumulation = 4
mixed_precision = "fp16"
peft = True
quantization = "int4"
lora_r = 64
lora_alpha = 16
lora_dropout = 0.1
model_max_length = 2048
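As a rough reading of these settings, the sketch below works out the effective batch size and the LoRA scaling factor they imply. It assumes that batch_size is a per-GPU value and that all four A100s run data-parallel; the card does not state either, so treat the totals as an estimate. The names num_gpus, sequences_per_step, max_tokens_per_step, and lora_scaling are illustrative and not part of the original configuration.

```python
# Values copied from the configuration listed in this card.
batch_size = 4
gradient_accumulation = 4
block_size = 2048
lora_r = 64
lora_alpha = 16

# Assumption (not stated in the card): batch_size is per GPU and the
# four A100-40G GPUs are used as data-parallel workers.
num_gpus = 4

# Sequences consumed per optimizer step across all GPUs.
sequences_per_step = batch_size * gradient_accumulation * num_gpus

# Upper bound on tokens per optimizer step (every sequence at full length).
max_tokens_per_step = sequences_per_step * block_size

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_r

print(sequences_per_step, max_tokens_per_step, lora_scaling)
```

Under these assumptions, each optimizer step sees 64 sequences (at most 131,072 tokens), and the LoRA update is scaled by 0.25.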

Dataset Format

Alpaca Format Prompt Text
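The card does not spell out the exact prompt template, so the following is a minimal sketch that assumes the standard Stanford Alpaca layout (an instruction, an optional input, and a response header). The helper name build_prompt and the two template constants are hypothetical; check them against the actual training data before relying on them.

```python
# Assumed templates: the widely used Stanford Alpaca prompt layout.
# The card only says "Alpaca format", so the exact wording is an assumption.
ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Return an Alpaca-style prompt; the Input section is omitted when empty."""
    if input_text.strip():
        return ALPACA_WITH_INPUT.format(
            instruction=instruction.strip(), input=input_text.strip()
        )
    return ALPACA_NO_INPUT.format(instruction=instruction.strip())


if __name__ == "__main__":
    print(build_prompt("Summarize the following text.", "Linkbricks builds AI tools."))
```

The prompt ends at the response header so that the model's generation continues directly from it.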

Safetensors · Model size: 13.2B params · Tensor type: BF16

Datasets used to train Saxo/yunsung-llama-2-koen-13b-linkbricks-sft-basic-v1