---
license: mit
language:
  - ko
base_model:
  - K-intelligence/Midm-2.0-Base-Instruct
tags:
  - Korean
  - Culture
---

# Midm-KCulture-2.0-Base-Instruct

- This model is fine-tuned from K-intelligence/Midm-2.0-Base-Instruct on the 'Korean Culture Q&A Corpus' using LoRA (Low-Rank Adaptation).

## GitHub

Check out the full training code here.

## Training Hyperparameters

| Hyperparameter | Value |
| --- | --- |
| **SFTConfig** | |
| torch_dtype | bfloat16 |
| seed | 42 |
| epoch | 3 |
| per_device_train_batch_size | 2 |
| per_device_eval_batch_size | 2 |
| learning_rate | 0.0002 |
| lr_scheduler_type | "linear" |
| max_grad_norm | 1.0 |
| neftune_noise_alpha | None |
| gradient_accumulation_steps | 1 |
| gradient_checkpointing | False |
| max_seq_length | 1024 |
| **LoraConfig** | |
| r | 16 |
| lora_alpha | 16 |
| lora_dropout | 0.1 |
| target_modules | ["q_proj", "v_proj"] |
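
For reference, here is a minimal sketch of how the values above map onto TRL's `SFTConfig` and PEFT's `LoraConfig`. This is a hypothetical reconstruction, not the author's script (see the GitHub link for the actual code); the dataset path and `output_dir` are placeholders.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

base_model = "K-intelligence/Midm-2.0-Base-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.bfloat16, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# LoRA adapter settings from the table above
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# SFT settings from the table above
training_args = SFTConfig(
    seed=42,
    num_train_epochs=3,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    max_grad_norm=1.0,
    neftune_noise_alpha=None,
    gradient_accumulation_steps=1,
    gradient_checkpointing=False,
    max_seq_length=1024,
    output_dir="midm-kculture-lora",  # placeholder
)

# Placeholder: loading of the actual 'Korean Culture Q&A Corpus' is in the repo
dataset = load_dataset("json", data_files="korean_culture_qa.jsonl")["train"]

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```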

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "jjae/Midm-KCulture-2.0-Base-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
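
A minimal generation example, assuming the tokenizer ships the base model's chat template; the prompt and generation settings below are illustrative, not from the original card:

```python
# Illustrative prompt: "What traditional foods are eaten on Korean New Year's Day?"
messages = [
    {"role": "user", "content": "한국의 설날에는 어떤 전통 음식을 먹나요?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```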