---
license: cc-by-nc-sa-4.0
---

# yiko-12b-mu

## Experiment Objectives

  1. Does training on a combined Korean + multilingual dataset improve performance on Korean benchmarks?
  2. Does full-parameter depth-up-scaled training (expansion method: Llama-Pro) yield the best Korean benchmark performance?

## Methods

  1. Train on a CJK + English + Glot mixture, sampling each source at an equal data-size ratio (see the data-mixing sketch below).
  2. Expand the model with additional layers (Llama-Pro-style depth-up scaling) and train all parameters (see the layer-expansion sketch below).
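
The snippet below is a minimal sketch of how an equal-ratio mixture like method 1 could be assembled with Hugging Face `datasets`. The dataset identifiers and the seed are placeholders for illustration, not the corpora actually used for this model.

```python
# Hypothetical sketch of the equal-ratio data mix (method 1).
# Dataset names below are placeholders, not the actual training corpora.
from datasets import load_dataset, interleave_datasets

# One streaming split per language group (placeholder identifiers).
korean   = load_dataset("placeholder/korean_corpus",   split="train", streaming=True)
chinese  = load_dataset("placeholder/chinese_corpus",  split="train", streaming=True)
japanese = load_dataset("placeholder/japanese_corpus", split="train", streaming=True)
english  = load_dataset("placeholder/english_corpus",  split="train", streaming=True)
glot     = load_dataset("placeholder/glot_corpus",     split="train", streaming=True)

sources = [korean, chinese, japanese, english, glot]

# Equal sampling probability per source approximates "same ratio of data size".
mixed = interleave_datasets(
    sources,
    probabilities=[1.0 / len(sources)] * len(sources),
    seed=42,
    stopping_strategy="all_exhausted",
)
```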
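
The following is a minimal sketch of Llama-Pro-style layer expansion followed by full-parameter training, assuming a Llama-architecture base checkpoint loaded with `transformers`. The base model name, expansion interval, and zero-initialization details are assumptions for illustration and may differ from the actual recipe.

```python
# Hypothetical sketch of layer expansion (method 2): interleave copies of existing
# decoder blocks to deepen the model, then train all parameters.
import copy
import torch.nn as nn
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("placeholder/base-llm")  # placeholder checkpoint
layers = base.model.layers                                           # decoder block list

expand_every = 4  # assumed: insert one duplicated block after every 4 original blocks
new_layers = []
for i, layer in enumerate(layers):
    new_layers.append(layer)
    if (i + 1) % expand_every == 0:
        # Duplicate the block; Llama-Pro-style recipes typically zero-init the
        # duplicated block's output projections so it starts as an identity mapping.
        dup = copy.deepcopy(layer)
        nn.init.zeros_(dup.self_attn.o_proj.weight)
        nn.init.zeros_(dup.mlp.down_proj.weight)
        new_layers.append(dup)

# Re-index the attention layer indices used by the KV cache
# (needed in recent transformers versions).
for idx, blk in enumerate(new_layers):
    blk.self_attn.layer_idx = idx

base.model.layers = nn.ModuleList(new_layers)
base.config.num_hidden_layers = len(new_layers)

# Full-parameter training: keep every weight trainable (no freezing of original blocks).
for p in base.parameters():
    p.requires_grad = True
```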