---
license: cc-by-sa-4.0
language:
- ko
- en
pipeline_tag: text-generation
tags:
- meta
- llama-2
- llama-2-ko-en
- sheared llama
---
## Model Details

### Model Architecture
urLLM-KO_EN-2.7B is an auto-regressive language model that leverages an optimized transformer architecture derived from princeton-nlp/Sheared-LLaMA-2.7B.
## Training Corpus
The model was trained on selected datasets from the Modu Corpus, Korean Wikipedia, and Kaggle English News (approximately 36 GB in total).
## Vocab Expansion
The expanded vocabulary size is 51,385 tokens.
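For context, LLaMA-2 (and hence Sheared-LLaMA-2.7B) ships a 32,000-token vocabulary, so expanding to 51,385 tokens requires growing the token embedding matrix and initializing the new rows. The sketch below illustrates one common heuristic (initializing new rows to the mean of the existing embeddings); the initialization strategy and the hidden size used here are assumptions for illustration, not the authors' documented method.

```python
import numpy as np

OLD_VOCAB = 32000  # LLaMA-2's original vocabulary size
NEW_VOCAB = 51385  # expanded size reported above
HIDDEN = 2560      # hidden size chosen for illustration only

def expand_embeddings(emb: np.ndarray, new_vocab: int) -> np.ndarray:
    """Grow an embedding matrix to `new_vocab` rows.

    Existing rows are kept unchanged; new rows are set to the mean of the
    old embeddings -- one common heuristic (an assumption here, not the
    model authors' documented method).
    """
    old_vocab, dim = emb.shape
    new_emb = np.empty((new_vocab, dim), dtype=emb.dtype)
    new_emb[:old_vocab] = emb
    new_emb[old_vocab:] = emb.mean(axis=0)
    return new_emb

rng = np.random.default_rng(0)
emb = rng.normal(size=(OLD_VOCAB, HIDDEN)).astype(np.float32)
expanded = expand_embeddings(emb, NEW_VOCAB)
print(expanded.shape)  # (51385, 2560)
```

In practice, with Hugging Face `transformers` the same resize is done via `model.resize_token_embeddings(len(tokenizer))` after adding the new Korean tokens to the tokenizer.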