URP/urllm-ko_en-2.7b

license: cc-by-sa-4.0
language:
  - ko
  - en
pipeline_tag: text-generation
tags:
  - meta
  - llama-2
  - llama-2-ko-en
  - sheared llama

Model Details

Model Architecture:

urLLM-KO_EN-2.7B is an auto-regressive language model that leverages an optimized transformer architecture derived from princeton-nlp/Sheared-LLaMA-2.7B.
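Because the model is an auto-regressive causal language model, it can presumably be loaded like other LLaMA-derived checkpoints. The sketch below is a minimal example, not an official usage snippet: it assumes the repository ID `URP/urllm-ko_en-2.7b` resolves on the Hugging Face Hub, that the `transformers` and `torch` packages are installed, and that the checkpoint works with the generic `AutoTokenizer` and `AutoModelForCausalLM` classes. The helper name `generate` and its default settings are hypothetical.

```python
MODEL_ID = "URP/urllm-ko_en-2.7b"  # assumption: Hub repository ID for this card


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Return the prompt plus a generated continuation.

    Hypothetical helper: downloads the weights on first call.
    """
    # Imported lazily so that defining this helper does not require
    # the heavy dependencies to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")

    # Greedy decoding with default settings; pass only the tensors generate() needs.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate("대한민국의 수도는")` would then return the prompt followed by the model's continuation. Since this appears to be a pretrained base model rather than an instruction-tuned one, plain text continuation prompts are likely a better fit than chat-style instructions.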

Training Corpus

The model was trained on selected datasets from Modu Corpus, Korean Wikipedia, and Kaggle English News (approximately 36 GB in total).

Vocab Expansion

The expanded vocabulary size is 51,385 tokens.
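To give a rough sense of what the expansion implies, the arithmetic below estimates how many tokens and embedding parameters were added. Only the expanded size comes from this card. The base vocabulary size of 32,000 and the hidden size of 2,560 are assumptions taken from the usual LLaMA-2 tokenizer and the Sheared-LLaMA-2.7B configuration, and the function names are hypothetical.

```python
EXPANDED_VOCAB_SIZE = 51_385  # stated in this model card
BASE_VOCAB_SIZE = 32_000      # assumption: original LLaMA-2 tokenizer size
HIDDEN_SIZE = 2_560           # assumption: Sheared-LLaMA-2.7B hidden dimension


def added_tokens(base: int, expanded: int) -> int:
    """Number of new vocabulary entries introduced by the expansion."""
    return expanded - base


def added_embedding_params(base: int, expanded: int, hidden: int,
                           tied: bool = False) -> int:
    """Extra parameters from the new embedding rows.

    With untied weights, both the input embedding matrix and the
    output projection (LM head) gain one row per new token.
    """
    matrices = 1 if tied else 2
    return added_tokens(base, expanded) * hidden * matrices


new_tokens = added_tokens(BASE_VOCAB_SIZE, EXPANDED_VOCAB_SIZE)
new_params = added_embedding_params(BASE_VOCAB_SIZE, EXPANDED_VOCAB_SIZE, HIDDEN_SIZE)
print(f"added tokens: {new_tokens}, added embedding parameters: {new_params}")
```

Under these assumptions, the expansion adds 19,385 tokens. Whether the input and output embeddings are tied in this checkpoint is not stated here, so the `tied` flag is left as a parameter.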

Model Card Contact

For errors or additional questions about details in this model card, contact pkchae@urp.kr.