---
license: apache-2.0
language:
- vi
- en
---
Model Details
- Developed by: Tuan Pham (FPTU HCM Student)
- Contact me at: weekend.2810@gmail.com or tuanpmse160561@fpt.edu.vn
- Looking for an internship opportunity :D
- Model type: Llama2-7B Decoder-only
- Fine-tuned from models:
- meta-llama/Llama-2-7b
- bkai-foundation-models/vietnamese-llama2-7b-120GB
- yeen214/llama2_7b_merge_orcafamily
- Bilingual support: English and Vietnamese
Model Description
This model is proof that a single person can fine-tune their own model to reach state-of-the-art results.
Model Sources
- Repository:
- Paper: ...
- Demo: ...
Uses
Prompt template
[SYSTEM_PROMPT]
####### Instruction:
[INPUT]
%%%%%%% Response:
[RESPONSE]
It is recommended to keep the system prompt in English.
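As an illustration, the snippet below fills this template in code. The `build_prompt` helper and the default system prompt are assumptions for demonstration only, not part of the released repository.
```python
# Hypothetical helper that fills the prompt template above; names are illustrative only.
def build_prompt(instruction: str,
                 system_prompt: str = "You are a helpful assistant.") -> str:
    return (
        f"{system_prompt}\n"
        "####### Instruction:\n"
        f"{instruction}\n"
        "%%%%%%% Response:\n"
    )

prompt = build_prompt("Phạm Nhật Vượng là ai?")
```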
How to Get Started with the Model
Use the code below to get started with the model.
```python
import torch
from torch.cuda.amp import autocast
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_name = "1TuanPham/T-Llama"

# Load the model in bfloat16 and the fast tokenizer, then stream generated tokens.
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             torch_dtype=torch.bfloat16,
                                             use_cache=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
streamer = TextStreamer(tokenizer, skip_special_tokens=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, streamer=streamer)

with autocast():
    output_default = pipe("Phạm Nhật Vượng là ", pad_token_id=tokenizer.eos_token_id, max_new_tokens=128)
```
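To query the model with the instruction format from the Uses section, the template can be filled in before calling the pipeline. This is a minimal sketch reusing the hypothetical `build_prompt` helper shown earlier; the Vietnamese question is only an example.
```python
# Minimal sketch: wrap a Vietnamese question in the instruction template before generation.
prompt = build_prompt("Hãy giới thiệu ngắn gọn về Việt Nam.")  # "Briefly introduce Vietnam."

with autocast():
    output_instruct = pipe(prompt,
                           pad_token_id=tokenizer.eos_token_id,
                           max_new_tokens=256)
print(output_instruct[0]["generated_text"])
```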
Training Details
Hardware Type:
- GPU: NVIDIA Tesla P100 16GB
- System RAM: 29GB
Hours used: approximately 47.5 days (~1,140 GPU hours)
Training Data
- BactrianX
- OpenOrca_translated
- WizardLM_70k_translated
- TigerLabMathInstruct_translated_vi
- GradeSchoolMathInstruct_translated
- vilm_lima-vi
- MTEngVietnamese
- databricks_dolly15k_translated
- AlpacaCleaned_translated
- databricks_dolly15k
- OpenOrca
- GradeSchoolMathInstruct
- AlpacaCleaned
- WebglmQA
Training Procedure
Learning rate: 2e-5 with a cosine schedule
Optimizer: PagedLion8bit
QLoRA: rank 64, 4-bit quantization
- 250k examples (70% Vietnamese, 30% English) for 3.37 epochs
- 350k examples (60% Vietnamese, 40% English) for 1.4 epochs
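For readers who want to reproduce a comparable setup, the sketch below maps the reported hyperparameters (rank-64 QLoRA, 4-bit quantization, 2e-5 cosine learning rate, PagedLion8bit) onto Hugging Face `peft`/`transformers` configuration objects. The LoRA alpha, dropout, batch size, and compute dtype are assumptions and were not disclosed by the author.
```python
# Hedged sketch of a QLoRA configuration matching the reported hyperparameters.
# lora_alpha, lora_dropout, per-device batch size, and compute dtype are assumptions.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # reported: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,    # fp16 assumed (Tesla P100 has no bfloat16 support)
)

lora_config = LoraConfig(
    r=64,                                    # reported LoRA rank
    lora_alpha=16,                           # assumed
    lora_dropout=0.05,                       # assumed
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="t-llama-qlora",
    learning_rate=2e-5,                      # reported learning rate
    lr_scheduler_type="cosine",              # reported schedule
    optim="paged_lion_8bit",                 # PagedLion8bit via bitsandbytes
    num_train_epochs=3.37,                   # first training stage
    per_device_train_batch_size=2,           # assumed
    fp16=True,
)
```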
Training loss
Evaluation
Our model currently ranks in the top 5 on the VMLU benchmark.
Citation
@online{t-llama,
  author = {Pham Minh Tuan},
  title  = {T-Llama: A New Language Model for Vietnamese},
  year   = {2024},
  url    = {https://github.com/vTuanpham/Vietnamese_QA_System}
}