---
license: apache-2.0
language:
- vi
- en
---

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/QFhLKQlWeyO9XumtyghVo.jpeg" alt="Image" style="width: 400px; height: auto; border-radius: 10px;" />
</p>


## Model Details

- **Developed by:** Tuan Pham (FPTU HCM Student)
- **Model type:** Llama 2 7B, decoder-only
- **Fine-tuned from:**
  * meta-llama/Llama-2-7b
  * bkai-foundation-models/vietnamese-llama2-7b-120GB
  * yeen214/llama2_7b_merge_orcafamily
- **Bilingual support:** English and Vietnamese

### Model Description

<!-- Provide a longer summary of what this model is. -->

This model is a proof of effort showing that a single person can fine-tune their own model to reach state-of-the-art (SOTA) results.

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** 
  * Training: https://github.com/vTuanpham/Vietnamese_QA_System
  * Data: https://github.com/vTuanpham/Large_dataset_translator
- **Paper:** ...
- **Demo:** ...

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Prompt template

```
[SYSTEM_PROMPT]

 ####### Instruction:
[INPUT]

 %%%%%%% Response:
[RESPONSE]
```
We recommend keeping the system prompt in English.
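
The template above can be filled programmatically. The following is a minimal sketch, not the author's own tooling: the helper name `build_prompt` and the example system prompt are hypothetical, and it assumes the leading space before each marker line is intentional and that the response slot is left empty for the model to complete.

```python
def build_prompt(system_prompt: str, instruction: str) -> str:
    """Fill the prompt template, leaving the response slot open for generation."""
    return (
        f"{system_prompt}\n\n"
        f" ####### Instruction:\n{instruction}\n\n"
        f" %%%%%%% Response:\n"
    )


# Hypothetical example: English system prompt, Vietnamese instruction.
prompt = build_prompt(
    "You are a helpful assistant that answers in Vietnamese.",
    "Thủ đô của Việt Nam là gì?",
)
print(prompt)
```

The resulting string can be passed as the input text to a `text-generation` pipeline.
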
## How to Get Started with the Model

Use the code below to get started with the model.
```python
import torch
from torch.cuda.amp import autocast
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_name = "1TuanPham/T-Llama"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    use_cache=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer, skip_special_tokens=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, streamer=streamer)

# Mixed-precision context; it only takes effect when running on a CUDA device.
with autocast():
    # Reuse the EOS token for padding instead of a hard-coded token id.
    output_default = pipe(
        "Phạm Nhật Vượng là ",
        pad_token_id=tokenizer.eos_token_id,
        max_new_tokens=128,
    )
```

## Training Details

**Hardware Type:**
  * GPU: NVIDIA Tesla P100 16GB
  * System RAM: 29GB

**Hours used:** approximately 47.5 days (about 1,140 hours)

### Training Data

* BactrianX 
* OpenOrca_translated 
* WizardLM_70k_translated 
* TigerLabMathInstruct_translated_vi 
* GradeSchoolMathInstruct_translated 
* vilm_lima-vi
* MTEngVietnamese 
* databricks_dolly15k_translated 
* AlpacaCleaned_translated 
* databricks_dolly15k
* OpenOrca
* GradeSchoolMathInstruct 
* AlpacaCleaned
* WebglmQA

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

* Learning rate: 2e-5 with a cosine schedule
* Optimizer: PagedLion8bit
* QLoRA: rank 64, 4-bit quantization
* Data:
  - 250k examples (70% Vietnamese, 30% English) for 3.37 epochs
  - 350k examples (60% Vietnamese, 40% English) for 1.4 epochs
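
To make the data figures above concrete, the short sketch below works out the approximate per-language example counts and the number of examples seen in each listed configuration. It assumes the percentages apply to example counts (not tokens) and treats "250k" and "350k" as exact; the function and variable names are illustrative only.

```python
# (examples, Vietnamese share in percent, epochs), taken from the list above.
runs = [
    (250_000, 70, 3.37),
    (350_000, 60, 1.4),
]


def summarize(examples: int, vi_percent: int, epochs: float) -> dict:
    """Split a dataset by language share and count examples seen over all epochs."""
    vietnamese = examples * vi_percent // 100
    english = examples - vietnamese
    return {
        "vietnamese": vietnamese,
        "english": english,
        "examples_seen": round(examples * epochs),
    }


summaries = [summarize(*run) for run in runs]
for summary in summaries:
    print(summary)
```

Under these assumptions the first configuration holds about 175,000 Vietnamese and 75,000 English examples (roughly 842,500 examples seen), and the second about 210,000 Vietnamese and 140,000 English examples (roughly 490,000 examples seen).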

### Training loss

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/rV8Go_YFZv7QcR_FhFxp-.png)

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/z1ZTm7Tab4tQbVPgQW1hU.png)

The model currently ranks in the top 5 on the VMLU benchmark.

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

## Model Card Authors


## Model Card Contact

[More Information Needed]