---
license: apache-2.0
language:
- vi
- en
---

<p align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/QFhLKQlWeyO9XumtyghVo.jpeg" alt="Image" style="width: 400px; height: auto; border-radius: 10px;" />
</p>


## Model Details

- **Developed by:** Tuan Pham (FPTU HCM Student)
- **Model type:** Llama2-7B Decoder-only
- **Finetuned from model:**
  * meta-llama/Llama-2-7b
  * bkai-foundation-models/vietnamese-llama2-7b-120GB
  * yeen214/llama2_7b_merge_orcafamily
- **Bilingual support:** English and Vietnamese

### Model Description

<!-- Provide a longer summary of what this model is. -->

This model is a proof of effort: it shows that a single person can fine-tune their own model to reach state-of-the-art (SOTA) results.

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** 
  * Training: https://github.com/vTuanpham/Vietnamese_QA_System
  * Data: https://github.com/vTuanpham/Large_dataset_translator
- **Paper:** ...
- **Demo:** ...

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Prompt template

```
[SYSTEM_PROMPT]

 ####### Instruction:
[INPUT]

 %%%%%%% Response:
[RESPONSE]
```
We recommend keeping the system prompt in English.
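
The template above can be filled in with a small helper. This is only a sketch: the function name `build_prompt` and the example system prompt are hypothetical, and ending the string right after the `Response:` line (so the model writes the answer) is an assumption about how the template is meant to be used.

```python
def build_prompt(system_prompt: str, instruction: str) -> str:
    """Fill the prompt template, leaving the response slot open for the model.

    Hypothetical helper: the layout (including the leading space before the
    '#######' and '%%%%%%%' markers) is copied from the template above.
    """
    return (
        f"{system_prompt}\n\n"
        f" ####### Instruction:\n{instruction}\n\n"
        f" %%%%%%% Response:\n"
    )


# Example: English system prompt, Vietnamese instruction.
prompt = build_prompt(
    "You are a helpful assistant.",
    "Việt Nam có bao nhiêu tỉnh thành?",
)
print(prompt)
```

The resulting string can be passed to a text-generation pipeline in place of a raw prompt.
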
## How to Get Started with the Model

Use the code below to get started with the model.
```python
import torch
from torch.cuda.amp import autocast
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_name = "1TuanPham/T-Llama"

# Load the weights in bfloat16 to reduce memory use compared with float32.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    use_cache=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer, skip_special_tokens=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, streamer=streamer)

with autocast():
    # Reuse the EOS token for padding; the id 50256 belongs to the GPT-2
    # vocabulary and is out of range for a Llama 2 tokenizer.
    output_default = pipe(
        "Phạm Nhật Vượng là ",
        pad_token_id=tokenizer.eos_token_id,
        max_new_tokens=128,
    )

```
## Training Details

**Hardware Type:**
  * GPU: VGA NVIDIA Tesla P100 16GB
  * SYSTEM RAM: 29GB
  
**Hours used:** ~1,140 hours (about 47.5 days)

### Training Data

* BactrianX 
* OpenOrca_translated 
* WizardLM_70k_translated 
* TigerLabMathInstruct_translated_vi 
* GradeSchoolMathInstruct_translated 
* vilm_lima-vi
* MTEngVietnamese 
* databricks_dolly15k_translated 
* AlpacaCleaned_translated 
* databricks_dolly15k
* OpenOrca
* GradeSchoolMathInstruct 
* AlpacaCleaned
* WebglmQA

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

* Learning rate: 2e-5 with a cosine schedule
* Optimizer: PagedLion8bit
* QLoRA: rank 64, 4-bit quantization
* Data mix:
  - 250k examples (70% Vietnamese, 30% English) for 3.37 epochs
  - 350k examples (60% Vietnamese, 40% English) for 1.4 epochs
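
As a rough sanity check on the figures above, the following sketch estimates how many example passes the two runs add up to and what share of them is Vietnamese. It assumes that "250k" and "350k" mean exactly 250,000 and 350,000 examples, that the two runs happened one after the other, and that the language split is uniform across epochs; none of this is stated in the card, and the variable names are illustrative.

```python
# Figures taken from the list above; exact counts are an assumption.
stages = [
    {"examples": 250_000, "epochs": 3.37, "vi_share": 0.70},
    {"examples": 350_000, "epochs": 1.40, "vi_share": 0.60},
]


def example_passes(stage: dict) -> float:
    """Number of training examples seen in a stage, counting repeats."""
    return stage["examples"] * stage["epochs"]


total_passes = sum(example_passes(s) for s in stages)
vi_passes = sum(example_passes(s) * s["vi_share"] for s in stages)

print(f"Total example passes: {total_passes:,.0f}")
print(f"Vietnamese share:     {vi_passes / total_passes:.1%}")
```

Under these assumptions the runs cover roughly 1.33 million example passes, about two thirds of them Vietnamese.
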

### Training loss

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/rV8Go_YFZv7QcR_FhFxp-.png)

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63905e87df447b438817b2cd/z1ZTm7Tab4tQbVPgQW1hU.png)

Our model currently ranks in the top 5 on the VMLU benchmark.

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

## Model Card Authors


## Model Card Contact

[More Information Needed]