KillerShoaib committed
Commit 8547d2a
1 Parent(s): 434f92d

Update README.md

Files changed (1): README.md (+107, -4)
README.md CHANGED
The commit replaces the auto-generated "Uploaded model" card with a full model card. Lines removed or changed from the old README:

```diff
@@ -1,6 +1,6 @@
 language:
-- en
+- bn
@@ -9,14 +9,117 @@
 base_model: unsloth/llama-3-8b-bnb-4bit
+inference: false
 ---
-# Uploaded model
-This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
```

The updated README.md in full:
---
language:
- bn
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
base_model: unsloth/llama-3-8b-bnb-4bit
inference: false
---

# Llama-3 Bangla 4-bit

<div align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/65ca6f0098a46a56261ac3ac/O1ATwhQt_9j59CSIylrVS.png" width="300"/>
</div>
- **Developed by:** KillerShoaib
- **License:** apache-2.0
- **Finetuned from model:** unsloth/llama-3-8b-bnb-4bit
- **Dataset used for fine-tuning:** iamshnoo/alpaca-cleaned-bengali

# 4-bit Quantization

**This is a 4-bit quantization of the Llama-3 8B model.**
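As a rough sanity check (my numbers, not the card's): an 8B model stored in 4-bit occupies roughly 5-6 GB of GPU memory, versus ~16 GB in float16. Once the model is loaded as shown under "Run The Model" below, the footprint can be inspected directly:

```python
# model is the checkpoint loaded in the "Run The Model" section below
print(f"{model.get_memory_footprint() / 1e9:.1f} GB")  # roughly 5-6 GB in 4-bit
```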
# Model Details

The **Llama 3 8 billion**-parameter model was fine-tuned with the **unsloth** package on a **cleaned Bangla Alpaca** dataset, then quantized to **4-bit**. Fine-tuning ran for **2 epochs** on a single T4 GPU.
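The recipe above can be reproduced roughly as follows. This is a minimal sketch, not the author's recorded configuration: the LoRA rank, batch size, learning rate, and the dataset's column names are assumptions (unsloth's usual notebook defaults), and the prompt template is the one shown in full under "Run The Model" below.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# The exact alpaca-style template is shown in full under "Run The Model" below
alpaca_prompt = "..."  # placeholder

# Load the 4-bit base model this checkpoint started from
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)

# Attach LoRA adapters (rank/targets assumed; unsloth's usual defaults)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = True,
)

# Render each example into the prompt template
# (assumes instruction/input/output columns, as in the original Alpaca data)
dataset = load_dataset("iamshnoo/alpaca-cleaned-bengali", split = "train")
def to_text(batch):
    texts = [alpaca_prompt.format(ins, inp, out) + tokenizer.eos_token
             for ins, inp, out in zip(batch["instruction"], batch["input"], batch["output"])]
    return {"text": texts}
dataset = dataset.map(to_text, batched = True)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        num_train_epochs = 2,   # the 2 epochs stated above
        learning_rate = 2e-4,
        fp16 = True,            # T4 has no bf16 support
        optim = "adamw_8bit",
        output_dir = "outputs",
    ),
)
trainer.train()
```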
# Pros & Cons of the Model

## Pros

- **The model can comprehend Bangla, including its semantic nuances**
- **Given a context, the model can answer questions grounded in that context** (see the prompt sketch after this list)

## Cons

- **The model cannot handle creative or complex tasks, e.g. composing a poem or solving a math problem in Bangla**
- **Because the fine-tuning dataset was small, the model lacks much general knowledge in Bangla**
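For the context-grounded QA pattern from the "Pros" list, the question goes in the instruction slot and the supporting passage in the input slot. A small illustration (the Bangla text is made up for the example; `alpaca_prompt` is the template defined in the next section):

```python
# Context-grounded QA: question in the Instruction slot, passage in the Input slot
# (alpaca_prompt is the template defined under "Run The Model" below)
prompt = alpaca_prompt.format(
    "প্রসঙ্গ অনুযায়ী প্রশ্নের উত্তর দিন: ঢাকার জনসংখ্যা কত?",  # instruction: the question
    "ঢাকা বাংলাদেশের রাজধানী। শহরটির জনসংখ্যা প্রায় এক কোটি।",  # input: the context passage
    "",  # response - left blank for generation
)
```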
# Run The Model

## FastLanguageModel from unsloth for 2x faster inference

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "KillerShoaib/llama-3-8b-bangla-4bit",
    max_seq_length = 2048,
    dtype = None,        # auto-detect: float16 on T4/V100, bfloat16 on Ampere+
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable unsloth's 2x faster inference mode

# alpaca_prompt: the same template the model was fine-tuned with
alpaca_prompt = """Below is an instruction in bangla that describes a task, paired with an input also in bangla that provides further context. Write a response in bangla that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Build the prompt with an instruction and an (optional) input
inputs = tokenizer(
[
    alpaca_prompt.format(
        "সুস্থ থাকার তিনটি উপায় বলুন", # instruction
        "", # input
        "", # output - leave this blank for generation!
    )
], return_tensors = "pt").to("cuda")

# Generate the output and decode it
outputs = model.generate(**inputs, max_new_tokens = 2048, use_cache = True)
tokenizer.batch_decode(outputs)
```
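To print tokens as they are generated instead of waiting for the full completion (useful in a notebook), transformers' `TextStreamer` plugs into the same `generate` call; a small sketch:

```python
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are produced
streamer = TextStreamer(tokenizer, skip_prompt = True)
_ = model.generate(**inputs, streamer = streamer, max_new_tokens = 2048, use_cache = True)
```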
## AutoModelForCausalLM from Hugging Face

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "KillerShoaib/llama-3-8b-bangla-4bit"  # HF Hub repo id, or a local folder
tokenizer_name = model_name

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
# Load model - the checkpoint is stored in 4-bit, so bitsandbytes must be
# installed; device_map="auto" places the weights on the available GPU
model = AutoModelForCausalLM.from_pretrained(model_name, device_map = "auto")

# Same alpaca prompt template the model was fine-tuned with
alpaca_prompt = """Below is an instruction in bangla that describes a task, paired with an input also in bangla that provides further context. Write a response in bangla that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

inputs = tokenizer(
[
    alpaca_prompt.format(
        "সুস্থ থাকার তিনটি উপায় বলুন", # instruction
        "", # input
        "", # output - leave this blank for generation!
    )
], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 1024, use_cache = True)
tokenizer.batch_decode(outputs)
```
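`tokenizer.batch_decode(outputs)` returns the whole sequence, echoed prompt included. To keep only the model's answer, split on the response delimiter; a small convenience snippet (not part of the original card):

```python
# Drop the echoed prompt: keep only the text after "### Response:"
decoded = tokenizer.batch_decode(outputs, skip_special_tokens = True)[0]
response = decoded.split("### Response:")[-1].strip()
print(response)
```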