---

---
<style>
  .custom-image {
    width: 45vw;
    height: 45vh;
    margin: 0 auto;
    display: flex;
    align-items: center;
    justify-content: center;
  }
</style>

# lumaticai/BongLlama-1.1B-Chat-alpha-v0

Introducing BongLlama by LumaticAI, a fine-tuned version of TinyLlama 1.1B Chat trained on a Bengali dataset.

<img class="custom-image" src="bong_llama.png" alt="BongLlama">

# Model Details

## Model Description

BongLlama is part of our company's initiative to develop Indic and regional large language models. At LumaticAI we continuously help our clients build custom AI solutions for their organizations, and as part of that effort we have begun releasing open-source models for specific regions and languages.

BongLlama is an LLM built for West Bengal on a Bengali dataset. It is a 1.1B-parameter model. We fine-tuned TinyLlama/TinyLlama-1.1B-Chat-v1.0 on a Bengali dataset of 10k examples, lumatic-ai/BongChat-10k-v0, to obtain our BongLlama 1.1B Chat Alpha v0 model (a sketch of loading the dataset follows the list below).

We are continuously training and improving this model, and we plan to release further versions built on LLMs of various sizes and on different datasets.

- **Developed by:** LumaticAI
- **Shared by [Optional]:** LumaticAI
- **Model type:** Language model
- **Language(s) (NLP):** en, bn
- **License:** apache-2.0
- **Parent Model:** TinyLlama/TinyLlama-1.1B-Chat-v1.0
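
For convenience, the dataset can be pulled from the Hugging Face Hub with the `datasets` library. This is a minimal sketch; the `train` split name is an assumption about the dataset layout.

```python
# A minimal sketch, assuming the dataset is public on the Hugging Face Hub
# and exposes a "train" split (the split name is an assumption).
from datasets import load_dataset

dataset = load_dataset("lumatic-ai/BongChat-10k-v0", split="train")
print(dataset)     # inspect the features and number of rows
print(dataset[0])  # look at one question/response example
```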

# Uses

## Direct Use

- base model for further fine-tuning
- get an overview of how an Indic LLM works on a specific language
- for fun

## Downstream Use [Optional]

- can be deployed with an API (see the sketch after this list)
- can be used to create a web app or mobile app as a demo
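
As an illustration of API deployment, here is a minimal sketch that wraps the model in a FastAPI endpoint. The route name, request schema, and server setup are assumptions for demonstration, not part of this release.

```python
# A minimal serving sketch, assuming fastapi, uvicorn, and accelerate are
# installed. The /chat route and request schema are hypothetical.
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
pipe = pipeline(
    "text-generation",
    model="lumatic-ai/BongLlama-1.1B-Chat-alpha-v0",
    torch_dtype=torch.float16,
    device_map="auto",
)

class ChatRequest(BaseModel):
    question: str

@app.post("/chat")
def chat(req: ChatRequest):
    # Same chat template as in the "How to Get Started" section below
    prompt = f"<|im_start|>user\n{req.question}<|im_end|>\n<|im_start|>assistant:"
    out = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.1, top_p=0.9)
    return {"response": out[0]["generated_text"]}

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
```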

## Out-of-Scope Use

- cannot be used for production purposes
- cannot be used to generate text for research or academic purposes

# Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.

# Training Details

## Training Data

We used our dataset of 10k examples consisting of questions and responses. The dataset name is lumatic-ai/BongChat-v0-10k.

**Example Data**

- <|im_start|>user এটি কি ইতিবাচক পর্যালোচনা? এই ফ্র্যাঞ্চাইজি কখনও হার্ট স্টপ স্টান্টস, ফাইট সিকোয়েন্স এবং তারকা শক্তি বন্ধ করে হতাশ করে না।এবার মূল চারটি ছাড়াও নতুন মুখগুলি হলেন আবুয়েলিতা চরিত্রে রিতা মোরেনো এবং দান্তে জেসন মোমোয়া যিনি শোটি চুরি করেছেন।<|im_end|> <|im_start|>assistant এটি একটি খুব ইতিবাচক পর্যালোচনা, এটি ইঙ্গিত করে যে ফ্র্যাঞ্চাইজি শ্রোতাদের জন্য আরও পছন্দসই চরিত্রগুলি যুক্ত করার পাশাপাশি অতীতের প্রত্যাশার ভিত্তিতে বিতরণ করেছে।<|im_end|>
  (English translation: user: "Is this a positive review? This franchise never disappoints with heart-stopping stunts, fight sequences and star power. This time, besides the original four, the new faces are Rita Moreno as Abuelita and Jason Momoa as Dante, who steals the show." assistant: "This is a very positive review; it indicates that the franchise has delivered on past expectations while adding more likeable characters for audiences.")

## Training Procedure

### Preprocessing

- Dataset Format: `<|im_start|>user <question><|im_end|> <|im_start|>assistant <response><|im_end|>` (a sketch of applying this template is shown below)
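
For illustration, here is a minimal sketch of applying this template to a raw question/response pair. The field names `question` and `response` are assumptions about the raw data schema.

```python
# A minimal preprocessing sketch. The argument names are assumptions
# about the raw dataset schema; the template itself is from this card.
def to_chat_format(question: str, response: str) -> str:
    return (
        f"<|im_start|>user {question}<|im_end|> "
        f"<|im_start|>assistant {response}<|im_end|>"
    )

example = to_chat_format("এটি কি ইতিবাচক পর্যালোচনা? ...", "এটি একটি খুব ইতিবাচক পর্যালোচনা ...")
print(example)
```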

### Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to `transformers.TrainingArguments` follows the list):
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 3
- mixed_precision_training: Native AMP
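
The original training script is not included in this card, but as a rough sketch these settings correspond to a `transformers.TrainingArguments` configuration like the following (the output directory is a placeholder):

```python
# A minimal sketch, not the original training script. "./bongllama-out"
# is a placeholder output directory.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./bongllama-out",
    learning_rate=2e-4,
    per_device_train_batch_size=4,  # with 2 accumulation steps -> total batch size 8
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3,
    seed=42,
    fp16=True,                      # native AMP mixed precision
    # Adam betas (0.9, 0.999) and epsilon 1e-8 are the defaults,
    # matching the values listed above.
)
```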

### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0

# Evaluation

### Metrics

- train/loss
- steps

## Results

The table below shows the training log, one row per 100 optimizer steps (`_runtime` in seconds, `_timestamp` as Unix time):

| `_step` | `_runtime` | `_timestamp` | `train/epoch` | `train/global_step` | `train/loss` | `train/learning_rate` |
|---|---|---|---|---|---|---|
|0|205.76071906089783|1705483341.4811552|0.08|100|1.2865|0.0001869158878504673|
|1|406.9242510795593|1705483542.6446872|0.17|200|1.0698|0.00019964245392895794|
|2|607.5763952732086|1705483743.2968314|0.25|300|1.0457|0.00019846317589644678|
|3|808.9941129684448|1705483944.714549|0.34|400|1.0131|0.00019646988832610704|
|4|1012.7936038970947|1705484148.51404|0.42|500|1.0|0.00019367907001906532|
|5|1217.8231673240662|1705484353.5436034|0.51|600|0.9913|0.0001901137930801933|
|6|1422.651272058487|1705484558.3717082|0.59|700|0.9904|0.00018580353217762766|
|7|1624.9901471138|1705484760.7105832|0.67|800|0.9705|0.0001807839208713596|
|8|1827.1909170150757|1705484962.911353|0.76|900|0.9661|0.00017509645702535999|
|9|2033.6470217704773|1705485169.3674579|0.84|1000|0.9588|0.00016878815973864268|
|10|2241.5517098903656|1705485377.272146|0.93|1100|0.9469|0.00016191118063146672|
|11|2446.751221895218|1705485582.471658|1.01|1200|0.9453|0.0001545223727002313|
|12|2648.367230653763|1705485784.0876667|1.09|1300|0.9329|0.0001466828203054036|
|13|2849.9791855812073|1705485985.6996217|1.18|1400|0.9299|0.0001384573341781387|
|14|3050.282051086426|1705486186.0024872|1.26|1500|0.9181|0.00012991391562044527|
|15|3252.6823406219482|1705486388.4027767|1.35|1600|0.917|0.00012112319432843371|
|16|3456.3907039165497|1705486592.11114|1.43|1700|0.919|0.00011215784448624378|
|17|3658.387463569641|1705486794.1078997|1.52|1800|0.9156|0.00010309198395788984|
|18|3860.850716114044|1705486996.5711522|1.6|1900|0.9074|9.400056154399221e-05|
|19|4063.906144142151|1705487199.6265802|1.68|2000|0.9072|8.49587373690336e-05|
|20|4266.29203081131|1705487402.012467|1.77|2100|0.9061|7.604126152157019e-05|
|21|4468.759161949158|1705487604.479598|1.85|2200|0.9104|6.732185608427e-05|
|22|4671.109050750732|1705487806.8294868|1.94|2300|0.9016|5.8872605662626776e-05|
|23|4875.181975841522|1705488010.902412|2.02|2400|0.8957|5.076336145093832e-05|
|24|5077.5954213142395|1705488213.3158574|2.11|2500|0.8948|4.3061163762223156e-05|
|25|5280.958572149277|1705488416.6790082|2.19|2600|0.8833|3.582968779610564e-05|
|26|5483.901570320129|1705488619.6220064|2.27|2700|0.9019|2.912871722658781e-05|
|27|5684.498034954071|1705488820.218471|2.36|2800|0.8921|2.30136499616351e-05|
|28|5885.339627027512|1705489021.0600631|2.44|2900|0.8897|1.753504016053409e-05|
|29|6089.49475812912|1705489225.2151942|2.53|3000|0.8765|1.2738180295232205e-05|
|30|6291.281028032303|1705489427.0014641|2.61|3100|0.889|8.662726710819169e-06|
|31|6494.627055644989|1705489630.3474917|2.69|3200|0.8846|5.342371780697386e-06|
|32|6695.168158054352|1705489830.8885942|2.78|3300|0.8908|2.804565366782108e-06|
|33|6898.186992406845|1705490033.9074285|2.86|3400|0.885|1.0702878874610523e-06|
|34|7099.970013856888|1705490235.69045|2.95|3500|0.8871|1.5387686939386526e-07|
|35|7221.330135822296|1705490357.050572|3.0|3561|||

Final training summary (at global step 3561):

- train/total_flos: 8.3571998449877e+16
- train/train_loss: 0.9397975607756582
- train/train_steps_per_second: 0.491
- train/train_samples_per_second: 3.926
- train/train_runtime: 7259.0631 s

# Model Examination

We will be further fine-tuning this model on a larger dataset to see how it performs.

# Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). A rough sanity check of the reported figure follows the list.

- **Hardware Type:** 1 x Tesla T4
- **Hours used:** 2.21
- **Cloud Provider:** Google Colab
- **Compute Region:** India
- **Carbon Emitted:** 0.14 kg CO2 eq.
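
As a back-of-the-envelope check, assuming the Tesla T4's roughly 70 W board power and an approximate regional grid intensity of 0.9 kg CO2/kWh (both assumed constants, not outputs of the calculator):

```python
# A rough sanity check of the reported figure, under assumed constants:
# ~70 W board power for a Tesla T4 and ~0.9 kg CO2/kWh grid intensity.
gpu_power_kw = 0.070    # Tesla T4 board power in kW (assumption)
hours = 2.21            # training time from this card
carbon_intensity = 0.9  # kg CO2 per kWh (regional approximation)

energy_kwh = gpu_power_kw * hours
emissions_kg = energy_kwh * carbon_intensity
print(f"{emissions_kg:.2f} kg CO2 eq.")  # ~0.14, consistent with the reported value
```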

# Technical Specifications

## Model Architecture and Objective

Fine-tuned from the TinyLlama 1.1B Chat model. A quick parameter-count check is sketched below.
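
A minimal sketch for verifying the parameter count after download:

```python
# A minimal sketch: load the model and count parameters (~1.1B expected).
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("lumatic-ai/BongLlama-1.1B-Chat-alpha-v0")
print(f"{model.num_parameters() / 1e9:.2f}B parameters")
```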

### Hardware

1 x Tesla T4

# Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

    @misc{BongLlama-1.1B-Chat-alpha-v0,
      url={https://huggingface.co/lumatic-ai/BongLlama-1.1B-Chat-alpha-v0},
      title={BongLlama 1.1B Chat Alpha V0},
      author={{LumaticAI} and Rohan Shaw and Vivek Kushal and Jeet Ghosh},
      year={2024},
      month={Jan}
    }

# Model Card Authors

lumatic-ai

# Model Card Contact

Email: contact@lumaticai.com

# How to Get Started with the Model

Use the code below to get started with the model.

<details>
<summary> Click to expand </summary>

### Pipeline

```python
import torch
from time import perf_counter
from transformers import AutoTokenizer, pipeline

# Wrap a question in the chat template used during fine-tuning
def formatted_prompt(question) -> str:
    return f"<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant:"

hub_model_name = "lumatic-ai/BongLlama-1.1B-Chat-alpha-v0"

# Load the tokenizer and an fp16 text-generation pipeline
tokenizer = AutoTokenizer.from_pretrained(hub_model_name)
pipe = pipeline(
    "text-generation",
    model=hub_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Generate a response and time the inference
start_time = perf_counter()

prompt = formatted_prompt('হ্যালো')
sequences = pipe(
    prompt,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=256,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

output_time = perf_counter() - start_time
print(f"Time taken for inference: {round(output_time, 2)} seconds")
```

### Streaming Response (ChatGPT/Bard-style)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Wrap a question in the chat template used during fine-tuning
def formatted_prompt(question) -> str:
    return f"<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant:"

hub_model_name = "lumatic-ai/BongLlama-1.1B-Chat-alpha-v0"

tokenizer = AutoTokenizer.from_pretrained(hub_model_name)
model = AutoModelForCausalLM.from_pretrained(hub_model_name)

# TextStreamer prints tokens to stdout as they are generated
prompt = formatted_prompt('prompt here')
inputs = tokenizer([prompt], return_tensors="pt")
streamer = TextStreamer(tokenizer)
_ = model.generate(
    **inputs,
    eos_token_id=[tokenizer.eos_token_id],
    streamer=streamer,
    max_new_tokens=256,
)
```

### Using Generation Config

```python
import torch
from time import perf_counter
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# Wrap a question in the chat template used during fine-tuning
def formatted_prompt(question) -> str:
    return f"<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant:"

hub_model_name = "lumatic-ai/BongLlama-1.1B-Chat-alpha-v0"

tokenizer = AutoTokenizer.from_pretrained(hub_model_name)
model = AutoModelForCausalLM.from_pretrained(hub_model_name)

prompt = formatted_prompt('হ্যালো')

# Move the model and inputs to the GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
inputs = tokenizer(prompt, return_tensors="pt").to(device)

generation_config = GenerationConfig(
    penalty_alpha=0.6,
    do_sample=True,
    top_k=5,
    temperature=0.5,
    repetition_penalty=1.2,
    max_new_tokens=256,
    pad_token_id=tokenizer.eos_token_id,
)

# Generate a response and time the inference
start_time = perf_counter()
outputs = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
output_time = perf_counter() - start_time
print(f"Time taken for inference: {round(output_time, 2)} seconds")
```

</details>