mwz committed on
Commit
a7cb625
1 Parent(s): 9c483b8

Update README.md

Files changed (1)
  1. README.md +0 -97
README.md CHANGED
@@ -1,97 +0,0 @@
- ---
- license: apache-2.0
- base_model: bn22/Mistral-7B-Instruct-v0.1-sharded
- tags:
- - generated_from_trainer
- model-index:
- - name: outputs
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # UrduNewsMistral7B
-
- This model is a fine-tuned version of [bn22/Mistral-7B-Instruct-v0.1-sharded](https://huggingface.co/bn22/Mistral-7B-Instruct-v0.1-sharded) on the [Urdu-Instruct-News-Article-Generation](https://huggingface.co/datasets/AhmadMustafa/Urdu-Instruct-News-Article-Generation) dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.1925
-
- ## Usage
- Here is an example of how to load the 4-bit quantized base model and attach the adapter:
- ```python
- import torch
- from peft import PeftModel
- from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
-
- model_name = "bn22/Mistral-7B-Instruct-v0.1-sharded"
- adapters_name = "mwz/UrduNewsMistral7B"
-
- # Load the base model with 4-bit NF4 quantization (needs bitsandbytes and a CUDA GPU).
- # The 4-bit setting is given only through quantization_config: newer transformers
- # releases raise an error if load_in_4bit=True is also passed as a separate kwarg.
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
-     # Cap the memory used on each visible GPU.
-     max_memory={i: "24000MB" for i in range(torch.cuda.device_count())},
-     quantization_config=BitsAndBytesConfig(
-         load_in_4bit=True,
-         bnb_4bit_compute_dtype=torch.bfloat16,
-         bnb_4bit_use_double_quant=True,
-         bnb_4bit_quant_type="nf4",
-     ),
- )
- # Attach the fine-tuned PEFT adapter weights to the quantized base model.
- model = PeftModel.from_pretrained(model, adapters_name)
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- ```
- Inference then works as with any other Hugging Face model:
- ```python
- # Headline: "Petrol price raised by 2 rupees 50 paisa"
- prompt = "پیٹرول کی قیمت میں 2روپے 50 پیسے اضافہ"
- formatted_prompt = (
-     # Instruction: "Write an article about the given news item"
-     "اس دی گی ایک خبر سے متعلق ایک مضمون لکھیں"
-     f"### Human: {prompt} ### Assistant:"
- )
- inputs = tokenizer(formatted_prompt, return_tensors="pt").to("cuda:0")
- # Pass input_ids and attention_mask together; generate at most 20 new tokens.
- outputs = model.generate(**inputs, max_new_tokens=20)
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))
- ```
- The output should look similar to the following (generation stops after 20 new tokens, so the article is cut off):
- ```
- اس دی گی ایک خبر سے متعلق ایک مضمون لکھیں۔ خبر:### Human: پیٹرول کی قیمت میں 2روپے 50 پیسے اضافہ ### Assistant: بہترین پیٹرول کے ان
- ```
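Because the decoded text echoes the prompt, a small post-processing step can isolate the generated article. The helper below is only a sketch: the name `extract_response` is hypothetical (it is not part of the model card or of any library), and it assumes the `### Assistant:` marker from the prompt format above survives decoding.

```python
ASSISTANT_MARKER = "### Assistant:"  # marker taken from the prompt format above


def extract_response(decoded: str, marker: str = ASSISTANT_MARKER) -> str:
    """Return the text after the last marker, or the whole string if the marker is absent."""
    _, found, tail = decoded.rpartition(marker)
    return tail.strip() if found else decoded.strip()


# Illustrative input only; in practice pass the string returned by tokenizer.decode(...).
decoded = "### Human: headline ### Assistant: generated article text "
print(extract_response(decoded))
```

Splitting on the last occurrence of the marker keeps the helper safe if the headline itself happens to contain the same marker text.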
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 16
- - eval_batch_size: 8
- - seed: 42
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 64
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.05
- - num_epochs: 50
- - mixed_precision_training: Native AMP
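The batch-size figures in this list are related by simple arithmetic, which the sketch below spells out. Two values are assumptions rather than facts from the model card: `num_devices = 1` (the card does not say how many GPUs were used) and `MICRO_BATCHES_PER_EPOCH = 7`, which is inferred from the epoch/step pairs logged in the training results and is not stated anywhere in the card.

```python
train_batch_size = 16            # per-device batch size from the list above
gradient_accumulation_steps = 4  # from the list above
num_devices = 1                  # assumption: single-GPU training

# Effective batch size seen by each optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)

# Assumption: 7 micro-batches per epoch, inferred from the logged epoch/step pairs.
MICRO_BATCHES_PER_EPOCH = 7


def epoch_at_step(optimizer_step: int) -> float:
    """Map an optimizer step to fractional epochs under the assumptions above."""
    micro_batches_seen = optimizer_step * gradient_accumulation_steps
    return micro_batches_seen / MICRO_BATCHES_PER_EPOCH


for step in (10, 20, 30, 40, 50):
    print(step, round(epoch_at_step(step), 2))
```

Under these assumptions the computed batch size matches the listed `total_train_batch_size`, and the step-to-epoch mapping reproduces the epoch column of the training results.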
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:----:|:---------------:|
- | 1.1387 | 5.71 | 10 | 1.2070 |
- | 0.9337 | 11.43 | 20 | 1.1634 |
- | 0.8676 | 17.14 | 30 | 1.1697 |
- | 0.8065 | 22.86 | 40 | 1.1868 |
- | 0.7759 | 28.57 | 50 | 1.1925 |
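In this table the training loss falls at every evaluation while the validation loss is lowest at an intermediate step. The snippet below is a minimal sketch of picking the best row by validation loss; the rows are copied from the table, and the variable names are illustrative only.

```python
# (training_loss, epoch, step, validation_loss), copied from the table above
rows = [
    (1.1387, 5.71, 10, 1.2070),
    (0.9337, 11.43, 20, 1.1634),
    (0.8676, 17.14, 30, 1.1697),
    (0.8065, 22.86, 40, 1.1868),
    (0.7759, 28.57, 50, 1.1925),
]

best = min(rows, key=lambda r: r[3])  # row with the lowest validation loss
final = rows[-1]                      # last logged evaluation

print(f"best step:  {best[2]} (validation loss {best[3]:.4f})")
print(f"final step: {final[2]} (validation loss {final[3]:.4f})")
```

The final row matches the evaluation loss reported at the top of the card. If intermediate checkpoints were saved, the one with the lowest validation loss may be worth comparing against the final weights.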
-
- ### Framework versions
-
- - Transformers 4.36.0.dev0
- - PyTorch 2.1.0+cu121
- - Datasets 2.15.0
- - Tokenizers 0.15.0