Update README.md

README.md CHANGED

@@ -2,7 +2,7 @@
 datasets:
 - uonlp/CulturaX
 - l3cube-pune/MarathiNLP
-
+- ai4bharat/samanantar
 
 language:
 - mr

@@ -23,9 +23,6 @@ license: llama2
 
 Built by - [smallstep.ai](https://smallstep.ai/)
 
-## What is Misal?
-
-Misal 7B, a pretrained and instruction tuned large language model based on Meta’s Llama 7B architecture exclusively for Marathi.
 
 ## Making of Misal?
 

@@ -65,50 +62,6 @@ peft:
 
 The model inherits the license from meta-llama/Llama-2-7b.
 
-## Usage
-
-### Installation
-
-```bash
-pip install transformers accelerate
-```
-
-### Prompt
-
-```python
-आपण एक मदतगार, आदरणीय आणि प्रामाणिक सहाय्यक आहात.नेहमी शक्य तितकी उपयुक्त उत्तर द्या. तुमची उत्तरे हानिकारक, अनैतिक, वर्णद्वेषी, लैंगिकतावादी, हानिकारक, धोकादायक किंवा बेकायदेशीर नसावीत. कृपया खात्री करा की तुमची उत्तरे सामाजिक दृष्टिकोनाने निष्पक्ष आणि सकारात्मक स्वरूपाची आहेत. जर एखाद्या प्रश्नाला काही अर्थ नसेल किंवा वस्तुस्थितीशी सुसंगती नसेल, तर उत्तर देण्याऐवजी काहीतरी बरोबर का नाही हे स्पष्ट करा. तुम्हाला एखाद्या प्रश्नाचे उत्तर माहित नसल्यास, कृपया चुकीची माहिती देऊ नये.
-
-### Instruction:
-
-<instruction>
-
-### Input:
-
-<input data>
-
-### Response:
-```
-
-### PyTorch
-
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-device = "cuda"
-model = AutoModelForCausalLM.from_pretrained("smallstepai/Misal-7B-instruct-v0.1", torch_dtype=torch.bfloat16, device_map='auto')
-tokenizer = AutoTokenizer.from_pretrained("smallstepai/Misal-7B-instruct-v0.1")
-
-def ask_misal(model, tokenizer, instruction, inputs='', system_prompt='', max_new_tokens=200, device='cuda'):
-
-    ip = dict(system_prompt=system_prompt, instruction=instruction, inputs=inputs)
-    model_inputs = tokenizer.apply_chat_template(ip, return_tensors='pt')
-    outputs = model.generate(model_inputs.to(device), max_new_tokens=max_new_tokens)
-    response = tokenizer.decode(outputs[0]).split('### Response:')[1].strip()
-    return response
-
-instruction="सादरीकरण कसे करावे?"
-resp = ask_misal(model, tokenizer, instruction=instruction, max_new_tokens=1024)
-print(resp)
-```
 
 ### Team
 
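For reference, the prompt layout that this commit removes from the Usage section (`### Instruction:` / `### Input:` / `### Response:`, preceded by an optional system prompt) can be reproduced without the tokenizer's custom chat template. This is a minimal sketch under that assumption; `make_prompt` and `extract_response` are hypothetical helper names, not part of the original README:

```python
def make_prompt(instruction: str, inputs: str = "", system_prompt: str = "") -> str:
    """Assemble a prompt in the removed README's layout:
    optional system prompt, then ### Instruction / ### Input / ### Response."""
    parts = []
    if system_prompt:
        parts.append(system_prompt)
    parts.append(f"### Instruction:\n{instruction}")
    if inputs:
        parts.append(f"### Input:\n{inputs}")
    parts.append("### Response:\n")  # the model completes text after this marker
    return "\n\n".join(parts)

def extract_response(decoded: str) -> str:
    """Mirror the removed helper's post-processing: keep only the text
    the model generated after the '### Response:' marker."""
    return decoded.split("### Response:")[1].strip()

# Example: build the prompt used in the removed README snippet.
prompt = make_prompt("सादरीकरण कसे करावे?")
print(prompt)
```

The resulting string would be tokenized and passed to `model.generate`, and `extract_response` applied to the decoded output, as the removed `ask_misal` helper did.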