malhajar committed 8e73109 (parent 2579608): Create README.md
---
license: apache-2.0
datasets:
- malhajar/alpaca-gpt4-ar
language:
- ar
- en
---

# Model Card for malhajar/Mistral-7B-v0.1-arabic

<!-- Provide a quick summary of what the model is/does. -->
malhajar/Mistral-7B-v0.1-arabic is a finetuned version of [`Mistral-7B-v0.1`](https://huggingface.co/mistralai/Mistral-7B-v0.1), trained with SFT using the freeze method.
The model can answer questions in a chat format, as it was finetuned specifically on the instruction dataset [`alpaca-gpt4-ar`](https://huggingface.co/datasets/malhajar/alpaca-gpt4-ar).

### Model Description

- **Developed by:** [`Mohamad Alhajar`](https://www.linkedin.com/in/muhammet-alhajar/)
- **Language(s) (NLP):** Arabic
- **Finetuned from model:** [`mistralai/Mistral-7B-v0.1`](https://huggingface.co/mistralai/Mistral-7B-v0.1)

### Prompt Template
```
### Instruction:

<prompt> (without the <>)

### Response:
```
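
The template can be filled in programmatically. A minimal sketch (the `build_prompt` helper is illustrative, not part of the model's API):

```python
def build_prompt(instruction: str) -> str:
    # Wrap a user instruction in the model's expected prompt template.
    return f"### Instruction:\n\n{instruction}\n\n### Response:\n"

prompt = build_prompt("ما هي الحياة؟")  # "What is life?"
print(prompt)
```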

## How to Get Started with the Model

Use the code sample below to interact with the model.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "malhajar/Mistral-7B-v0.1-arabic"
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map="auto",
                                             torch_dtype=torch.float16,
                                             revision="main")

tokenizer = AutoTokenizer.from_pretrained(model_id)

question = "ما هي الحياة؟"  # "What is life?"
# For generating a response
prompt = f'''
### Instruction: {question} ### Response:
'''
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
output = model.generate(inputs=input_ids, max_new_tokens=512,
                        pad_token_id=tokenizer.eos_token_id,
                        top_k=50, do_sample=True,
                        repetition_penalty=1.3, top_p=0.95)
response = tokenizer.decode(output[0])

print(response)
```
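
Note that `tokenizer.decode(output[0])` returns the prompt echoed back along with the generated continuation. A small helper to keep only the text after the response marker (the `extract_response` name and the `</s>` cleanup are illustrative assumptions, not part of the model card):

```python
def extract_response(decoded: str, marker: str = "### Response:") -> str:
    # Keep only the text generated after the response marker,
    # dropping the echoed prompt and any trailing end-of-sequence token.
    _, _, tail = decoded.partition(marker)
    return tail.replace("</s>", "").strip()

sample = "### Instruction: ما هي الحياة؟ ### Response: الحياة رحلة.</s>"
print(extract_response(sample))  # → "الحياة رحلة."
```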