AhmedSSoliman committed
Commit cdf0b51
1 parent: 11aad8f

Update README.md

Files changed (1): README.md (+94, -2)
README.md CHANGED

Previous README.md:

---
tags:
- autotrain
- text-generation
widget:
- text: "I love AutoTrain because "
---

# Model Trained Using AutoTrain

Updated README.md:

---
tags:
- Code-Generation
- autotrain
- text-generation
- Llama2
- Pytorch
- PEFT
- QLoRA
- code
- coding
pipeline_tag: text-generation
widget:
- text: 'Write a program that adds five numbers'
- text: 'Write Python code for reading multiple images'
- text: 'Write Python code that prints the name Ahmed in reversed order'
---

# LlaMa2-CodeGen

This model is **Llama-2 7B** fine-tuned on the **CodeSearchNet instructions dataset** using **QLoRA** with the [PEFT](https://github.com/huggingface/peft) library.
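
The CodeSearchNet pairs have to be rendered into instruction/response text before fine-tuning. The exact preprocessing is not published in this card, so the snippet below is only a minimal sketch of one plausible formatting step, mirroring the prompt template used in the inference example further down; the field names and the sample record are illustrative assumptions.

```py
# Illustrative only: format one (docstring, code) pair into the
# "### Input / ### Response" template used elsewhere in this card.
# Field names and the sample record are assumptions, not the actual pipeline.

def format_example(docstring: str, code: str) -> str:
    system = ("You are a coding assistant that will help the user "
              "to resolve the following instruction:")
    return f"{system}\n\n### Input: {docstring}\n\n### Response:\n{code}"

sample = {
    "docstring": "Return the reversed form of a name.",
    "code": "def reverse_name(name):\n    return name[::-1]",
}
print(format_example(sample["docstring"], sample["code"]))
```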

# Model Trained on Google Colab Pro Using AutoTrain, PEFT and QLoRA
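
The adapter itself was produced with AutoTrain, so the exact training arguments are not listed in this card. The following is only a rough sketch of what a 4-bit QLoRA setup with PEFT generally looks like; the base checkpoint ID, LoRA rank, target modules, and other values are illustrative assumptions, not the settings actually used.

```py
# Sketch of a QLoRA setup: 4-bit quantized base model + LoRA adapters via PEFT.
# Every hyperparameter here is illustrative, not the actual AutoTrain configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Make the quantized model trainable and wrap it with LoRA adapters
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```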

## Llama-2 description

[Llama-2](https://huggingface.co/meta-llama/Llama-2-7b)

Meta developed and publicly released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. The fine-tuned models, called Llama-2-Chat, are optimized for dialogue use cases; in Meta's evaluations they outperform open-source chat models on most benchmarks and, in human evaluations for helpfulness and safety, are on par with some popular closed-source models such as ChatGPT and PaLM.

### Example

```py
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

peft_model_id = "AhmedSSoliman/Llama2-CodeGen-PEFT-QLora"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the 4-bit quantized base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    trust_remote_code=True,
    return_dict=True,
    load_in_4bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapter weights
model = PeftModel.from_pretrained(model, peft_model_id)


def create_prompt(instruction):
    system = "You are a coding assistant that will help the user to resolve the following instruction:"
    instruction = "\n### Input: " + instruction
    return system + "\n" + instruction + "\n\n" + "### Response:" + "\n"


def generate(
    instruction,
    max_new_tokens=128,
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
    **kwargs,
):
    # Build the prompt and move the encoded inputs to the GPU
    prompt = create_prompt(instruction)
    print(prompt)
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].to("cuda")
    attention_mask = inputs["attention_mask"].to("cuda")

    generation_config = GenerationConfig(
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        num_beams=num_beams,
        **kwargs,
    )
    with torch.no_grad():
        generation_output = model.generate(
            input_ids=input_ids,
            attention_mask=attention_mask,
            generation_config=generation_config,
            return_dict_in_generate=True,
            output_scores=True,
            max_new_tokens=max_new_tokens,
            early_stopping=True,
        )

    # Keep only the text after the "### Response:" marker
    s = generation_output.sequences[0]
    output = tokenizer.decode(s)
    return output.split("### Response:")[1].lstrip("\n")


instruction = """
Write Python code that prints the name Ahmed in reversed order
"""
print(generate(instruction))
```
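
As a quick usage check, the same helper can be run on the other prompts from the widget section. This assumes the model, tokenizer, and `generate` function from the example above are already loaded on a GPU; `max_new_tokens=256` is an arbitrary choice, not a tuned value.

```py
# Reuses generate() from the example above; the prompts come from the widget section.
prompts = [
    "Write a program that adds five numbers",
    "Write Python code for reading multiple images",
]
for p in prompts:
    print(generate(p, max_new_tokens=256))
    print("-" * 40)
```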