---
license: llama2
datasets:
- AlfredPros/smart-contracts-instructions
language:
- en
tags:
- code
- blockchain
- solidity
- smart contract
---
# Code LLaMA 7b Instruct Solidity

A 7-billion-parameter Code LLaMA - Instruct model finetuned to generate Solidity smart contracts, using 4-bit QLoRA finetuning provided by the PEFT library.

# Training Dataset

The model was finetuned on AlfredPros' Smart Contracts Instructions dataset (https://huggingface.co/datasets/AlfredPros/smart-contracts-instructions). The dataset contains 6,003 GPT-generated pairs of human instructions and Solidity source code, and it has been processed for training LLMs.

# Training Parameters

## Bitsandbytes quantization configurations
- Load in 4-bit: true
- 4-bit quantization type: NF4
- 4-bit compute dtype: float16
- 4-bit use double quantization: true

## Supervised finetuning trainer parameters
- Number of train epochs: 1
- FP16: true
- FP16 opt level: O1
- BF16: false
- Per device train batch size: 1
- Gradient accumulation steps: 1
- Gradient checkpointing: true
- Max gradient norm: 0.3
- Learning rate: 2e-4
- Weight decay: 0.001
- Optimizer: paged AdamW 32-bit
- Learning rate scheduler type: cosine
- Warmup ratio: 0.03
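
The learning rate, warmup ratio, and cosine scheduler listed above interact, so a small sketch may help show what the schedule looks like. It assumes 6,003 optimizer steps (one epoch over 6,003 samples at batch size 1 with no gradient accumulation), that the warmup step count is rounded up, and that the cosine curve decays to zero over half a cycle. The function names are illustrative and are not taken from the training script.

```python
import math

# Assumption: 6,003 samples, batch size 1, 1 gradient accumulation step, 1 epoch.
TOTAL_STEPS = 6003
PEAK_LR = 2e-4
WARMUP_RATIO = 0.03


def warmup_steps(total_steps: int, ratio: float) -> int:
    """Number of linear warmup steps, rounded up (an assumption)."""
    return math.ceil(total_steps * ratio)


def cosine_lr(step: int, total_steps: int = TOTAL_STEPS,
              peak_lr: float = PEAK_LR, ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step`: linear warmup, then half-cycle cosine decay."""
    warmup = warmup_steps(total_steps, ratio)
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    # Print the rate at the start, mid-warmup, peak, midpoint, and end.
    for step in (0, 90, 181, 3092, 6003):
        print(f"step {step:>4}: lr = {cosine_lr(step):.6e}")
```

Under these assumptions the rate climbs linearly to 2e-4 during warmup and then decays smoothly toward zero by the final step.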

# Training Details
- GPU used: 1x NVIDIA GeForce GTX 1080Ti
- Training time: 21 hours, 4 minutes, and 57 seconds
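
For a rough sense of throughput, the reported training time can be converted into seconds per optimizer step. This is only back-of-the-envelope arithmetic, and it assumes the run took 6,003 steps (the last step shown in the training loss log); the function name is illustrative.

```python
def seconds_per_step(hours: int, minutes: int, seconds: int, steps: int) -> float:
    """Average wall-clock seconds spent per optimizer step."""
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return total_seconds / steps


# Reported training time: 21 hours, 4 minutes, and 57 seconds.
# Assumption: 6,003 optimizer steps in total.
avg = seconds_per_step(21, 4, 57, steps=6003)

if __name__ == "__main__":
    print(f"about {avg:.2f} seconds per step")
```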

# Training Loss
```
Step	Training Loss
 100	0.330900
 200	0.293000
 300	0.276500
 400	0.290900
 500	0.306100
 600	0.302600
 700	0.337200
 800	0.295000
 900	0.297800
1000	0.299500
1100	0.268900
1200	0.257800
1300	0.264100
1400	0.294400
1500	0.293900
1600	0.287600
1700	0.281200
1800	0.273400
1900	0.266600
2000	0.227500
2100	0.261600
2200	0.275700
2300	0.290100
2400	0.290900
2500	0.316200
2600	0.296500
2700	0.291400
2800	0.253300
2900	0.321500
3000	0.269500
3100	0.295600
3200	0.265800
3300	0.262800
3400	0.274900
3500	0.259800
3600	0.226300
3700	0.325700
3800	0.249000
3900	0.237200
4000	0.251400
4100	0.247000
4200	0.278700
4300	0.264000
4400	0.245000
4500	0.235900
4600	0.240400
4700	0.235200
4800	0.220300
4900	0.202700
5000	0.240500
5100	0.258500
5200	0.236300
5300	0.267500
5400	0.236700
5500	0.265900
5600	0.244900
5700	0.297900
5800	0.281200
5900	0.313800
6000	0.249800
6003	0.271939
```
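
The logged values are noisy from one checkpoint to the next, so averaging them over windows can make the overall trend easier to read. The sketch below copies the losses logged at steps 100 through 6000 from the table above and averages them in groups of ten. The final row (step 6003) is left out on the assumption that it is an end-of-run summary value and not a regular logging point.

```python
# Training losses logged every 100 steps (steps 100..6000), copied from the table.
LOSSES = [
    0.3309, 0.2930, 0.2765, 0.2909, 0.3061, 0.3026, 0.3372, 0.2950, 0.2978, 0.2995,
    0.2689, 0.2578, 0.2641, 0.2944, 0.2939, 0.2876, 0.2812, 0.2734, 0.2666, 0.2275,
    0.2616, 0.2757, 0.2901, 0.2909, 0.3162, 0.2965, 0.2914, 0.2533, 0.3215, 0.2695,
    0.2956, 0.2658, 0.2628, 0.2749, 0.2598, 0.2263, 0.3257, 0.2490, 0.2372, 0.2514,
    0.2470, 0.2787, 0.2640, 0.2450, 0.2359, 0.2404, 0.2352, 0.2203, 0.2027, 0.2405,
    0.2585, 0.2363, 0.2675, 0.2367, 0.2659, 0.2449, 0.2979, 0.2812, 0.3138, 0.2498,
]


def window_means(values: list[float], window: int = 10) -> list[float]:
    """Mean of each consecutive, non-overlapping window of `window` values."""
    return [
        sum(values[i:i + window]) / len(values[i:i + window])
        for i in range(0, len(values), window)
    ]


if __name__ == "__main__":
    for i, mean in enumerate(window_means(LOSSES)):
        first_step = i * 1000 + 100
        last_step = (i + 1) * 1000
        print(f"steps {first_step:>4}-{last_step:>4}: mean loss {mean:.4f}")
```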

# Example Usage
```py
from transformers import BitsAndBytesConfig, AutoTokenizer, AutoModelForCausalLM
import torch
import accelerate

use_4bit = True
bnb_4bit_compute_dtype = "float16"
bnb_4bit_quant_type = "nf4"
use_double_nested_quant = True
compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

# BitsAndBytesConfig 4-bit config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_use_double_quant=use_double_nested_quant,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    llm_int8_enable_fp32_cpu_offload=True  # keep any CPU-offloaded modules in fp32
)

# Load model in 4-bit
tokenizer = AutoTokenizer.from_pretrained("AlfredPros/CodeLlama-7b-Instruct-Solidity")
model = AutoModelForCausalLM.from_pretrained("AlfredPros/CodeLlama-7b-Instruct-Solidity", quantization_config=bnb_config, device_map="balanced_low_0")

# Describe the task (named `task` to avoid shadowing the built-in `input`)
task = 'Make a smart contract to create a whitelist of approved wallets. The purpose of this contract is to allow the DAO (Decentralized Autonomous Organization) to approve or revoke certain wallets, and also set a checker address for additional validation if needed. The current owner address can be changed by the current owner.'

# Make prompt template
prompt = f"""### Instruction:
Use the Task below and the Input given to write the Response, which is a programming code that can solve the following Task:

### Task:
{task}

### Solution:
"""

# Tokenize the input
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()
# Run the model to infer an output
outputs = model.generate(input_ids=input_ids, max_new_tokens=1024, do_sample=True, top_p=0.9, temperature=0.001, pad_token_id=1)

# Detokenize and display the generated output
print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=True)[0][len(prompt):])
```
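
When generating several contracts, it can be convenient to wrap the prompt template from the example above in small helpers. The following is a minimal sketch that needs no model or GPU. The names `build_prompt` and `extract_solution` are hypothetical helpers introduced here and are not part of the model or of any library.

```python
# Prompt template copied from the usage example above.
PROMPT_TEMPLATE = """### Instruction:
Use the Task below and the Input given to write the Response, which is a programming code that can solve the following Task:

### Task:
{task}

### Solution:
"""


def build_prompt(task: str) -> str:
    """Fill the instruction template with a task description."""
    return PROMPT_TEMPLATE.format(task=task.strip())


def extract_solution(decoded: str, prompt: str) -> str:
    """Return only the generated part of a decoded model output."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):]
    # Fallback: cut at the last solution marker if the prompt text was altered
    # by tokenization (for example, whitespace changes).
    marker = "### Solution:"
    idx = decoded.rfind(marker)
    if idx == -1:
        return decoded
    return decoded[idx + len(marker):].lstrip("\n")


if __name__ == "__main__":
    prompt = build_prompt("Make a smart contract that stores a single number.")
    print(prompt)
```

`build_prompt` would replace the f-string in the example, and `extract_solution` would replace the `[len(prompt):]` slice applied to the decoded output.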