SpeedStar101 committed
Commit 904b9ac
1 Parent(s): 940bb60

Update README.md

Files changed (1):
  1. README.md +3 -38
README.md CHANGED
@@ -111,6 +111,7 @@ Please note that the code assumes you have access to the Starcodium/VergilGPT2 m
  ## Installation
 
  Make sure to install the required dependencies by running the following commands:
+ (Note: these installations were done in Google Colaboratory; if you are installing them on your local PC, take out the '!'.)
 
  ```python
  !pip install torch
@@ -145,24 +146,6 @@ tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id)
  ```
 
- For loading the original GPT2 model in 4-bit and applying quantization for better results, as well as utilizing bfloat16 compute dtype and nested quantization for memory efficiency during model loading, use the following example:
-
- ```python
- import torch
- from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
-
- model_id = "gpt2"
- bnb_config = BitsAndBytesConfig(
- load_in_4bit=True,
- bnb_4bit_use_double_quant=True,
- bnb_4bit_quant_type="nf4",
- bnb_4bit_compute_dtype=torch.bfloat16
- )
-
- tokenizer = AutoTokenizer.from_pretrained(model_id)
- model_4bit = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
- ```
-
  To load the GPT2 model with the allenai/soda dataset, follow this example:
 
  ```python
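The body of this example is cut off at the hunk boundary. As a rough guide, a minimal sketch of loading allenai/soda and mapping a preprocessing function over it, assuming the Hugging Face datasets library, could look like the block below; the body of `preprocess_dataset` and the use of the `dialogue` field are illustrative assumptions, not the README's exact code.

```python
# Sketch only: assumes the `datasets` library; `preprocess_dataset` below is an
# illustrative stand-in, not the README's exact implementation.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT2 ships without a pad token

dataset = load_dataset("allenai/soda")

def preprocess_dataset(example):
    # SODA rows carry a multi-turn "dialogue" field (a list of utterances);
    # join the turns into a single training string and tokenize it.
    text = "\n".join(example["dialogue"])
    return tokenizer(text, truncation=True, max_length=512, padding="max_length")

dataset = dataset.map(preprocess_dataset)
```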
@@ -192,7 +175,7 @@ dataset = dataset.map(preprocess_dataset)
 
  ## Loading & Training VergilGPT2
 
- To load the original VergilGPT2 model for training, you can use the following example:
+ To load the VergilGPT2 model for training, you can use the following example:
  ```python
  from transformers import AutoTokenizer, AutoModelForCausalLM
 
@@ -201,24 +184,6 @@ tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id)
  ```
 
- For loading the VergilGPT2 model in 4-bit and applying quantization for better results, as well as utilizing bfloat16 compute dtype and nested quantization for memory efficiency during model loading, use the following example:
-
- ```python
- import torch
- from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
-
- model_id = "VergilGPT2"
- bnb_config = BitsAndBytesConfig(
- load_in_4bit=True,
- bnb_4bit_use_double_quant=True,
- bnb_4bit_quant_type="nf4",
- bnb_4bit_compute_dtype=torch.bfloat16
- )
-
- tokenizer = AutoTokenizer.from_pretrained(model_id)
- model_4bit = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
- ```
-
  To load the VergilGPT2 model with the allenai/soda dataset, follow this example:
 
  ```python
@@ -249,7 +214,7 @@ dataset = dataset.map(preprocess_dataset)
  train_dataset, val_dataset = train_test_split(dataset['train'], test_size=0.1, shuffle=True)
  ```
 
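The visible diff stops at the train/validation split. A minimal sketch of how the split might then feed a Hugging Face Trainer for causal-LM fine-tuning follows; the hyperparameters, the output path, and the reuse of the `model`, `tokenizer`, `train_dataset`, and `val_dataset` objects created above are assumptions, not the README's code.

```python
# Sketch only: continues from the model, tokenizer, train_dataset and val_dataset
# defined above; hyperparameters and the output path are illustrative assumptions.
from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

tokenizer.pad_token = tokenizer.eos_token  # GPT2 needs a pad token for batching

# Causal-LM collator: labels are the input ids themselves (mlm=False).
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="./vergilgpt2-finetune",   # hypothetical output path
    num_train_epochs=1,
    per_device_train_batch_size=4,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    data_collator=data_collator,
)

trainer.train()
trainer.evaluate()
```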
- It is worth noting that VergilGPT2 is already trained on the allensi/soda dataset so in actual training be sure to change the conversational dialogue.
+ It is worth noting that VergilGPT2 is already trained on the allenai/soda dataset, so for actual training be sure to use a different conversational dataset.
 
  ## Text Files
 
 