Update README.md
README.md
CHANGED
@@ -11,16 +11,20 @@ library_name: adapter-transformers
 
 ## Model Description
 
-
+An opt-350m model trained on the CodeAlpaca 20k dataset using quantization and Parameter-Efficient Fine-Tuning (PEFT).
+The resulting model is designed to understand and generate code-related responses based on the prompts provided.
+
+[original model card](https://huggingface.co/facebook/opt-350m)
 
 ### Model Architecture
 
 - **Base Model**: `facebook/opt-350m`
-- **Fine-tuning**:
+- **Fine-tuning**: Parameter-Efficient Fine-Tuning (PEFT)
 
 ## Training Data
 
 The model was trained on the `lucasmccabe-lmi/CodeAlpaca-20k` dataset. This dataset contains code-related prompts and their corresponding outputs.
+The script used for training is available [here](https://github.com/harpomaxx/llm-finetuning/blob/0954a7ca16bb25bdef6ee9dd1089867bd4d8e0a5/code/python/scripts/stf_train_opt350m.py).
 
 ## Training Procedure
 
@@ -61,7 +65,7 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
 model = AutoModelForCausalLM.from_pretrained("harpomaxx/opt350m-codealpaca20k")
 
-prompt = "
+prompt = "Question: [Your code-related question here] ### Answer: "
 inputs = tokenizer.encode(prompt, return_tensors="pt")
 outputs = model.generate(inputs)
 decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
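
For reference, a self-contained version of the usage snippet in the hunk above. This is a sketch: it assumes the fine-tuned weights are published as a standalone causal-LM checkpoint at `harpomaxx/opt350m-codealpaca20k` with the tokenizer loaded from the base `facebook/opt-350m`, and the `max_new_tokens` value is an illustrative choice, not from the card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Tokenizer comes from the base model; weights from the fine-tuned checkpoint.
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
model = AutoModelForCausalLM.from_pretrained("harpomaxx/opt350m-codealpaca20k")

# The model was tuned on CodeAlpaca-style prompts, so keep the same template.
prompt = "Question: Write a Python function that reverses a string. ### Answer: "
inputs = tokenizer.encode(prompt, return_tensors="pt")

# max_new_tokens is an illustrative setting, not taken from the original README.
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```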
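
As context for the training-script link added in the first hunk: below is a minimal sketch of what quantized PEFT fine-tuning on this dataset typically looks like, assuming LoRA via the `peft` library. The LoRA hyperparameters, target modules, and 8-bit loading flag are illustrative assumptions; the linked `stf_train_opt350m.py` script remains the authoritative reference.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM

# Load the base model in 8-bit and prepare it for quantized (k-bit) training.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m", load_in_8bit=True)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters; the full 350M base weights stay frozen.
lora_config = LoraConfig(
    r=16,                                  # illustrative rank, not from the script
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections in OPT
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

dataset = load_dataset("lucasmccabe-lmi/CodeAlpaca-20k", split="train")
# ... tokenize with the "Question: ... ### Answer: ..." template and train,
#     e.g. with transformers.Trainer or trl's SFTTrainer, as the script does.
```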