---
license: llama2
base_model: codellama/CodeLlama-7b-hf
tags:
- generated_from_trainer
model-index:
- name: codellama2-finetuned-codex-py
results: []
datasets:
- iamtarun/python_code_instructions_18k_alpaca
language:
- en
pipeline_tag: text-generation
---
# codellama2-finetuned-codex-py
This model is a fine-tuned version of [codellama/CodeLlama-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) on the [iamtarun/python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) dataset.
## Model description
A fine-tune of the 7B CodeLlama base model for instruction-following Python code generation, trained on 18k Alpaca-format instruction/code pairs.
## Intended uses & limitations
Intended for generating short Python snippets from English natural-language instructions. The model was fine-tuned for only 100 steps on a 7B base, so generated code may be incorrect or insecure and should be reviewed before use.
## Example use cases
```python
import torch
from transformers import AutoTokenizer, pipeline

# Load the tokenizer and build a text-generation pipeline in fp16.
tokenizer = AutoTokenizer.from_pretrained("damerajee/codellama2-finetuned-alpaca-18k-fin")
pipe = pipeline(
    "text-generation",
    model="damerajee/codellama2-finetuned-alpaca-18k-fin",
    torch_dtype=torch.float16,
    device_map="auto",
)

text = "write a function that prints each individual character in a string"
sequences = pipe(
    text,
    do_sample=True,
    temperature=0.1,  # low temperature keeps code generation near-deterministic
    top_p=0.7,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=70,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
```
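Because the fine-tuning data follows the Alpaca instruction format, wrapping the request in that template may yield better completions. The card does not document the prompt format actually used during training, so the following is a hedged sketch (reusing the `pipe` object from above):
```python
# Assumption: the standard Alpaca template; the template actually used
# during fine-tuning is not documented on this card.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "write a function that prints each individual character in a string\n\n"
    "### Response:\n"
)
result = pipe(prompt, do_sample=True, temperature=0.1, top_p=0.7, max_length=200)
print(result[0]["generated_text"])
```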
## Training and evaluation data
The model was trained on the dataset linked above; no separate evaluation set is reported. Training loss, logged every 10 steps:
| Step | Training Loss |
|------|---------------|
| 10 | 0.792200 |
| 20 | 0.416100 |
| 30 | 0.348600 |
| 40 | 0.323200 |
| 50 | 0.316300 |
| 60 | 0.317500 |
| 70 | 0.333600 |
| 80 | 0.329500 |
| 90 | 0.333400 |
| 100 | 0.309900 |
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a sketch of a matching `TrainingArguments` setup follows the list):
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- training_steps: 100
- mixed_precision_training: Native AMP
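The original training script is not published; this is a minimal sketch of a `transformers` `TrainingArguments` configuration that mirrors the values listed above (the `output_dir` is a placeholder):
```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="codellama2-finetuned-codex-py",  # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # 8 * 4 = 32 effective train batch size
    seed=42,
    lr_scheduler_type="cosine",
    max_steps=100,
    fp16=True,          # Native AMP mixed precision
    logging_steps=10,   # matches the 10-step loss table above
)
```
Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default optimizer in `Trainer`, so it needs no explicit argument here.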
### Training results
Training loss plateaus around 0.31-0.33 after step 50; see the table under *Training and evaluation data* above.
### Framework versions
- Transformers 4.36.0.dev0
- Pytorch 2.0.0
- Datasets 2.1.0
- Tokenizers 0.15.0