license: apache-2.0 | |
# Finetuning llama 2 on our celebrity news dataset located [here](https://huggingface.co/datasets/2nji/makebelieve-480) | |
Disclaimer: This is still work in progress as we need to preprocess our celebrity news dataset to match Llama 2's prompt format as described [here](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) | |
## Reserve GPU on g5k | |
Log into your Grid5000 account using ssh and run the following code in the terminal | |
```script | |
oarsub -l gpu=4 -I -q production | |
``` | |
Wait till GPUs are available and assigned to you, if you need more information about g5k, you can refer to [here](https://www.grid5000.fr/w/Getting_Started) | |
### Create a virtual environment | |
- Installing PIP | |
```script | |
pip install virtualenv | |
``` | |
- Creating environment | |
```script | |
virtualenv venv | |
``` | |
- Activating environment | |
```script | |
source venv/bin/activate | |
``` | |
## Install requirements file | |
```script | |
pip install -r requirements.txt | |
``` | |
## Running the script to finetune Llama-2-7b-chat-hf and push to huggingface model repository | |
```script | |
python makebelieve.py | |
``` | |