---
tags:
- autotrain
- text-generation-inference
- text-generation
- peft
library_name: transformers
widget:
- messages:
  - role: user
    content: What is your favorite condiment?
license: other
---

# Model Trained Using AutoTrain

This model was trained using AutoTrain. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

# Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

model = AutoModelForCausalLM.from_pretrained(
    "styalai/competition-math-phinetune-v1",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("styalai/competition-math-phinetune-v1")

messages = [
    {"role": "user", "content": "What about solving the equation 2x + 3 = 7?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "do_sample": False,  # greedy decoding, so no temperature is needed
}

output = pipe(messages, **generation_args)
print(output[0]["generated_text"])
```
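
Since the training data uses the Phi-3 chat format (see the dataset code below), you can also build the prompt by hand. A minimal sketch reusing the `model` and `tokenizer` from above; the question and generation settings are illustrative:

```python
# Build a Phi-3-style prompt manually (same format as the training data).
prompt = "<|user|>What about solving the equation 2x + 3 = 7?<|end|>\n<|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding, mirroring the pipeline settings above.
outputs = model.generate(**inputs, max_new_tokens=500, do_sample=False)

# Decode only the newly generated tokens.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```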

# Info

Fine-tuned from `styalai/phi-ne-tuning-1-4`, which was itself fine-tuned from Phi-3 (`microsoft/Phi-3-mini-4k-instruct`).

AutoTrain parameters:

```python
project_name = 'competition-math-phinetune-v1-1' # @param {type:"string"}
model_name = "styalai/competition-math-phinetune-v1"  # previously 'microsoft/Phi-3-mini-4k-instruct' # @param {type:"string"}

#@markdown ---
#@markdown #### Push to Hub?
#@markdown Use these only if you want to push your trained model to a private repo in your Hugging Face account.
#@markdown If you don't use these, the model will be saved in Google Colab and you will need to download it manually.
#@markdown Please enter your Hugging Face write token. The trained model will be saved to your Hugging Face account.
#@markdown You can find your token here: https://huggingface.co/settings/tokens
push_to_hub = True # @param ["False", "True"] {type:"raw"}
hf_token = "hf_****" #@param {type:"string"}

#@markdown ---
#@markdown #### Hyperparameters
learning_rate = 3e-4 # @param {type:"number"}
num_epochs = 1 #@param {type:"number"}
batch_size = 1 # @param {type:"slider", min:1, max:32, step:1}
block_size = 1024 # @param {type:"number"}
trainer = "sft" # @param ["default", "sft"] {type:"raw"}
warmup_ratio = 0.1 # @param {type:"number"}
weight_decay = 0.01 # @param {type:"number"}
gradient_accumulation = 4 # @param {type:"number"}
mixed_precision = "fp16" # @param ["fp16", "bf16", "none"] {type:"raw"}
peft = True # @param ["False", "True"] {type:"raw"}
quantization = "int4" # @param ["int4", "int8", "none"] {type:"raw"}
lora_r = 16 #@param {type:"number"}
lora_alpha = 32 #@param {type:"number"}
lora_dropout = 0.05 #@param {type:"number"}
```
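
With `batch_size = 1` and `gradient_accumulation = 4`, the effective batch size is 4. For reference, here is a minimal sketch of the LoRA adapter configuration that `lora_r`, `lora_alpha`, and `lora_dropout` describe, using the `peft` library (illustrative only; AutoTrain builds the actual config internally):

```python
from peft import LoraConfig

# Mirrors lora_r / lora_alpha / lora_dropout from the parameters above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```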

Code for the creation of the dataset:

```python
from datasets import load_dataset
import pandas as pd

dataset = load_dataset("camel-ai/math")  # streaming=True is also possible

data = {"text": []}

msg1 = dataset["train"]["message_1"]
msg2 = dataset["train"]["message_2"]

# Wrap examples 3500-6999 in the Phi-3 chat format.
for i in range(3500, 7000):
    user = "<|user|>" + msg1[i] + "<|end|>\n"
    phi = "<|assistant|>" + msg2[i] + "<|end|>"
    prompt = user + phi
    data["text"].append(prompt)

data = pd.DataFrame.from_dict(data)
print(data)
# os.mkdir("/kaggle/working/data")
data.to_csv('data/dataset.csv', index=False, escapechar='\\')
```
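
Each row of the resulting CSV is a single training string. A hypothetical example of the shape produced by the loop above (question and answer invented for illustration):

```python
# Hypothetical formatted row (content invented for illustration):
example = (
    "<|user|>Solve for x: 2x + 3 = 7.<|end|>\n"
    "<|assistant|>Subtract 3 from both sides, then divide by 2: x = 2.<|end|>"
)
```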

The AutoTrain command, run from the notebook (the upper-case `${...}` variables are environment variables set from the notebook parameters above; a sketch of that mapping follows the command):

```bash
autotrain llm \
--train \
--username "styalai" \
--merge-adapter \
--model ${MODEL_NAME} \
--project-name ${PROJECT_NAME} \
--data-path data/ \
--text-column text \
--lr ${LEARNING_RATE} \
--batch-size ${BATCH_SIZE} \
--epochs ${NUM_EPOCHS} \
--block-size ${BLOCK_SIZE} \
--warmup-ratio ${WARMUP_RATIO} \
--lora-r ${LORA_R} \
--lora-alpha ${LORA_ALPHA} \
--lora-dropout ${LORA_DROPOUT} \
--weight-decay ${WEIGHT_DECAY} \
--gradient-accumulation ${GRADIENT_ACCUMULATION} \
--quantization ${QUANTIZATION} \
--mixed-precision ${MIXED_PRECISION} \
$( [[ "$PEFT" == "True" ]] && echo "--peft" ) \
$( [[ "$PUSH_TO_HUB" == "True" ]] && echo "--push-to-hub --token ${HF_TOKEN}" )
```
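
For completeness, a sketch of how the notebook parameters can be exported as the environment variables the command references (this cell does not appear in the original; the mapping is assumed):

```python
import os

# Assumed mapping from notebook variables to the ${...} names above.
os.environ["PROJECT_NAME"] = project_name
os.environ["MODEL_NAME"] = model_name
os.environ["LEARNING_RATE"] = str(learning_rate)
os.environ["NUM_EPOCHS"] = str(num_epochs)
os.environ["BATCH_SIZE"] = str(batch_size)
os.environ["BLOCK_SIZE"] = str(block_size)
os.environ["WARMUP_RATIO"] = str(warmup_ratio)
os.environ["WEIGHT_DECAY"] = str(weight_decay)
os.environ["GRADIENT_ACCUMULATION"] = str(gradient_accumulation)
os.environ["MIXED_PRECISION"] = mixed_precision
os.environ["QUANTIZATION"] = quantization
os.environ["LORA_R"] = str(lora_r)
os.environ["LORA_ALPHA"] = str(lora_alpha)
os.environ["LORA_DROPOUT"] = str(lora_dropout)
os.environ["PEFT"] = str(peft)
os.environ["PUSH_TO_HUB"] = str(push_to_hub)
os.environ["HF_TOKEN"] = hf_token
```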

Training time: 1:07:41