---
tags:
  - autotrain
  - text-generation-inference
  - text-generation
  - peft
  - Phi 3
library_name: transformers
widget:
  - messages:
      - role: user
        content: What is your favorite condiment?
license: other
language:
  - en
---

# Model Trained Using AutoTrain

This model was trained using AutoTrain. For more information, please visit [AutoTrain](https://hf.co/docs/autotrain).

# Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

model = AutoModelForCausalLM.from_pretrained(
    "styalai/competition-math-phinetune-v1", q
    device_map="cuda", 
    torch_dtype="auto", 
    trust_remote_code=True, 
)
tokenizer = AutoTokenizer.from_pretrained("styalai/competition-math-phinetune-v1")

messages = [
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
}

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])
```
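
For reference, the same generation can also be sketched without the pipeline helper, using the chat template shipped with the tokenizer and the same decoding settings as `generation_args` above:

```python
# Apply the tokenizer's chat template and generate directly.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(input_ids, max_new_tokens=500, do_sample=False)

# Strip the prompt tokens and decode only the newly generated text.
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```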

# Info

Fine-tuned from styalai/phi-ne-tuning-1-4, which was itself fine-tuned from Phi-3.

AutoTrain parameters:

```python
project_name = 'competition-math-phinetune-v1' # @param {type:"string"}
model_name = "styalai/phi-ne-tuning-1-4" #'microsoft/Phi-3-mini-4k-instruct' # @param {type:"string"}

#@markdown ---
#@markdown #### Push to Hub?
#@markdown Use these only if you want to push your trained model to a private repo in your Hugging Face Account
#@markdown If you dont use these, the model will be saved in Google Colab and you are required to download it manually.
#@markdown Please enter your Hugging Face write token. The trained model will be saved to your Hugging Face account.
#@markdown You can find your token here: https://huggingface.co/settings/tokens
push_to_hub = True # @param ["False", "True"] {type:"raw"}
hf_token = "hf_****" #@param {type:"string"}
#repo_id = "styalai/phine_tuning_1" #@param {type:"string"}

#@markdown ---
#@markdown #### Hyperparameters
learning_rate = 3e-4 # @param {type:"number"}
num_epochs = 1 #@param {type:"number"}
batch_size = 1 # @param {type:"slider", min:1, max:32, step:1}
block_size = 1024 # @param {type:"number"}
trainer = "sft" # @param ["default", "sft"] {type:"raw"}
warmup_ratio = 0.1 # @param {type:"number"}
weight_decay = 0.01 # @param {type:"number"}
gradient_accumulation = 4 # @param {type:"number"}
mixed_precision = "fp16" # @param ["fp16", "bf16", "none"] {type:"raw"}
peft = True # @param ["False", "True"] {type:"raw"}
quantization = "int4" # @param ["int4", "int8", "none"] {type:"raw"}
lora_r = 16 #@param {type:"number"}
lora_alpha = 32 #@param {type:"number"}
lora_dropout = 0.05 #@param {type:"number"}
```
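
The `${...}` placeholders in the `autotrain llm` command further down are shell environment variables. In the AutoTrain Colab this is usually handled by a small glue cell that exports the Python values above; a minimal sketch, assuming the variable names match the placeholders used below:

```python
import os

# Export the notebook parameters as environment variables so the
# `autotrain llm` shell command can expand ${MODEL_NAME}, ${LEARNING_RATE}, etc.
os.environ["PROJECT_NAME"] = project_name
os.environ["MODEL_NAME"] = model_name
os.environ["PUSH_TO_HUB"] = str(push_to_hub)
os.environ["HF_TOKEN"] = hf_token
os.environ["LEARNING_RATE"] = str(learning_rate)
os.environ["NUM_EPOCHS"] = str(num_epochs)
os.environ["BATCH_SIZE"] = str(batch_size)
os.environ["BLOCK_SIZE"] = str(block_size)
os.environ["WARMUP_RATIO"] = str(warmup_ratio)
os.environ["WEIGHT_DECAY"] = str(weight_decay)
os.environ["GRADIENT_ACCUMULATION"] = str(gradient_accumulation)
os.environ["MIXED_PRECISION"] = str(mixed_precision)
os.environ["PEFT"] = str(peft)
os.environ["QUANTIZATION"] = str(quantization)
os.environ["LORA_R"] = str(lora_r)
os.environ["LORA_ALPHA"] = str(lora_alpha)
os.environ["LORA_DROPOUT"] = str(lora_dropout)
```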

Code used to create the dataset:

```python
from datasets import load_dataset
dataset = load_dataset("camel-ai/math")#, streaming=True)

import pandas as pd
data = {"text":[]}

msg1 = dataset["train"]["message_1"]
msg2 = dataset["train"]["message_2"]

for i in range(3500):
    user = "<|user|>"+ msg1[i] +"<|end|>\n"
    phi = "<|assistant|>"+ msg2[i] +"<|end|>"
    prompt = user+phi
    data["text"].append(prompt)
    
data = pd.DataFrame.from_dict(data)
print(data)
#os.mkdir("/kaggle/working/data")
data.to_csv('data/dataset.csv', index=False, escapechar='\\')
```
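
To sanity-check that the exported file matches what `--data-path data/ --text-column text` expects, the CSV can be reloaded the same way AutoTrain will consume it (a small sketch; path and column name as written above):

```python
from datasets import load_dataset

# Reload the exported CSV: a single `text` column holding the
# Phi-3-style chat markup built above.
check = load_dataset("csv", data_files="data/dataset.csv")["train"]
print(check[0]["text"])
```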

```bash
!autotrain llm \
--train \
--username "styalai" \
--merge-adapter \
--model ${MODEL_NAME} \
--project-name ${PROJECT_NAME} \
--data-path data/ \
--text-column text \
--lr ${LEARNING_RATE} \
--batch-size ${BATCH_SIZE} \
--epochs ${NUM_EPOCHS} \
--block-size ${BLOCK_SIZE} \
--warmup-ratio ${WARMUP_RATIO} \
--lora-r ${LORA_R} \
--lora-alpha ${LORA_ALPHA} \
--lora-dropout ${LORA_DROPOUT} \
--weight-decay ${WEIGHT_DECAY} \
--gradient-accumulation ${GRADIENT_ACCUMULATION} \
--quantization ${QUANTIZATION} \
--mixed-precision ${MIXED_PRECISION} \
$( [[ "$PEFT" == "True" ]] && echo "--peft" ) \
$( [[ "$PUSH_TO_HUB" == "True" ]] && echo "--push-to-hub --token ${HF_TOKEN}" )q

Training time: 1:07:41