
Model Details

I fine-tuned PygmalionAI/pygmalion-6b with QLoRA for 24 hours on 250k samples collected from the SODA and TeacherGPT datasets. This is my first attempt at building an LLM, made as an entry to the Chai competition.

Model Description

  • Model type: Chatbot
  • Finetuned from model: PygmalionAI/pygmalion-6b

Model Sources

Pygmalion-6b: https://huggingface.co/PygmalionAI/pygmalion-6b
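
This card does not state the repo id under which the fine-tuned adapter is published, so the loading sketch below uses a placeholder id. The base model and the transformers/peft calls are standard, but treat the snippet as an illustration rather than the exact published setup.

    # Minimal loading sketch. "your-username/pygmalion-6b-qlora" is a placeholder,
    # not a repo id confirmed by this card.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "PygmalionAI/pygmalion-6b"
    adapter_id = "your-username/pygmalion-6b-qlora"  # hypothetical adapter repo

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.float16,
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()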

Training Details

Training Data

For the training data I used 20% of the SODA dataset mixed with the TeacherGPT roleplay dataset.
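
As a rough illustration of that mix (the exact preprocessing script is not part of this card), the sketch below samples 20% of SODA and concatenates it with a roleplay set. "allenai/soda" is the public SODA dataset on the Hub; the TeacherGPT id is a placeholder.

    from datasets import load_dataset, concatenate_datasets

    # Take roughly 20% of SODA.
    soda = load_dataset("allenai/soda", split="train")
    soda_subset = soda.shuffle(seed=42).select(range(int(0.2 * len(soda))))

    # Placeholder repo id; this card does not give the exact TeacherGPT dataset path.
    teacher_gpt = load_dataset("username/teachergpt-roleplay", split="train")

    # Both sets would first need to be mapped to a common text/dialogue column
    # before concatenation; that step is omitted here.
    mixed = concatenate_datasets([soda_subset, teacher_gpt]).shuffle(seed=42)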

Training Procedure

The model was trained for 24 hours on an RTX 4090.

Training Hyperparameters

  • Training params (a config sketch mapping these values onto the peft/bitsandbytes APIs follows this list)

    batch_size = 128,
    micro_batch_size = 4,
    num_epochs = 1,
    learning_rate = 3e-4,
    cutoff_len = 512,
    val_set_size = 0

  • finetune method

    finetune_method = "qlora"

  • prefix tuning hyperparams

    num_virtual_tokens = 32

  • lora hyperparams

    lora_r = 16,
    lora_alpha = 16,
    lora_dropout = 0.05,
    lora_target_modules = "q_proj k_proj v_proj"

  • llm hyperparams

    bf16 = False,
    load_in_8bit = False,
    group_by_length = False,
    resume_from_checkpoint = None
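
The training script itself is not included in this card; the sketch below is only one plausible way the listed values could map onto the peft and bitsandbytes configuration objects commonly used for QLoRA.

    import torch
    from transformers import BitsAndBytesConfig
    from peft import LoraConfig

    # 4-bit quantization typical of QLoRA (assumed; load_in_8bit is False above).
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,  # bf16 = False above
    )

    # LoRA settings taken from the list above.
    lora_config = LoraConfig(
        r=16,
        lora_alpha=16,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj"],
        task_type="CAUSAL_LM",
    )

    # batch_size 128 with micro_batch_size 4 implies 32 gradient accumulation steps.
    gradient_accumulation_steps = 128 // 4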

Results

Me: Hi Nathan, how are you doing today
Nathan: I'm fine...
Me: Then tell me about your day.
Nathan:

It was good. We had a lot of fun in school and then we went to the park afterwards.
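
In the sample above, the dialogue up to "Nathan:" is the prompt and the final line is the model's reply. A rough way to reproduce this kind of sample is sketched below, reusing the model and tokenizer loaded earlier; the persona line and sampling settings are illustrative, not taken from this card.

    # Generation sketch for the sample above. The persona line is hypothetical.
    prompt = (
        "Nathan's Persona: Nathan is a friendly school kid.\n"
        "<START>\n"
        "Me: Hi Nathan, how are you doing today\n"
        "Nathan: I'm fine...\n"
        "Me: Then tell me about your day.\n"
        "Nathan:"
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
    )
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))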