
Model description

This model is togethercomputer/RedPajama-INCITE-Base-3B-v1 fine-tuned for paraphrasing and for changing the tone of an input sentence (to casual, professional, or witty). The training data was generated using gpt-35-turbo.

See the llm-toys repo for usage and other details.

Try in colab: Open In Colab

Installation

pip install llm-toys

from llm_toys.tasks import Paraphraser

paraphraser = Paraphraser()
paraphraser.paraphrase("Hey, can yuo hepl me cancel my last order?")
# "Could you kindly assist me in canceling my previous order?"

paraphraser.paraphrase("Hey, can yuo hepl me cancel my last order?", tone="professional")
# "I would appreciate guidance on canceling my previous order."

paraphraser.paraphrase("Hey, can yuo hepl me cancel my last order?", tone="witty")
# "Hey, I need your help with my last order. Can you wave your magic wand and make it disappear?"

Sample training data

{
  "original": "If you have any further questions, feel free to ask.",
  "casual": "Got more questions? Feel free to ask away. I'm here to help!",
  "professional": "Should you have any additional inquiries, please don't hesitate to ask.",
  "witty": "Curiosity is always in style! If you have more mysteries to solve, I'm all ears!",
  "paraphrase": "Don't hesitate to ask if you have any more questions."
}
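As a rough illustration of how one such record could fan out into several training examples, here is a minimal sketch. The field names come from the sample above; the helper `expand_record` and the `(task, source, target)` layout are hypothetical and are not taken from the llm-toys code, whose actual prompt format may differ.

```python
# Hypothetical helper: field names match the sample record, but the
# (task, source, target) layout is an assumption made for illustration.
TONES = ("casual", "professional", "witty")


def expand_record(record):
    """Turn one record into a list of (task, source, target) triples."""
    source = record["original"]
    # One plain paraphrase example per record.
    triples = [("paraphrase", source, record["paraphrase"])]
    # One tone-change example for each tone present in the record.
    for tone in TONES:
        if tone in record:
            triples.append((f"tone:{tone}", source, record[tone]))
    return triples


record = {
    "original": "If you have any further questions, feel free to ask.",
    "casual": "Got more questions? Feel free to ask away. I'm here to help!",
    "professional": "Should you have any additional inquiries, please don't hesitate to ask.",
    "witty": "Curiosity is always in style! If you have more mysteries to solve, I'm all ears!",
    "paraphrase": "Don't hesitate to ask if you have any more questions.",
}

for task, source, target in expand_record(record):
    print(f"{task:<18} {target}")
```

Under this assumed layout, each record yields four source/target pairs that share the same original sentence.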

Training params

{
  "batch_size": 8,
  "eval_ratio": 0.1,
  "eval_steps": 100,
  "gradient_accumulation_steps": 1,
  "learning_rate": 0.0001,
  "logging_steps": 100,
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "lora_r": 16,
  "max_length": 128,
  "model_name": "togethercomputer/RedPajama-INCITE-Base-3B-v1",
  "num_train_epochs": 3,
  "seed": 10,
  "task_type": "paraphrase_tone",
  "use_aim": True
}
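A few quantities follow directly from these parameters. The sketch below derives them with plain arithmetic; `num_examples` is a hypothetical dataset size (the card does not state one), and the step count assumes a single device, a rounded eval split, and that the final partial batch is kept. The LoRA scaling shown is the standard `lora_alpha / lora_r` factor.

```python
import math

# Subset of the training params listed above.
params = {
    "batch_size": 8,
    "eval_ratio": 0.1,
    "gradient_accumulation_steps": 1,
    "lora_alpha": 32,
    "lora_r": 16,
    "num_train_epochs": 3,
}


def derived_quantities(params, num_examples):
    """Compute values implied by the params for a given dataset size."""
    effective_batch = params["batch_size"] * params["gradient_accumulation_steps"]
    # Assumption: the eval split is rounded to the nearest whole example.
    num_eval = round(num_examples * params["eval_ratio"])
    num_train = num_examples - num_eval
    # Assumption: single device, last partial batch is not dropped.
    steps_per_epoch = math.ceil(num_train / effective_batch)
    return {
        "effective_batch_size": effective_batch,
        "lora_scaling": params["lora_alpha"] / params["lora_r"],
        "num_train": num_train,
        "num_eval": num_eval,
        "total_optimizer_steps": steps_per_epoch * params["num_train_epochs"],
    }


# num_examples=1000 is a made-up figure used only to exercise the function.
print(derived_quantities(params, num_examples=1000))
```

With `gradient_accumulation_steps` set to 1, the effective batch size equals `batch_size`, and a `lora_alpha` of 32 with `lora_r` of 16 scales the adapter update by 2.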

Training curve

[Figure: train_eval_loss (training and evaluation loss curves)]

Training procedure

The following bitsandbytes quantization config was used during training:

  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: True
  • bnb_4bit_compute_dtype: bfloat16
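
The settings above map onto the keyword arguments of `BitsAndBytesConfig` in the transformers library. The following is a minimal sketch of how they might be reproduced; the names `BNB_KWARGS` and `build_bnb_config` are hypothetical, and the imports are deferred so the values can be inspected without transformers or torch installed.

```python
# Values copied from the quantization config listed above.
BNB_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}


def build_bnb_config():
    """Build a BitsAndBytesConfig from the values above.

    Requires torch and transformers (with bitsandbytes support); they are
    imported lazily so that this module loads without them.
    """
    import torch
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(bnb_4bit_compute_dtype=torch.bfloat16, **BNB_KWARGS)
```

The resulting object would typically be passed as `quantization_config` to `AutoModelForCausalLM.from_pretrained` when loading the base model in 4-bit.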

Framework versions

  • PEFT 0.4.0.dev0