---
license: apache-2.0
datasets:
- humarin/chatgpt-paraphrases
language:
- en
tags:
- paraphrase
- similar text
---

This model re-fine-tunes the [ChatGPT Paraphraser on T5 Base](https://huggingface.co/humarin/chatgpt_paraphraser_on_T5_base) on the additional Google PAWS dataset.

## Usage example

```python
from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM

# use 'cuda' for GPU, otherwise 'cpu'
device = "cuda"

model = AutoModelForSeq2SeqLM.from_pretrained("sharad/ParaphraseGPT").to(device)
tokenizer = AutoTokenizer.from_pretrained("humarin/chatgpt_paraphraser_on_T5_base")
predict = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

def paraphrase(sentence):
    generated = predict(
        sentence,
        num_beams=3,
        num_beam_groups=3,
        num_return_sequences=1,
        diversity_penalty=2.0,
        no_repeat_ngram_size=2,
        repetition_penalty=0.99,
        # max_length counts tokens; the character count of the input
        # is used here as a generous upper bound
        max_length=len(sentence),
    )
    return generated

output = paraphrase('My sentence to paraphrase...')
print(output[0]['generated_text'])
```

## Train parameters

```python
epochs = 4
max_length = 128
lr = 5e-5
```
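
The parameters above can be plugged into a standard `Seq2SeqTrainer` setup. The sketch below is a configuration outline, not the exact training script used for this model: the output directory, batch size, dataset column handling, and the choice of the `labeled_final` PAWS config are assumptions for illustration.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base = "humarin/chatgpt_paraphraser_on_T5_base"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSeq2SeqLM.from_pretrained(base)

# PAWS sentence pairs; keep only the positive (true paraphrase) pairs
dataset = load_dataset("paws", "labeled_final", split="train")
dataset = dataset.filter(lambda x: x["label"] == 1)

def tokenize(batch):
    # max_length=128 matches the train parameters above
    inputs = tokenizer(batch["sentence1"], max_length=128, truncation=True)
    labels = tokenizer(batch["sentence2"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(tokenize, batched=True,
                        remove_columns=dataset.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="paraphrase-gpt",      # assumed
    num_train_epochs=4,               # epochs = 4
    learning_rate=5e-5,               # lr = 5e-5
    per_device_train_batch_size=16,   # assumed
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```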