Formal Language T5 Model
This model is a fine-tuned version of T5-base that rewrites informal English text in a formal register and corrects grammar.
Model Description
- Model Type: T5-base fine-tuned
- Language: English
- Task: Text Formalization and Grammar Correction
- License: Apache 2.0
- Base Model: t5-base
Intended Uses & Limitations
Intended Uses
- Converting informal text to formal language
- Improving text professionalism
- Grammar correction
- Business communication enhancement
- Academic writing improvement
Limitations
- Works best with English text
- Maximum input length: 128 tokens
- May not preserve domain-specific terminology
- Best suited for business and academic contexts
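Because of the 128-token input limit, longer passages need to be split before they reach the model. The following is a minimal sketch of one way to do that, not part of the released model: it greedily packs sentences into chunks that stay within a token budget. The function name `split_for_model`, the regex-based sentence splitter, and the `count_tokens` callable are all assumptions made for illustration.

```python
import re
from typing import Callable, List

MAX_INPUT_TOKENS = 128  # input limit stated in this model card


def split_for_model(
    text: str,
    count_tokens: Callable[[str], int],
    budget: int = MAX_INPUT_TOKENS,
) -> List[str]:
    """Greedily pack sentences into chunks whose token count stays within budget.

    A single sentence that exceeds the budget on its own is kept as one
    chunk; it would still need truncation at tokenization time.
    """
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > budget:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

With the real tokenizer, `count_tokens` could be `lambda s: len(tokenizer(s)["input_ids"])`. The budget should probably be set a little below 128 to leave room for the "make formal: " prefix.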
Usage
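from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForSeq2SeqLM.from_pretrained("renix-codex/formal-lang-rxcx-model")
tokenizer = AutoTokenizer.from_pretrained("renix-codex/formal-lang-rxcx-model")

# Example usage: the input must start with the "make formal: " task prefix
text = "make formal: hey whats up"
# Truncate to the 128-token input limit listed under Limitations
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
outputs = model.generate(**inputs)
formal_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(formal_text)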
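To process several inputs at once, the single-example code above can be wrapped in a small batching helper. This is a hedged sketch rather than an official API: `formalize_batch` is a hypothetical name, and the `max_new_tokens=128` output cap is an assumption that mirrors the input limit, not a value documented for this model.

```python
from typing import List


def formalize_batch(
    prompts: List[str],
    model,
    tokenizer,
    max_input_tokens: int = 128,  # input limit from this model card
    max_new_tokens: int = 128,    # assumed output cap, not documented
) -> List[str]:
    """Run already-prefixed prompts through the model as one padded batch."""
    inputs = tokenizer(
        prompts,
        return_tensors="pt",
        padding=True,      # pad to the longest prompt in the batch
        truncation=True,   # cut prompts that exceed the input limit
        max_length=max_input_tokens,
    )
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


# Usage, given a loaded `model` and `tokenizer`:
# formal = formalize_batch(
#     ["make formal: hey whats up", "make formal: gonna be late for meeting"],
#     model,
#     tokenizer,
# )
```

Passing the model and tokenizer as arguments keeps the helper free of global state, so they are loaded once and reused across calls.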
Example Inputs and Outputs
Informal Input | Formal Output |
---|---|
"hey whats up" | "Hello, how are you?" |
"gonna be late for meeting" | "I will be late for the meeting." |
"this is kinda cool" | "This is quite impressive." |
Training
The model was trained on the Grammarly/COEDIT dataset with the following specifications:
- Base Model: T5-base
- Training Hardware: A100 GPU
- Sequence Length: 128 tokens
- Input Format: "make formal: [informal text]"
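Since the model was trained on inputs of the form "make formal: [informal text]", inference inputs should follow the same pattern. A minimal sketch of a prompt builder is shown here; the helper name `build_prompt` and the whitespace normalization are assumptions added for illustration, not part of the training setup described above.

```python
TASK_PREFIX = "make formal: "  # prefix given under Input Format


def build_prompt(informal_text: str) -> str:
    """Prepend the task prefix to the text.

    Collapsing runs of whitespace is an added assumption; the model card
    only specifies the prefix itself.
    """
    cleaned = " ".join(informal_text.split())
    if not cleaned:
        raise ValueError("input text is empty")
    return TASK_PREFIX + cleaned


prompts = [build_prompt(t) for t in ["hey whats up", "  gonna be late\nfor meeting "]]
```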
License
Apache License 2.0
Citation
@misc{formal-lang-rxcx-model,
author = {renix-codex},
title = {Formal Language T5 Model},
year = {2024},
publisher = {HuggingFace},
journal = {HuggingFace Model Hub},
url = {https://huggingface.co/renix-codex/formal-lang-rxcx-model}
}
Developer
Model developed by renix-codex
Ethical Considerations
This model is intended to assist in formal writing while maintaining the original meaning of the text. Users should be aware that:
- The model may alter the tone of personal or culturally specific expressions
- It should be used as a writing aid rather than a replacement for human judgment
- The output should be reviewed for accuracy and appropriateness
Updates and Versions
Initial Release - February 2024
- Base implementation with T5-base
- Trained on Grammarly/COEDIT dataset
- Optimized for formal language conversion
Evaluation results
- training_loss on grammarly/coedit (self-reported): 2.100
- rouge1 on grammarly/coedit (self-reported): 0.850
- accuracy on grammarly/coedit (self-reported): 0.820
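For readers unfamiliar with the rouge1 metric, the sketch below shows one common definition: the F1 score of unigram overlap between a prediction and a reference. The model card does not state how the reported value was computed (tokenization, stemming, F1 versus recall), so this illustrates the general idea under those assumptions and is not a reproduction of the reported score.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 using lowercase whitespace tokens (an assumption)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Established implementations typically add normalization steps such as punctuation handling or stemming, so their scores can differ from this simplified version.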