---
language:
- en
- fr
- ro
- de
license: apache-2.0
library_name: transformers
datasets:
- c4
---

# Model Card for EncT5

EncT5 is a variant of T5 that primarily uses the encoder for non-autoregressive (i.e. classification and regression) tasks. The model is from the paper [Fine-tuning T5 Encoder for Non-autoregressive Tasks](https://arxiv.org/abs/2110.08426) by Frederick Liu, Terry Huang, Shihang Lyu, Siamak Shakeri, Hongkun Yu, and Jing Li.

## Model Details

### Model Description

EncT5 uses the same base weights as T5, but **must be fine-tuned before use**. EncT5 differs from T5 in several ways:

1. It has fewer decoder layers (a single decoder layer by default), and therefore fewer parameters and lower resource requirements than the standard T5.
2. It has a separate decoder word embedding, and its decoder input ids are predefined constants. During fine-tuning, the decoder embedding learns to use these constants as "prompts" to the encoder for the corresponding classification/regression tasks.
3. It has a classification head on top of the decoder output.

Research has shown that this architecture can be more efficient than T5 and BERT for non-autoregressive tasks such as classification and regression.

- **Developed by:** Frederick Liu, Terry Huang, Shihang Lyu, Siamak Shakeri, Hongkun Yu, Jing Li. See the [associated paper](https://arxiv.org/abs/2110.08426).
- **Model type:** Language Model
- **Language(s) (NLP):** English, French, Romanian, German
- **License:** Apache 2.0
- **Based on model:** [T5](https://huggingface.co/google-t5/t5-base)
- **Repository:** [GitHub repo](https://github.com/hackyon/EncT5)
- **Paper:** [Fine-tuning T5 Encoder for Non-autoregressive Tasks](https://arxiv.org/abs/2110.08426)

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("hackyon/enct5-base", trust_remote_code=True)

# Fine-tune the model before use.
```

See the [GitHub repo](https://github.com/hackyon/EncT5) for a more comprehensive guide.

## Training Details

### Training Data

The weights of this model are directly copied from [t5-base](https://huggingface.co/google-t5/t5-base).

### Training Procedure

This model **must be fine-tuned** before use. The decoder word embedding and classification head are both untrained.
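
Because the decoder word embedding and classification head start untrained, a short supervised fine-tuning run is needed before the model produces meaningful predictions. Below is a minimal fine-tuning sketch using the Hugging Face `Trainer`. The dataset (GLUE SST-2), the use of the `t5-base` tokenizer, the label count, and the hyperparameters are illustrative assumptions, not values from the paper; see the [GitHub repo](https://github.com/hackyon/EncT5) for the authoritative guide.

```python
# Minimal fine-tuning sketch. Assumptions (not from the paper): SST-2 as the
# task, 2 labels, the t5-base tokenizer, and generic hyperparameters.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# The EncT5 weights are copied from t5-base, so we assume its tokenizer applies.
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "hackyon/enct5-base", num_labels=2, trust_remote_code=True
)

# Tokenize a small text-classification dataset (SST-2 here as an example).
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(
        batch["sentence"], truncation=True, padding="max_length", max_length=128
    )

tokenized = dataset.map(tokenize, batched=True)

training_args = TrainingArguments(
    output_dir="enct5-sst2",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
)
trainer.train()
```

After training, the fine-tuned decoder embedding and classification head can be saved alongside the encoder with `trainer.save_model(...)` and reloaded through the same `AutoModelForSequenceClassification` entry point.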