	---
	language:
	- it
	license: apache-2.0
	tags:
	- italian
	- sequence-to-sequence
	- style-transfer
	- formality-style-transfer
	datasets:
	- yahoo/xformal_it
	widget:
	- text: "maronn qualcuno mi spieg' CHECCOSA SUCCEDE?!?!"
	- text: "wellaaaaaaa, ma fraté sei proprio troppo simpatiko, grazieeee!!"
	- text: "nn capisco xke tt i ragazzi lo fanno"
	- text: "IT5 è SUPERMEGA BRAVISSIMO a capire tt il vernacolo italiano!!!"
	metrics:
	- rouge
	- bertscore
	model-index:
	- name: mt5-base-informal-to-formal
	  results:
	  - task:
	      type: formality-style-transfer
	      name: "Informal-to-formal Style Transfer"
	    dataset:
	      type: xformal_it
	      name: "XFORMAL (Italian Subset)"
	    metrics:
	    - type: rouge1
	      value: 0.661
	      name: "Avg. Test Rouge1"
	    - type: rouge2
	      value: 0.471
	      name: "Avg. Test Rouge2"
	    - type: rougeL
	      value: 0.642
	      name: "Avg. Test RougeL"
	    - type: bertscore
	      value: 0.712
	      name: "Avg. Test BERTScore"
	      args:
	      - model_type: "dbmdz/bert-base-italian-xxl-uncased"
	      - lang: "it"
	      - num_layers: 10
	      - rescale_with_baseline: True
	      - baseline_path: "bertscore_baseline_ita.tsv"
	co2_eq_emissions:
	  emissions: "40g"
	  source: "Google Cloud Platform Carbon Footprint"
	  training_type: "fine-tuning"
	  geographical_location: "Eemshaven, Netherlands, Europe"
	  hardware_used: "1 TPU v3-8 VM"
	---

	# mT5 Base for Informal-to-formal Style Transfer 🧐

	This repository contains the checkpoint for the [mT5 Base](https://huggingface.co/google/mt5-base) model fine-tuned for informal-to-formal style transfer on the Italian subset of the XFORMAL dataset, as part of the experiments in the paper [IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation](https://arxiv.org/abs/2203.03759) by [Gabriele Sarti](https://gsarti.com) and [Malvina Nissim](https://malvinanissim.github.io).

	A comprehensive overview of other released materials is provided in the [gsarti/it5](https://github.com/gsarti/it5) repository. Refer to the paper for additional details concerning the reported scores and the evaluation approach.

	## Using the model

	Model checkpoints are available for use in TensorFlow, PyTorch and JAX. They can be used directly with pipelines:

	```python
	from transformers import pipeline

	# Load the fine-tuned checkpoint into a text2text-generation pipeline
	i2f = pipeline("text2text-generation", model="it5/mt5-base-informal-to-formal")
	i2f("nn capisco xke tt i ragazzi lo fanno")
	# >>> [{"generated_text": "non comprendo perché tutti i ragazzi agiscono così"}]
	```

	or loaded using autoclasses:

	```python
	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

	tokenizer = AutoTokenizer.from_pretrained("it5/mt5-base-informal-to-formal")
	model = AutoModelForSeq2SeqLM.from_pretrained("it5/mt5-base-informal-to-formal")
	```
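
Once the tokenizer and model are loaded through the autoclasses, generation follows the usual tokenize, generate, decode sequence. The helper below is a minimal sketch of that flow, not part of the original release: the name `formalize` and the values chosen for `max_new_tokens` and `num_beams` are illustrative assumptions, not the settings used in the paper.

```python
from typing import List


def formalize(texts: List[str], tokenizer, model, max_new_tokens: int = 64) -> List[str]:
    """Rewrite informal sentences with an already-loaded tokenizer and seq2seq model.

    `formalize` is a hypothetical helper name; the decoding settings are
    illustrative defaults rather than the ones used for the reported scores.
    """
    # Tokenize the whole batch; padding lets sentences of different
    # lengths share a single tensor, truncation guards against overlong input.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    # Generate output token ids with a small beam search.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    # Convert token ids back to plain strings, dropping padding and end-of-sequence markers.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

With the `tokenizer` and `model` objects loaded as shown above, calling `formalize(["nn capisco xke tt i ragazzi lo fanno"], tokenizer, model)` should return a list with one rewritten sentence per input.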

	If you use this model in your research, please cite our work as:

	```bibtex
	@article{sarti-nissim-2022-it5,
	title={{IT5}: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation},
	author={Sarti, Gabriele and Nissim, Malvina},
	journal={ArXiv preprint 2203.03759},
	url={https://arxiv.org/abs/2203.03759},
	year={2022},
	month={mar}
	}
	```