license: apache-2.0
language:
  - nl
metrics:
  - f1
  - exact_match
library_name: transformers
tags:
  - dutch
  - restaurant
  - mt5

Dutch-Restaurant-mT5-Small

The Dutch-Restaurant-mT5-Small model was introduced in A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis (SIGIR'23) by Zengzhi Wang, Qiming Xie, and Rui Xia.

Details are available in the FS-ABSA GitHub repository and the SIGIR'23 paper.

Model Description

To bridge the domain and language gap between general pre-training and the target task in a specific domain (restaurant, in this repo), we performed domain-adaptive pre-training: we continued pre-training the language model (mT5-small) on an unlabeled corpus in the target domain and language (Dutch restaurant reviews) using the text-infilling objective, with a corruption rate of 15% and an average span length of 1. We collected 100k relevant unlabeled restaurant reviews from Yelp and translated them into Dutch with the DeepL translator. For pre-training, we used the Adafactor optimizer with a batch size of 16 and a constant learning rate of 1e-4.
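The text-infilling objective described above can be illustrated with a minimal sketch. This is not the authors' pre-training code: it works on whitespace-separated words instead of SentencePiece tokens, picks masked positions uniformly at random (so adjacent picks occasionally merge into a span longer than 1), and uses a hypothetical helper name, `corrupt_spans`, and a made-up Dutch sentence. Only the `<extra_id_N>` sentinel format follows the T5/mT5 convention.

```python
import random


def corrupt_spans(tokens, corruption_rate=0.15, seed=0):
    """Mask roughly `corruption_rate` of the tokens in the T5 infilling style.

    Returns (source, target) token lists. Each run of consecutive masked
    tokens becomes one sentinel in the source; the target lists each
    sentinel followed by the tokens it hides, then one closing sentinel.
    """
    rng = random.Random(seed)
    # Assumption: mask at least one token so short inputs still yield a target.
    num_to_mask = min(len(tokens), max(1, round(len(tokens) * corruption_rate)))
    masked = set(rng.sample(range(len(tokens)), num_to_mask))

    source, target = [], []
    sentinel_id = 0
    prev_masked = False
    for i, tok in enumerate(tokens):
        if i in masked:
            if not prev_masked:
                # Start of a new masked span: emit a fresh sentinel on both sides.
                sentinel = f"<extra_id_{sentinel_id}>"
                source.append(sentinel)
                target.append(sentinel)
                sentinel_id += 1
            target.append(tok)
            prev_masked = True
        else:
            source.append(tok)
            prev_masked = False
    # The closing sentinel marks the end of the target sequence.
    target.append(f"<extra_id_{sentinel_id}>")
    return source, target


# Illustrative sentence (not taken from the pre-training corpus).
words = "Het eten was heerlijk en de bediening was erg vriendelijk".split()
source, target = corrupt_spans(words)
print(" ".join(source))
print(" ".join(target))
```

During pre-training, the model reads the corrupted source and is trained to generate the target, that is, the hidden tokens in order, each preceded by its sentinel.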

Our model can be seen as an mT5 model enhanced for the restaurant domain. It can be used for various restaurant-domain NLP tasks, including but not limited to fine-grained sentiment analysis (ABSA), product-related question answering (PrQA), and text style transfer.

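>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

>>> tokenizer = AutoTokenizer.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("NUSTM/dutch-restaurant-mt5-small")

>>> input_ids = tokenizer(
...     "De pizza's hier zijn heerlijk!!!", return_tensors="pt"
... ).input_ids  # Batch size 1
>>> # A bare forward pass needs decoder inputs or labels; generate() builds the decoder inputs itself.
>>> outputs = model.generate(input_ids=input_ids, max_new_tokens=20)
>>> print(tokenizer.decode(outputs[0], skip_special_tokens=True))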

Citation

If you find this work helpful, please cite our paper as follows:

@inproceedings{wang2023fs-absa,
author = {Wang, Zengzhi and Xie, Qiming and Xia, Rui},
title = {A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis},
year = {2023},
isbn = {9781450394086},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3539618.3591940},
doi = {10.1145/3539618.3591940},
booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
numpages = {6},
location = {Taipei, Taiwan},
series = {SIGIR '23}
}

Note that the complete citation format will be announced once our paper is published in the SIGIR 2023 conference proceedings.