Commit
·
f84fc22
1
Parent(s):
101d4d9
Update README.md
README.md
CHANGED
@@ -1,3 +1,71 @@
|
|
1 |
---
|
2 |
license: apache-2.0
3 |
---
1 |
---
|
2 |
license: apache-2.0
|
3 |
+
language:
|
4 |
+
- nl
|
5 |
+
metrics:
|
6 |
+
- f1
|
7 |
+
- exact_match
|
8 |
+
library_name: transformers
|
9 |
+
tags:
|
10 |
+
- dutch
|
11 |
+
- restaurant
|
12 |
+
- mt5
|
13 |
---
|
14 |
+
|
15 |
+
# Dutch-Restaurant-mT5-Small
|
16 |
+
|
17 |
+
|
18 |
+
The Dutch-Restaurant-mT5-Small model was introduced in [A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis (SIGIR'23)](https://doi.org/10.1145/3539618.3591940) by Zengzhi Wang, Qiming Xie, and Rui Xia.
|
19 |
+
|
20 |
+
Details are available in the [GitHub repository (FS-ABSA)](https://github.com/nustm/fs-absa) and the [SIGIR'23 paper](https://doi.org/10.1145/3539618.3591940).
|
21 |
+
|
22 |
+
|
23 |
+
# Model Description
|
24 |
+
|
25 |
+
To bridge the domain gap between general pre-training and the task of interest in a specific domain (i.e., `restaurant` in this repo), we conducted *domain-adaptive pre-training*,
|
26 |
+
that is, we continued pre-training the language model (mT5-small) on an unlabeled corpus from the domain of interest (`restaurant`) with the *text-infilling objective*
|
27 |
+
(corruption rate of 15% and average span length of 1). We collected 100k relevant unlabeled restaurant reviews from Yelp and translated them into Dutch with the DeepL translator.
|
28 |
+
For pre-training, we employ the [Adafactor](https://arxiv.org/abs/1804.04235) optimizer with a batch size of 16 and a constant learning rate of 1e-4.
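
To make the text-infilling objective concrete, here is a minimal sketch of T5-style span corruption at a 15% rate. It is an illustration under stated assumptions, not the authors' pre-processing code: `corrupt_spans` is a hypothetical helper, tokens are whitespace-split words rather than SentencePiece pieces, and masking individual positions only approximates an average span length of 1 (adjacent masked positions are merged into one span).

```python
import random


def corrupt_spans(tokens, rate=0.15, seed=0):
    """Return (source, target) token lists in T5 sentinel format.

    Hypothetical helper for illustration: masks about `rate` of the
    positions and replaces each run of adjacent masked tokens with one
    `<extra_id_N>` sentinel.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * rate))
    masked = set(rng.sample(range(len(tokens)), n_mask))

    source, target = [], []
    sentinel = 0
    prev_masked = False
    for i, tok in enumerate(tokens):
        if i in masked:
            if not prev_masked:
                # A new masked span starts: emit the same sentinel on both sides.
                marker = f"<extra_id_{sentinel}>"
                source.append(marker)
                target.append(marker)
                sentinel += 1
            target.append(tok)
            prev_masked = True
        else:
            source.append(tok)
            prev_masked = False
    # The target ends with one extra sentinel that closes the last span.
    target.append(f"<extra_id_{sentinel}>")
    return source, target


tokens = "De pizza hier is echt heerlijk en de bediening was vriendelijk".split()
source, target = corrupt_spans(tokens)
print(" ".join(source))
print(" ".join(target))
```

The model is trained to generate the target sequence from the source sequence, so it learns to recover the masked in-domain words.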
|
29 |
+
|
30 |
+
Our model can be seen as an mT5 model enhanced for the restaurant domain, and it can be used for various restaurant-related NLP tasks,
|
31 |
+
including but not limited to fine-grained sentiment analysis (ABSA), product-relevant question answering (PrQA), and text style transfer.
|
32 |
+
|
33 |
+
|
34 |
+
|
35 |
+
```python
|
36 |
+
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
|
37 |
+
|
38 |
+
>>> tokenizer = AutoTokenizer.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
|
39 |
+
>>> model = AutoModelForSeq2SeqLM.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
|
40 |
+
|
41 |
+
>>> input_ids = tokenizer(
|
42 |
+
... "The pizza here is delicious!!", return_tensors="pt"
|
43 |
+
... ).input_ids # Batch size 1
|
44 |
+
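>>> # A bare forward pass on a seq2seq model also needs decoder_input_ids or labels, so generate instead
>>> outputs = model.generate(input_ids, max_new_tokens=20)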
|
45 |
+
```
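
Since the checkpoint was pre-trained with text infilling, one quick sanity check is to ask it to fill a masked span. The sketch below assumes the tokenizer exposes the usual `<extra_id_N>` sentinel tokens, as in the standard T5/mT5 setup; `extract_spans` and `fill_span` are hypothetical helpers that are not part of the original README, and the quality of the filled text is not guaranteed.

```python
import re


def extract_spans(decoded):
    """Map each sentinel index to the text generated after it."""
    # re.split with a capture group yields [prefix, id0, text0, id1, text1, ...]
    parts = re.split(r"<extra_id_(\d+)>", decoded)
    spans = {}
    for idx, text in zip(parts[1::2], parts[2::2]):
        cleaned = text.replace("<pad>", "").replace("</s>", "").strip()
        if cleaned:
            spans[int(idx)] = cleaned
    return spans


def fill_span(text, model_name="NUSTM/dutch-restaurant-mt5-small", max_new_tokens=10):
    """Generate fillers for the sentinels in `text` (downloads the model)."""
    # Imported lazily so extract_spans stays usable without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Keep special tokens so the sentinels survive decoding.
    decoded = tokenizer.decode(output_ids[0], skip_special_tokens=False)
    return extract_spans(decoded)
```

Calling `fill_span("De pizza hier is <extra_id_0>.")` should return a dictionary keyed by sentinel index. For downstream tasks such as ABSA, the checkpoint is meant to be fine-tuned rather than used directly.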
|
46 |
+
|
47 |
+
|
48 |
+
# Citation
|
49 |
+
|
50 |
+
If you find this work helpful, please cite our paper as follows:
|
51 |
+
|
52 |
+
```bibtex
|
53 |
+
@inproceedings{wang2023fs-absa,
|
54 |
+
author = {Wang, Zengzhi and Xie, Qiming and Xia, Rui},
|
55 |
+
title = {A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis},
|
56 |
+
year = {2023},
|
57 |
+
isbn = {9781450394086},
|
58 |
+
publisher = {Association for Computing Machinery},
|
59 |
+
address = {New York, NY, USA},
|
60 |
+
url = {https://doi.org/10.1145/3539618.3591940},
|
61 |
+
doi = {10.1145/3539618.3591940},
|
62 |
+
booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
|
63 |
+
numpages = {6},
|
64 |
+
location = {Taipei, Taiwan},
|
65 |
+
series = {SIGIR '23}
|
66 |
+
}
|
67 |
+
```
|
68 |
+
|
69 |
+
Note that the complete citation format will be announced once our paper is published in the SIGIR 2023 conference proceedings.
|
70 |
+
|
71 |
+
|