---
license: apache-2.0
language:
- nl
metrics:
- f1
- exact_match
library_name: transformers
tags:
- dutch
- restaurant
- mt5
---

# Dutch-Restaurant-mT5-Small

The Dutch-Restaurant-mT5-Small model was introduced in [A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis (SIGIR'23)](https://doi.org/10.1145/3539618.3591940) by Zengzhi Wang, Qiming Xie, and Rui Xia.

Details are available at [GitHub: FS-ABSA](https://github.com/nustm/fs-absa) and in the [SIGIR'23 paper](https://doi.org/10.1145/3539618.3591940).

# Model Description

To bridge the domain gap between general pre-training and the task of interest in a specific domain (i.e., `restaurant` in this repo), we conducted *domain-adaptive pre-training*, i.e., continued pre-training of the language model (i.e., mT5-small) on an unlabeled corpus from the domain of interest (i.e., `restaurant`) with the *text-infilling objective* (corruption rate of 15% and average span length of 1). We collected 100k relevant unlabeled reviews from Yelp for the restaurant domain and then translated them into Dutch with the DeepL translator. For pre-training, we employed the [Adafactor](https://arxiv.org/abs/1804.04235) optimizer with a batch size of 16 and a constant learning rate of 1e-4.

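To make the text-infilling objective concrete, here is a minimal, self-contained sketch of T5-style span corruption with span length 1: each dropped token is replaced by a sentinel in the input, and the target lists each sentinel followed by the dropped token. This is an illustration only, not the actual pre-training code used for this model, and it operates on whitespace tokens rather than subwords.

```python
import random

def infill_corrupt(tokens, corruption_rate=0.15, seed=0):
    """Toy text-infilling corruption (span length 1, T5-style sentinels)."""
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * corruption_rate))
    masked = set(rng.sample(range(len(tokens)), n_mask))
    inputs, targets, sentinel = [], [], 0
    for i, tok in enumerate(tokens):
        if i in masked:
            inputs.append(f"<extra_id_{sentinel}>")     # dropped token becomes a sentinel
            targets += [f"<extra_id_{sentinel}>", tok]  # target reconstructs the span
            sentinel += 1
        else:
            inputs.append(tok)
    return " ".join(inputs), " ".join(targets)

source = "de pizza hier is echt heerlijk en de bediening is vriendelijk".split()
corrupted, target = infill_corrupt(source)
print(corrupted)
print(target)
```

The model is then trained to map `corrupted` to `target`, which is what the snippet's sentinel layout mirrors.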
Our model can be seen as an enhanced mT5 model for the restaurant domain, and it can be used for various restaurant-related NLP tasks, including but not limited to fine-grained sentiment analysis (ABSA), product-relevant question answering (PrQA), and text style transfer.

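As a rough illustration, such tasks can be cast as text-to-text problems for a seq2seq model like this one. The prompt and label templates below are hypothetical (the actual templates used in the FS-ABSA paper may differ):

```python
def format_absa(review, aspect, sentiment=None):
    """Hypothetical text-to-text formatting for ABSA fine-tuning.

    The source encodes the review plus the aspect term; the target is the
    sentiment label string the model learns to generate.
    """
    source = f"beoordeling: {review} aspect: {aspect}"
    target = sentiment  # e.g. "positief" / "negatief" / "neutraal"
    return source, target

src, tgt = format_absa(
    "De pizza was heerlijk, maar de bediening was traag.",
    aspect="bediening",
    sentiment="negatief",
)
print(src)
print(tgt)
```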
```python
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

>>> tokenizer = AutoTokenizer.from_pretrained("NUSTM/dutch-restaurant-mt5-small")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("NUSTM/dutch-restaurant-mt5-small")

>>> input_ids = tokenizer(
...     "De pizza hier is heerlijk!!", return_tensors="pt"
... ).input_ids  # Batch size 1
>>> outputs = model.generate(input_ids)  # encoder-decoder models need generate() (or decoder inputs) for inference
```

# Citation

If you find this work helpful, please cite our paper as follows:

```bibtex
@inproceedings{wang2023fs-absa,
  author = {Wang, Zengzhi and Xie, Qiming and Xia, Rui},
  title = {A Simple yet Effective Framework for Few-Shot Aspect-Based Sentiment Analysis},
  year = {2023},
  isbn = {9781450394086},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3539618.3591940},
  doi = {10.1145/3539618.3591940},
  booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  numpages = {6},
  location = {Taipei, Taiwan},
  series = {SIGIR '23}
}
```

Note that the complete citation will be updated once the paper is published in the SIGIR 2023 conference proceedings.