---
language: 
  - en
license: mit

tags:
  - text-classification

widget:
- text: "Why do we need an NFQA taxonomy?"
---

# Non-Factoid Question Category Classification in English
## NFQA model

Repository: [https://github.com/Lurunchik/NF-CATS](https://github.com/Lurunchik/NF-CATS)

The model was trained on the NFQA dataset. The base model is [roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2), a RoBERTa-based question-answering model fine-tuned on the SQuAD2.0 dataset.

The model assigns one of eight labels to each input: `NOT-A-QUESTION`, `FACTOID`, `DEBATE`, `EVIDENCE-BASED`, `INSTRUCTION`, `REASON`, `EXPERIENCE`, `COMPARISON`.
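
If you only need to know whether a predicted label is one of the non-factoid categories, a small helper like the one below may be enough. This is a minimal sketch: the names `NFQA_LABELS`, `OUTSIDE_TAXONOMY` and `is_non_factoid` are hypothetical and not part of the model or the repository, and treating `NOT-A-QUESTION` and `FACTOID` as the two labels outside the non-factoid taxonomy is an assumption based on the label names.

```python
# The eight labels listed above.
NFQA_LABELS = (
    "NOT-A-QUESTION", "FACTOID", "DEBATE", "EVIDENCE-BASED",
    "INSTRUCTION", "REASON", "EXPERIENCE", "COMPARISON",
)

# Assumption: these two labels mark inputs outside the non-factoid taxonomy.
OUTSIDE_TAXONOMY = frozenset({"NOT-A-QUESTION", "FACTOID"})


def is_non_factoid(label):
    """Return True if `label` is one of the six non-factoid categories."""
    if label not in NFQA_LABELS:
        raise ValueError(f"Unknown NFQA label: {label!r}")
    return label not in OUTSIDE_TAXONOMY
```

For example, `is_non_factoid("INSTRUCTION")` returns `True`, while `is_non_factoid("FACTOID")` returns `False`.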

## How to use NFQA cat with HuggingFace

##### Load NFQA cat and its tokenizer:
```python
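from transformers import AutoTokenizer

# nfqa_model is not part of transformers; it must be importable locally
# (presumably from the repository linked above).
from nfqa_model import RobertaNFQAClassification

# Classifier weights come from the Hub; the tokenizer is that of the base model.
nfqa_model = RobertaNFQAClassification.from_pretrained("Lurunchik/nf-cats")
nfqa_tokenizer = AutoTokenizer.from_pretrained("deepset/roberta-base-squad2")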
```

##### Make prediction using helper function:
```python
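def get_nfqa_category_prediction(text):
    """Return the predicted NFQA category label for a single text."""
    # Tokenize the text and run it through the classifier.
    output = nfqa_model(**nfqa_tokenizer(text, return_tensors="pt"))
    # The index of the largest logit is the predicted class.
    index = output.logits.argmax()
    return nfqa_model.config.id2label[int(index)]

get_nfqa_category_prediction('how to assign category?')
# Result:
# 'INSTRUCTION'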
```
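
The helper above returns only the top label. If you also want a confidence score or the runner-up categories, the logits can be turned into probabilities with a softmax. The sketch below is one way to do that in plain Python; `rank_categories` is a hypothetical name, and the `example_id2label` mapping and `example_logits` values are made up for illustration (the real index order should be read from `nfqa_model.config.id2label`).

```python
import math


def rank_categories(logits, id2label, top_k=3):
    """Return the top_k (label, probability) pairs, most likely first."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices by probability, highest first.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(id2label[i], p) for i, p in ranked[:top_k]]


# Hypothetical mapping and logits, for illustration only.
example_id2label = {0: "NOT-A-QUESTION", 1: "FACTOID", 2: "INSTRUCTION"}
example_logits = [0.1, 1.2, 3.4]
top = rank_categories(example_logits, example_id2label, top_k=2)
```

With the model loaded as shown earlier, something like `rank_categories(output.logits[0].tolist(), nfqa_model.config.id2label)` should work, assuming `output.logits` has shape `(1, num_labels)` for a single input text.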

## Demo 
You can test the model in this [Hugging Face Space](https://huggingface.co/spaces/Lurunchik/nf-cats).

[![demo.png](demo.png)](https://huggingface.co/spaces/Lurunchik/nf-cats)


## Citation

If you use `NFQA-cats` in your work, please cite [this paper](https://dl.acm.org/doi/10.1145/3477495.3531926):

```
@inproceedings{bolotova2022nfcats,
        author = {Bolotova, Valeriia and Blinov, Vladislav and Scholer, Falk and Croft, W. Bruce and Sanderson, Mark},
        title = {A Non-Factoid Question-Answering Taxonomy},
        year = {2022},
        isbn = {9781450387323},
        publisher = {Association for Computing Machinery},
        address = {New York, NY, USA},
        url = {https://doi.org/10.1145/3477495.3531926},
        doi = {10.1145/3477495.3531926},
        booktitle = {Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval},
        pages = {1196--1207},
        numpages = {12},
        keywords = {question taxonomy, non-factoid question-answering, editorial study, dataset analysis},
        location = {Madrid, Spain},
        series = {SIGIR '22}
}
```
Enjoy! 🤗