import streamlit as st
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

st.title('Question-Answering NLU')

st.sidebar.title('Navigation')
menu = st.sidebar.radio("", options=["Demo", "Parsing NLU data into SQuAD 2.0", "Training",
                                     "Evaluation"], index=0)


if menu == "Demo":

    st.markdown('''

Question Answering NLU (QANLU) is an approach that maps the NLU task into question answering,
leveraging pre-trained question-answering models to perform well in few-shot settings. Instead of
training an intent classifier or a slot tagger, for example, we can ask the model intent- and
slot-related questions in natural language:

```
Context : I'm looking for a cheap flight to Boston.

Question: Is the user looking to book a flight?
Answer  : Yes

Question: Is the user asking about departure time?
Answer  : No

Question: What price is the user looking for?
Answer  : cheap

Question: Where is the user flying from?
Answer  : (empty)
```

Thus, by asking questions for each intent and slot in natural language, we can effectively construct an NLU hypothesis. For more details,
please read the paper:
[Language model is all you need: Natural language understanding as question answering](https://assets.amazon.science/33/ea/800419b24a09876601d8ab99bfb9/language-model-is-all-you-need-natural-language-understanding-as-question-answering.pdf).

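
As a rough illustration of how the answers can be combined, the sketch below asks one question per intent and slot and collects the results into a hypothesis. The question sets, the `build_hypothesis` helper and the `toy_ask` stand-in for the QA model are assumptions made for this example; they are not part of QANLU or the paper.

```python
from typing import Callable, Dict

# Hypothetical question sets; the keys and wording are illustrative only.
INTENT_QUESTIONS = {
    "flight": "Is the user looking to book a flight?",
    "departure_time": "Is the user asking about departure time?",
}
SLOT_QUESTIONS = {
    "price": "What price is the user looking for?",
    "from_city": "Where is the user flying from?",
}


def build_hypothesis(utterance: str, ask: Callable[[str, str], str]) -> Dict[str, object]:
    """Ask one question per intent and slot and collect the answers.

    `ask(question, context)` stands in for any extractive QA model. It must
    return the answer text, or an empty string when there is no answer.
    """
    # Prepend "Yes. No. " so that intent questions have a span to point at.
    context = "Yes. No. " + utterance
    intents = [
        name
        for name, question in INTENT_QUESTIONS.items()
        if ask(question, context).strip().lower() == "yes"
    ]
    slots = {}
    for name, question in SLOT_QUESTIONS.items():
        answer = ask(question, context).strip()
        if answer:  # an empty answer means the slot is not present
            slots[name] = answer
    return {"intents": intents, "slots": slots}


def toy_ask(question: str, context: str) -> str:
    """Canned answers mirroring the example above (not a real model)."""
    canned = {
        "Is the user looking to book a flight?": "Yes",
        "Is the user asking about departure time?": "No",
        "What price is the user looking for?": "cheap",
    }
    return canned.get(question, "")


hypothesis = build_hypothesis("I'm looking for a cheap flight to Boston.", toy_ask)
print(hypothesis)
```

In practice, `ask` could wrap a question-answering pipeline such as the one used in the demo below.
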

In this Space, we will see how to transform an example
NLU dataset (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
question-answering data that can be used by QANLU.

### Demo

Feel free to query the pre-trained QA-NLU model using the fields below.

*Please note that this model has been trained on ATIS and may need to be further fine-tuned to support intents and slots that are not covered in ATIS*.
''')

    # Load the pre-trained QANLU tokenizer and model from the Hugging Face Hub.
    tokenizer = AutoTokenizer.from_pretrained("AmazonScience/qanlu")

    model = AutoModelForQuestionAnswering.from_pretrained("AmazonScience/qanlu")

    qa_pipeline = pipeline('question-answering', model=model, tokenizer=tokenizer)

    # The "Yes. No. " prefix gives the extractive model a span to select
    # when answering intent (yes/no) questions.
    context = st.text_input(
        'Please enter the context (remember to include "Yes. No. " at the beginning):',
        value="Yes. No. I want a cheap flight to Boston."
    )
    question = st.text_input(
        'Please enter the intent question:',
        value="Are they looking for a flight?"
    )


    qa_input = {
        'context': context,
        'question': question
    }

    # Run the model only when the user clicks the button.
    if st.button('Ask QANLU'):
        answer = qa_pipeline(qa_input)
        st.write(answer)

elif menu == "Parsing NLU data into SQuAD 2.0":
    st.header('QA-NLU Data Parsing')

    st.markdown('''
Here, we show a small example of how NLU data can be transformed into QANLU data.
The same method can be used to transform [MATIS++](https://github.com/amazon-research/multiatis)
NLU data (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
question-answering data that can be used by QANLU.

Here is an example dataset with three intents and two examples per intent:

````
restaurant, I am looking for some Vietnamese food
restaurant, What is there to eat around here?
music, Play my workout playlist
music, Can you find Bob Dylan songs?
flight, Show me flights from Oakland to Dallas
flight, I want two economy tickets from Miami to Chicago
````

Next, we define a few questions for each intent. These can be free-form or generated from templates.

````
{
    'restaurant': [
        'Did they ask for a restaurant?',
        'Did they mention a restaurant?'
    ],
    'music': [
        'Did they ask for music?',
        'Do they want to play music?'
    ],
    'flight': [
        'Did they ask for a flight?',
        'Do they want to book a flight?'
    ]
}
````

The next step is to run the `atis.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
That script will produce a JSON file that looks like this:

````
{
  "version": 1.0,
  "data": [
    {
      "title": "MultiATIS++",
      "paragraphs": [
        {
          "context": "yes. no. i am looking for some vietnamese food",
          "qas": [
            {
              "question": "did they ask for a restaurant?",
              "id": "49f1180cb9ce4178a8a90f76c21f69b4",
              "is_impossible": false,
              "answers": [
                {
                  "text": "yes",
                  "answer_start": 0
                }
              ],
              "slot": "",
              "intent": "restaurant"
            },
            {
              "question": "did they ask for music?",
              "id": "a7ffe039fb3e4843ae16d5a68194f45e",
              "is_impossible": false,
              "answers": [
                {
                  "text": "no",
                  "answer_start": 5
                }
              ],
              "slot": "",
              "intent": "restaurant"
            },
            ... <More questions>

        ... <More paragraphs>
````

There are many tunable parameters when generating the above file, such as how many negative examples to include per question. Follow the same process for training a slot-tagging model.

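
To make the structure of that file more concrete, here is a minimal sketch that builds a similar dictionary from the example dataset. It is not the logic of `atis.py`: the `to_squad` helper, its `num_negatives` parameter and the `"example"` title are assumptions made for illustration.

```python
import json
import random
import uuid

PREFIX = "yes. no. "  # gives the extractive model spans for intent answers


def to_squad(examples, questions, num_negatives=1, seed=0):
    """Build a SQuAD-style dict from (intent, utterance) pairs.

    Hypothetical helper. Questions for the utterance's own intent are
    answered "yes"; questions for `num_negatives` randomly chosen other
    intents are answered "no".
    """
    rng = random.Random(seed)
    paragraphs = []
    for intent, utterance in examples:
        context = PREFIX + utterance.lower()
        others = [name for name in questions if name != intent]
        negatives = rng.sample(others, min(num_negatives, len(others)))
        qas = []
        for name in [intent] + negatives:
            text = "yes" if name == intent else "no"
            for question in questions[name]:
                qas.append({
                    "question": question.lower(),
                    "id": uuid.uuid4().hex,
                    "is_impossible": False,
                    # The answer offset points into the "yes. no. " prefix.
                    "answers": [{"text": text, "answer_start": PREFIX.index(text)}],
                    "slot": "",
                    "intent": intent,
                })
        paragraphs.append({"context": context, "qas": qas})
    return {"version": 1.0, "data": [{"title": "example", "paragraphs": paragraphs}]}


examples = [
    ("restaurant", "I am looking for some Vietnamese food"),
    ("music", "Play my workout playlist"),
]
questions = {
    "restaurant": ["Did they ask for a restaurant?"],
    "music": ["Did they ask for music?"],
}
squad = to_squad(examples, questions)
print(json.dumps(squad["data"][0]["paragraphs"][0], indent=2))
```

Raising `num_negatives` adds more "no" questions per utterance, which is one of the knobs mentioned above.
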

''')

elif menu == "Training":
    st.header('QA-NLU Training')

    st.markdown('''
To train a QA-NLU model on the data we created, we use the `run_squad.py` script from [Hugging Face](https://github.com/huggingface/transformers/blob/master/examples/legacy/question-answering/run_squad.py) and a SQuAD-trained QA model as our base. As an example, we can use the `deepset/roberta-base-squad2` model from [here](https://huggingface.co/deepset/roberta-base-squad2) (assuming 8 GPUs are present):
''')

    # Render the training command as a shell snippet.
    st.code('''
mkdir models

python -m torch.distributed.launch --nproc_per_node=8 run_squad.py \\
    --model_type roberta \\
    --model_name_or_path deepset/roberta-base-squad2 \\
    --do_train \\
    --do_eval \\
    --do_lower_case \\
    --train_file data/matis_en_train_squad.json \\
    --predict_file data/matis_en_test_squad.json \\
    --learning_rate 3e-5 \\
    --num_train_epochs 2 \\
    --max_seq_length 384 \\
    --doc_stride 64 \\
    --output_dir models/qanlu/ \\
    --per_gpu_train_batch_size 8 \\
    --overwrite_output_dir \\
    --version_2_with_negative \\
    --save_steps 100000 \\
    --gradient_accumulation_steps 8 \\
    --seed $RANDOM
''', language='bash')

elif menu == "Evaluation":
    st.header('QA-NLU Evaluation')

    st.markdown('''
To assess the performance of the trained model, we can use the `calculate_pr.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).

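
For intuition only, the sketch below shows one simple way to compute precision and recall over question-answer pairs by exact match. It is not the logic of `calculate_pr.py`; the `precision_recall` helper and the assumption that gold and predicted answers are available as dictionaries keyed by question id are made up for this example.

```python
def normalize(text):
    """Lowercase and trim an answer so that 'Yes' and 'yes ' compare equal."""
    return text.strip().lower()


def precision_recall(gold, predicted):
    """Exact-match precision and recall.

    `gold` and `predicted` map question ids to answer text, with an empty
    string meaning "no answer" (hypothetical input format).
    """
    # A true positive is a non-empty prediction that matches the gold answer.
    true_positives = sum(
        1
        for qid, answer in predicted.items()
        if normalize(answer) and normalize(answer) == normalize(gold.get(qid, ""))
    )
    num_predicted = sum(1 for answer in predicted.values() if normalize(answer))
    num_gold = sum(1 for answer in gold.values() if normalize(answer))
    precision = true_positives / num_predicted if num_predicted else 0.0
    recall = true_positives / num_gold if num_gold else 0.0
    return precision, recall


gold = {"q1": "yes", "q2": "no", "q3": "boston", "q4": ""}
predicted = {"q1": "Yes", "q2": "yes", "q3": "", "q4": ""}
precision, recall = precision_recall(gold, predicted)
print(precision, recall)
```

Here a missed slot (`q3`) lowers recall, while a wrong intent answer (`q2`) lowers both precision and recall.
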

Feel free to query the pre-trained QA-NLU model in the Demo section.
''')