QA-NLU / app.py

alexpap

Update app.py

9146844 over 3 years ago

	import streamlit as st
	from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

	st.title('Question-Answering NLU')

	st.sidebar.title('Navigation')
	menu = st.sidebar.radio(
	    "",
	    options=["Demo", "Parsing NLU data into SQuAD 2.0", "Training", "Evaluation"],
	    index=0,
	)


	if menu == "Demo":

	    st.markdown('''

	Question Answering NLU (QANLU) is an approach that maps the NLU task into question answering,
	leveraging pre-trained question-answering models to perform well on few-shot settings. Instead of
	training an intent classifier or a slot tagger, for example, we can ask the model intent- and
	slot-related questions in natural language:

	```
	Context : I'm looking for a cheap flight to Boston.

	Question: Is the user looking to book a flight?
	Answer : Yes

	Question: Is the user asking about departure time?
	Answer : No

	Question: What price is the user looking for?
	Answer : cheap

	Question: Where is the user flying from?
	Answer : (empty)
	```

	Thus, by asking questions for each intent and slot in natural language, we can effectively construct an NLU hypothesis. For more details,
	please read the paper:
	[Language model is all you need: Natural language understanding as question answering](https://assets.amazon.science/33/ea/800419b24a09876601d8ab99bfb9/language-model-is-all-you-need-natural-language-understanding-as-question-answering.pdf).
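
As a minimal sketch of that idea (not the paper's implementation), the snippet below assembles a hypothesis from question answers. The `ask` callable, the intent and slot names, and the canned answers are hypothetical stand-ins for a real QA model.

```python
from typing import Callable, Dict


def build_hypothesis(
    context: str,
    intent_questions: Dict[str, str],
    slot_questions: Dict[str, str],
    ask: Callable[[str, str], str],
) -> dict:
    """Collect intents answered 'yes' and slots with a non-empty answer."""
    intents = [
        intent
        for intent, question in intent_questions.items()
        if ask(question, context).strip(" .").lower() == "yes"
    ]
    slots = {}
    for slot, question in slot_questions.items():
        answer = ask(question, context).strip()
        if answer:  # an empty answer means the slot is not in the utterance
            slots[slot] = answer
    return {"intents": intents, "slots": slots}


# Hypothetical canned answers standing in for the QA model.
CANNED = {
    "Is the user looking to book a flight?": "Yes",
    "Is the user asking about departure time?": "No",
    "What price is the user looking for?": "cheap",
    "Where is the user flying from?": "",
}

hypothesis = build_hypothesis(
    context="I am looking for a cheap flight to Boston.",
    intent_questions={
        "book_flight": "Is the user looking to book a flight?",
        "departure_time": "Is the user asking about departure time?",
    },
    slot_questions={
        "price": "What price is the user looking for?",
        "from_city": "Where is the user flying from?",
    },
    ask=lambda question, context: CANNED[question],
)
print(hypothesis)
```

In practice, `ask` would wrap a call to the QA model, and the question sets would cover every intent and slot of interest.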

	In this Space, we will see how to transform an example
	NLU dataset (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
	question-answering data that can be used by QANLU.

	### Demo

	Feel free to query the pre-trained QA-NLU model using the buttons below.

	Please note that this model has been trained on ATIS and may need to be further fine-tuned to support intents and slots that are not covered in ATIS.
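
The model is extractive: it can only return a span of the context, which is presumably why the context has to start with "Yes. No. " for intent questions. Below is a minimal sketch of turning the result into a boolean, assuming the usual `question-answering` pipeline output (a dict with `score`, `start`, `end` and `answer` keys). The helper name, the score threshold and the sample result are hypothetical.

```python
from typing import Optional


def intent_from_answer(result: dict, min_score: float = 0.5) -> Optional[bool]:
    """Return True/False for a confident yes/no answer, otherwise None."""
    if result.get("score", 0.0) < min_score:
        return None  # too uncertain to commit to either answer
    answer = result.get("answer", "").strip(" .").lower()
    if answer == "yes":
        return True
    if answer == "no":
        return False
    return None  # the model selected some other span


# Made-up result in the shape the pipeline returns.
sample = {"score": 0.98, "start": 0, "end": 3, "answer": "Yes"}
print(intent_from_answer(sample))
```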
	''')

	    # Load the pre-trained QANLU tokenizer and model from the Hugging Face Hub.
	    tokenizer = AutoTokenizer.from_pretrained("AmazonScience/qanlu")
	    model = AutoModelForQuestionAnswering.from_pretrained("AmazonScience/qanlu")
	    qa_pipeline = pipeline('question-answering', model=model, tokenizer=tokenizer)

	    # The "Yes. No. " prefix gives the extractive model spans to select for yes/no questions.
	    context = st.text_input(
	        'Please enter the context (remember to include "Yes. No. " at the beginning):',
	        value="Yes. No. I want a cheap flight to Boston."
	    )
	    question = st.text_input(
	        'Please enter the intent question:',
	        value="Are they looking for a flight?"
	    )

	    qa_input = {
	        'context': context,
	        'question': question
	    }

	    # Query the model only when the button is clicked.
	    if st.button('Ask QANLU'):
	        answer = qa_pipeline(qa_input)
	        st.write(answer)

	elif menu == "Parsing NLU data into SQuAD 2.0":
	    st.header('QA-NLU Data Parsing')

	    st.markdown('''
	Here, we show a small example of how NLU data can be transformed into QANLU data.
	The same method can be used to transform [MATIS++](https://github.com/amazon-research/multiatis)
	NLU data (e.g. utterances and intent / slot annotations) into [SQuAD 2.0 format](https://rajpurkar.github.io/SQuAD-explorer/explore/v2.0/dev/)
	question-answering data that can be used by QANLU.

	Here is an example dataset with three intents and two examples per intent:

	````
	restaurant, I am looking for some Vietnamese food
	restaurant, What is there to eat around here?
	music, Play my workout playlist
	music, Can you find Bob Dylan songs?
	flight, Show me flights from Oakland to Dallas
	flight, I want two economy tickets from Miami to Chicago
	````

	Next, we define a few questions per intent. These can be free-form questions or generated from templates.

	````
	{
	  'restaurant': [
	    'Did they ask for a restaurant?',
	    'Did they mention a restaurant?'
	  ],
	  'music': [
	    'Did they ask for music?',
	    'Do they want to play music?'
	  ],
	  'flight': [
	    'Did they ask for a flight?',
	    'Do they want to book a flight?'
	  ]
	}
	````

	The next step is to run the `atis.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
	That script will produce a JSON file that looks like this:

	````
	{
	  "version": 1.0,
	  "data": [
	    {
	      "title": "MultiATIS++",
	      "paragraphs": [
	        {
	          "context": "yes. no. i am looking for some vietnamese food",
	          "qas": [
	            {
	              "question": "did they ask for a restaurant?",
	              "id": "49f1180cb9ce4178a8a90f76c21f69b4",
	              "is_impossible": false,
	              "answers": [
	                {
	                  "text": "yes",
	                  "answer_start": 0
	                }
	              ],
	              "slot": "",
	              "intent": "restaurant"
	            },
	            {
	              "question": "did they ask for music?",
	              "id": "a7ffe039fb3e4843ae16d5a68194f45e",
	              "is_impossible": false,
	              "answers": [
	                {
	                  "text": "no",
	                  "answer_start": 5
	                }
	              ],
	              "slot": "",
	              "intent": "restaurant"
	            },
	            ... <More questions>

	        ... <More paragraphs>
	````
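
For intuition, the first paragraph of that file can be reproduced with a few lines of Python. This is a rough sketch, not the `atis.py` implementation: the function name is hypothetical, only the first question per intent is used, and random UUIDs are assumed for the `id` field.

```python
import uuid

PREFIX = "yes. no. "  # prepended so that yes/no answers exist as spans

QUESTIONS = {
    "restaurant": ["Did they ask for a restaurant?", "Did they mention a restaurant?"],
    "music": ["Did they ask for music?", "Do they want to play music?"],
    "flight": ["Did they ask for a flight?", "Do they want to book a flight?"],
}


def intent_paragraph(utterance: str, gold_intent: str, questions: dict) -> dict:
    """Build one SQuAD-style paragraph with a yes/no question per intent."""
    context = PREFIX + utterance.lower()
    qas = []
    for intent, intent_questions in questions.items():
        # The gold intent is answered "yes"; every other intent is a negative example.
        text = "yes" if intent == gold_intent else "no"
        qas.append({
            "question": intent_questions[0].lower(),
            "id": uuid.uuid4().hex,
            "is_impossible": False,
            "answers": [{"text": text, "answer_start": PREFIX.index(text)}],
            "slot": "",
            "intent": gold_intent,
        })
    return {"context": context, "qas": qas}


paragraph = intent_paragraph("I am looking for some Vietnamese food", "restaurant", QUESTIONS)
print(paragraph["qas"][0]["answers"], paragraph["qas"][1]["answers"])
```

The answer offsets fall out of the prefix: "yes" starts at character 0 and "no" at character 5.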

	There are many tunable parameters when generating the above file, such as how many negative examples to include per question. Follow the same process for training a slot-tagging model.
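
For slots, the answer is a span of the utterance itself, and a slot that is absent becomes an unanswerable question in SQuAD 2.0 terms. The sketch below illustrates this under assumptions: the function name, the `from_city` and `airline` slot names, and the questions are hypothetical, and the repository script may build these entries differently.

```python
import uuid


def slot_qa(context: str, question: str, slot: str, value: str, intent: str) -> dict:
    """Build one SQuAD 2.0 question for a slot; an empty value means unanswerable."""
    start = context.find(value.lower()) if value else -1
    answerable = start >= 0
    return {
        "question": question.lower(),
        "id": uuid.uuid4().hex,
        "is_impossible": not answerable,
        "answers": [{"text": value.lower(), "answer_start": start}] if answerable else [],
        "slot": slot,
        "intent": intent,
    }


context = "yes. no. show me flights from oakland to dallas"
present = slot_qa(context, "Where are they flying from?", "from_city", "Oakland", "flight")
absent = slot_qa(context, "Which airline do they want?", "airline", "", "flight")
print(present["answers"], absent["is_impossible"])
```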

	''')

	elif menu == "Training":
	    st.header('QA-NLU Training')

	    st.markdown('''
	To train a QA-NLU model on the data we created, we use the `run_squad.py` script from [Hugging Face](https://github.com/huggingface/transformers/blob/master/examples/legacy/question-answering/run_squad.py) with a SQuAD-trained QA model as the base. As an example, we can use the `deepset/roberta-base-squad2` model from [here](https://huggingface.co/deepset/roberta-base-squad2) (the command assumes 8 GPUs are present):
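
When adapting the command to a different number of GPUs, it helps to keep the effective batch size in mind. The short calculation below is a sketch with a hypothetical helper name; it assumes one process per GPU, as launched with `--nproc_per_node=8`.

```python
def effective_batch_size(num_processes: int, per_gpu_batch: int, accumulation_steps: int) -> int:
    """Number of examples contributing to each optimizer step in data-parallel training."""
    return num_processes * per_gpu_batch * accumulation_steps


# 8 processes x 8 examples per GPU x 8 gradient accumulation steps
print(effective_batch_size(num_processes=8, per_gpu_batch=8, accumulation_steps=8))  # 512
```

With fewer GPUs, raising `--gradient_accumulation_steps` by the same factor keeps this number unchanged.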
	''')

	    st.code('''
	mkdir models

	python -m torch.distributed.launch --nproc_per_node=8 run_squad.py \\
	--model_type roberta \\
	--model_name_or_path deepset/roberta-base-squad2 \\
	--do_train \\
	--do_eval \\
	--do_lower_case \\
	--train_file data/matis_en_train_squad.json \\
	--predict_file data/matis_en_test_squad.json \\
	--learning_rate 3e-5 \\
	--num_train_epochs 2 \\
	--max_seq_length 384 \\
	--doc_stride 64 \\
	--output_dir models/qanlu/ \\
	--per_gpu_train_batch_size 8 \\
	--overwrite_output_dir \\
	--version_2_with_negative \\
	--save_steps 100000 \\
	--gradient_accumulation_steps 8 \\
	--seed $RANDOM
	''', language='bash')

	elif menu == "Evaluation":
	    st.header('QA-NLU Evaluation')

	    st.markdown('''
	To assess the performance of the trained model, we can use the `calculate_pr.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
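
As a rough illustration of the kind of metric involved, here is a generic precision and recall computation over yes/no intent answers. It is independent of `calculate_pr.py`, whose exact inputs and outputs are not reproduced here; the function name and the sample labels are made up.

```python
def precision_recall(gold: list, predicted: list) -> tuple:
    """Precision and recall of the positive ("yes") predictions."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# True means the answer to the intent question is "yes".
gold = [True, False, True, True]
predicted = [True, True, False, True]
precision, recall = precision_recall(gold, predicted)
print(round(precision, 3), round(recall, 3))
```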

	Feel free to query the pre-trained QA-NLU model in the Demo section.
	''')