Spaces: AmazonScience/QA-NLU

alexpap committed on Nov 2, 2021

Commit 24f7c24 • 1 Parent(s): a896ab1

Update app.py

Files changed (1)

app.py +42 -1

@@ -44,6 +44,8 @@ if menu == "Introduction":
     ''')
 elif menu == "Parsing NLU data into SQuAD 2.0":
+    st.header('QA-NLU Data Parsing')
     st.markdown('''
         Here, we show a small example of how NLU data can be transformed into QANLU data.
         The same method can be used to transform [MATIS++](https://github.com/amazon-research/multiatis)
@@ -120,15 +122,54 @@ elif menu == "Parsing NLU data into SQuAD 2.0":
                                 "intent": "restaurant"
                             },
                             ... <More questions>
+                ... <More paragraphs>
         ````
         There are many tunable parameters when generating the above file, such as how many negative examples to include per question. Follow the same process for training a slot-tagging model.
     ''')
+elif menu == "Training":
+    st.header('QA-NLU Training')
+    st.markdown('''
+        To train a QA-NLU model on the data we created, we use the `run_squad.py` script from [huggingface](https://github.com/huggingface/transformers/blob/master/examples/legacy/question-answering/run_squad.py) and a SQuAD-trained QA model as our base. As an example, we can use `deepset/roberta-base-squad2` model from [here](https://huggingface.co/deepset/roberta-base-squad2) (assuming 8 GPUs are present):
+        ```
+        mkdir models
+        python -m torch.distributed.launch --nproc_per_node=8 run_squad.py \
+            --model_type roberta \
+            --model_name_or_path deepset/roberta-base-squad2 \
+            --do_train \
+            --do_eval \
+            --do_lower_case \
+            --train_file data/matis_en_train_squad.json \
+            --predict_file data/matis_en_test_squad.json \
+            --learning_rate 3e-5 \
+            --num_train_epochs 2 \
+            --max_seq_length 384 \
+            --doc_stride 64 \
+            --output_dir models/qanlu/ \
+            --per_gpu_train_batch_size 8 \
+            --overwrite_output_dir \
+            --version_2_with_negative \
+            --save_steps 100000 \
+            --gradient_accumulation_steps 8 \
+            --seed $RANDOM
+        ```
+    ''')
 elif menu == "Evaluation":
-    st.header('QANLU Evaluation')
+    st.header('QA-NLU Evaluation')
+    st.markdown('''
+        To assess the performance of the trained model, we can use the `calculate_pr.py` script from the [QA-NLU Amazon Research repository](https://github.com/amazon-research/question-answering-nlu).
+        Feel free to query the pre-trained QA-NLU model using the buttons below.
+    ''')
     tokenizer = AutoTokenizer.from_pretrained("AmazonScience/qanlu", use_auth_token=True)
     model = AutoModelForQuestionAnswering.from_pretrained("AmazonScience/qanlu", use_auth_token=True)
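
The "Parsing NLU data into SQuAD 2.0" text touched by this commit describes turning NLU annotations into question-answer pairs, with negative (unanswerable) examples as a tunable part of the output. The diff elides the actual conversion code, so the following is only a minimal sketch of the general idea, not the repository's implementation. The function name `utterance_to_squad`, the question wording, and the sample utterance are all hypothetical; only the SQuAD 2.0 field names (`context`, `qas`, `answers`, `answer_start`, `is_impossible`) follow the public dataset format. Extra per-question fields such as the `"intent"` key visible in the diff are omitted here.

```python
import json


def utterance_to_squad(utterance, slots, slot_questions, title="example"):
    """Build one SQuAD 2.0 article from a single annotated utterance.

    utterance:      the raw user text, used as the QA context
    slots:          dict of slot name -> value text annotated in the utterance
    slot_questions: dict of slot name -> question text (hypothetical wording)
    """
    qas = []
    for i, (slot, question) in enumerate(slot_questions.items()):
        value = slots.get(slot)
        if value and value in utterance:
            # Answerable question: record the span and its character offset.
            answers = [{"text": value, "answer_start": utterance.index(value)}]
            is_impossible = False
        else:
            # Negative example: the slot is absent, so the question is unanswerable.
            answers = []
            is_impossible = True
        qas.append({
            "id": f"{title}-{i}",
            "question": question,
            "answers": answers,
            "is_impossible": is_impossible,
        })
    return {"title": title, "paragraphs": [{"context": utterance, "qas": qas}]}


# Hypothetical utterance and questions, for illustration only.
article = utterance_to_squad(
    "book a cheap table in boston",
    slots={"city": "boston"},
    slot_questions={
        "city": "What city was mentioned?",
        "time": "What time was mentioned?",
    },
)
dataset = {"version": "v2.0", "data": [article]}
print(json.dumps(dataset, indent=2))
```

In this sketch every slot question without a matching annotation becomes a negative example; a real generator would likely cap or sample these, which is the kind of tunable parameter the commit's text mentions.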