Teja-Gollapudi committed on
Commit
2317081
1 Parent(s): 8841d1a

Update README.md

Files changed (1)
  1. README.md +66 -5
README.md CHANGED
@@ -1,9 +1,58 @@
  ---
  license: apache-2.0
  ---
- Int-8 dynamic quantized version of (VMware/tinyroberta-mrqa)[https://huggingface.co/VMware/tinyroberta-mrqa].
 
- ```
  from optimum.onnxruntime import ORTModelForQuestionAnswering
  from transformers import pipeline, AutoTokenizer
 
@@ -13,11 +62,23 @@ quantized_model = ORTModelForQuestionAnswering.from_pretrained(model_name, file_
 
  qa_model = pipeline('question-answering', model=quantized_model, tokenizer=tokenizer)
 
- qa_input = {
- 'question': '',
- 'context': ''
  }
 
  qa_answers = qa_model(qa_input)
 
  ```

  ---
  license: apache-2.0
+ datasets:
+ - mrqa
+ language:
+ - en
+ metrics:
+ - exact_match
+ - f1
+
+
+ model-index:
+ - name: VMware/TinyRoBERTa-MRQA
+   results:
+   - task:
+       type: Extractive Question-Answering
+     dataset:
+       type: mrqa # Required. Example: common_voice. Use dataset id from https://hf.co/datasets
+       name: mrqa # Required. A pretty name for the dataset. Example: Common Voice (French)
+
+     metrics:
+     - type: exact_match # Required. Example: wer. Use metric id from https://hf.co/metrics
+       value: 69.21 # Required. Example: 20.90
+       name: Eval EM # Optional. Example: Test WER
+     - type: f1 # Required. Example: wer. Use metric id from https://hf.co/metrics
+       value: 79.65 # Required. Example: 20.90
+       name: Eval F1 # Optional. Example: Test WER
+     - type: exact_match # Required. Example: wer. Use metric id from https://hf.co/metrics
+       value: 52.8 # Required. Example: 20.90
+       name: Test EM # Optional. Example: Test WER
+     - type: f1 # Required. Example: wer. Use metric id from https://hf.co/metrics
+       value: 63.4 # Required. Example: 20.90
+       name: Test F1 # Optional. Example: Test WER
  ---
 
+ # VMware/TinyRoBERTa-quantized-mrqa
+
+ Int-8 dynamic quantized version of [VMware/tinyroberta-mrqa](https://huggingface.co/VMware/tinyroberta-mrqa).
+
+
+ ## Overview
+ - **Model name:** tinyroberta-quantized-mrqa
+ - **Model type:** Extractive Question Answering
+ - **Teacher Model:** [VMware/roberta-large-mrqa](https://huggingface.co/VMware/roberta-large-mrqa)
+ - **Full Precision Model:** [VMware/tinyroberta-mrqa](https://huggingface.co/VMware/tinyroberta-mrqa)
+ - **Training dataset:** [MRQA](https://huggingface.co/datasets/mrqa) (Machine Reading for Question Answering)
+ - **Training data size:** 516,819 examples
+ - **Language:** English
+ - **Framework:** ONNX
+ - **Model version:** 1.0
+
+ ## Usage
+
+ ### In Transformers
+ ```python
  from optimum.onnxruntime import ORTModelForQuestionAnswering
  from transformers import pipeline, AutoTokenizer
 
 
 
  qa_model = pipeline('question-answering', model=quantized_model, tokenizer=tokenizer)
 
+ qa_input = {
+     'context': "We present the results of the Machine Reading for Question Answering (MRQA) 2019 shared task on evaluating the generalization capabilities of reading comprehension systems. In this task, we adapted and unified 18 distinct question answering datasets into the same format. Among them, six datasets were made available for training, six datasets were made available for development, and the final six were hidden for final evaluation. Ten teams submitted systems, which explored various ideas including data sampling, multi-task learning, adversarial training and ensembling. The best system achieved an average F1 score of 72.5 on the 12 held-out datasets, 10.7 absolute points higher than our initial baseline based on BERT.",
+     'question': "What is MRQA?"
  }
 
  qa_answers = qa_model(qa_input)
 
  ```
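
As a rough sketch of what one might do with the result: for a single question, the `question-answering` pipeline returns a dict with `score`, `start`, `end`, and `answer` keys. The helper `summarize_answer`, its `min_score` threshold, and the `example_result` dict below are hypothetical illustrations (the dict is hand-written, not real output from this model).

```python
def summarize_answer(result, min_score=0.1):
    """Return the answer text, or None if the score is below min_score.

    `result` is assumed to have the shape of a single
    question-answering pipeline output: score, start, end, answer.
    """
    if result["score"] < min_score:
        return None
    return result["answer"].strip()


# Hand-written stand-in for a pipeline result (NOT real model output).
example_result = {
    "score": 0.87,
    "start": 30,
    "end": 68,
    "answer": "Machine Reading for Question Answering",
}

print(summarize_answer(example_result))
```

The threshold is arbitrary here; a suitable value would depend on the application and should be tuned on held-out data.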
+
+ ## Limitations and Bias
+
+ The model was trained on a large and diverse dataset, but it may still have limitations and biases in certain areas. Some limitations include:
+
+ - Language: The model is designed to work with English text only and may not perform as well on other languages.
+
+ - Domain-specific knowledge: The model was trained on a general-purpose dataset and may not perform well on questions that require domain-specific knowledge.
+
+ - Out-of-distribution questions: The model may struggle with questions that fall outside the scope of the MRQA dataset. This is best illustrated by the gap between its eval scores (69.21 EM, 79.65 F1) and its test scores (52.8 EM, 63.4 F1).
+
+ In addition, the model may reflect biases in its training data. The dataset draws questions from a variety of sources, but it may not be representative of all populations or perspectives. As a result, the model may perform better or worse on certain types of questions or texts.
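
The eval-versus-test gap mentioned above can be computed directly from the scores in the model card metadata. This is a minimal sketch; the `scores` layout and the `eval_test_gap` helper are illustrative names, and only the four numbers come from the model-index section.

```python
# Scores copied from the model-index metadata of this model card.
scores = {
    "exact_match": {"eval": 69.21, "test": 52.8},
    "f1": {"eval": 79.65, "test": 63.4},
}


def eval_test_gap(metric_scores):
    """Absolute drop from eval to test, rounded to two decimals."""
    return round(metric_scores["eval"] - metric_scores["test"], 2)


gaps = {name: eval_test_gap(vals) for name, vals in scores.items()}
print(gaps)
```

Both metrics drop by roughly 16 points, which is consistent with the out-of-distribution limitation described in the list.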