Ezi committed on
Commit
c213a4b
1 Parent(s): 9487bd4

Model Card


Hi!👋 This PR has some additional information for the model card, based on the format we are using as part of our effort to standardise model cards at Hugging Face. Feel free to merge if you are ok with the changes! (cc @Marissa @Meg @Nazneen)

Files changed (1)
  1. README.md +86 -4
README.md CHANGED
@@ -11,11 +11,61 @@ metrics:
# DistilBERT base model (uncased)

- This is the [uncased DistilBERT model](https://huggingface.co/distilbert-base-uncased) fine-tuned on [Multi-Genre Natural Language Inference](https://huggingface.co/datasets/multi_nli) (MNLI) dataset for the zero-shot classification task. The model is not case-sensitive, i.e., it does not make a difference between "english" and "English".

## Training

- Training is done on a [p3.2xlarge](https://aws.amazon.com/ec2/instance-types/p3/) AWS EC2 instance (1 NVIDIA Tesla V100 GPUs), with the following hyperparameters:

```
$ run_glue.py \
@@ -30,8 +80,40 @@ $ run_glue.py \
--output_dir /tmp/distilbert-base-uncased_mnli/
```

- ## Evaluation results

| Task | MNLI | MNLI-mm |
|:----:|:----:|:----:|
- | | 82.0 | 82.0 |
# DistilBERT base model (uncased)

+
+ ## Table of Contents
+ - [Model Details](#model-details)
+ - [How to Get Started With the Model](#how-to-get-started-with-the-model)
+ - [Uses](#uses)
+ - [Risks, Limitations and Biases](#risks-limitations-and-biases)
+ - [Training](#training)
+ - [Evaluation](#evaluation)
+ - [Environmental Impact](#environmental-impact)
+
+ ## Model Details
+ **Model Description:** This is the [uncased DistilBERT model](https://huggingface.co/distilbert-base-uncased) fine-tuned on the [Multi-Genre Natural Language Inference](https://huggingface.co/datasets/multi_nli) (MNLI) dataset for the zero-shot classification task.
+ - **Developed by:** The [Typeform](https://www.typeform.com/) team.
+ - **Model Type:** Zero-Shot Classification
+ - **Language(s):** English
+ - **License:** Unknown
+ - **Parent Model:** See the [distilbert base uncased model](https://huggingface.co/distilbert-base-uncased) for more information about the DistilBERT base model.
+
+
+ ## How to Get Started with the Model
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+ tokenizer = AutoTokenizer.from_pretrained("typeform/distilbert-base-uncased-mnli")
+ model = AutoModelForSequenceClassification.from_pretrained("typeform/distilbert-base-uncased-mnli")
+ ```
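As a quick usage sketch (not part of the original card, and continuing from the snippet above), the fine-tuned NLI head can score a premise/hypothesis pair; the label names are read from the checkpoint's config rather than assumed:

```python
import torch

# Rough sketch: score a single premise/hypothesis pair with the model loaded above.
premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)[0]
# Label order is taken from the checkpoint's config instead of being hard-coded.
for label_id, label in model.config.id2label.items():
    print(f"{label}: {probs[label_id].item():.3f}")
```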
+
+ ## Uses
+ This model can be used for text classification tasks, in particular zero-shot classification.
+
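For illustration (not part of the original card), a minimal zero-shot classification sketch using the transformers pipeline; the example text and candidate labels are made up:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="typeform/distilbert-base-uncased-mnli")

# The sequence and candidate labels below are arbitrary examples.
result = classifier(
    "I have a problem with my iphone that needs to be resolved asap!",
    candidate_labels=["urgent", "not urgent", "phone", "computer"],
)
print(result["labels"][0], result["scores"][0])
```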
+ ## Risks, Limitations and Biases
+ **CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.**
+
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).
+

  ## Training

+ #### Training Data
+
+ This DistilBERT-uncased model was fine-tuned on the Multi-Genre Natural Language Inference [(MultiNLI)](https://huggingface.co/datasets/multi_nli) corpus. It is a crowd-sourced collection of 433k sentence pairs annotated with textual entailment information. The corpus covers a range of genres of spoken and written text and supports a distinctive cross-genre generalization evaluation.
+
+ This model is also **not** case-sensitive, i.e., it does not make a difference between "english" and "English".
+
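A small sketch (assuming the datasets library) for taking a quick look at the MultiNLI corpus described above:

```python
from datasets import load_dataset

# Peek at one MultiNLI training example (premise, hypothesis, entailment label).
mnli = load_dataset("multi_nli", split="train")
example = mnli[0]
print(example["premise"])
print(example["hypothesis"])
print(example["label"])  # 0 = entailment, 1 = neutral, 2 = contradiction
```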
+ #### Training Procedure
+
+ Training is done on a [p3.2xlarge](https://aws.amazon.com/ec2/instance-types/p3/) AWS EC2 instance with the following hyperparameters:

  ```
$ run_glue.py \
  ...
  --output_dir /tmp/distilbert-base-uncased_mnli/
```

+ ## Evaluation
+
+ #### Evaluation Results
+ After fine-tuning on MNLI, the model achieves the following evaluation results:
+ - **Epoch =** 5.0
+ - **Evaluation Accuracy =** 0.8206875508543532
+ - **Evaluation Loss =** 0.8706700205802917
+ - **Evaluation Runtime =** 17.8278 seconds
+ - **Evaluation Samples per second =** 551.498
+
+ MNLI and MNLI-mm results:

| Task | MNLI | MNLI-mm |
|:----:|:----:|:----:|
+ | Accuracy | 82.0 | 82.0 |
+
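A rough, illustrative sketch of how the matched-MNLI accuracy above could be checked (assuming the datasets library, evaluating only a small sample for brevity, and assuming the checkpoint's label ids follow the MultiNLI order produced by run_glue.py):

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("typeform/distilbert-base-uncased-mnli")
model = AutoModelForSequenceClassification.from_pretrained("typeform/distilbert-base-uncased-mnli")
model.eval()

# Small sample of the matched validation split, for illustration only.
val = load_dataset("multi_nli", split="validation_matched").select(range(200))

correct = 0
for ex in val:
    inputs = tokenizer(ex["premise"], ex["hypothesis"], return_tensors="pt", truncation=True)
    with torch.no_grad():
        pred = model(**inputs).logits.argmax(dim=-1).item()
    # Assumes label ids 0/1/2 = entailment/neutral/contradiction in both model and dataset.
    correct += int(pred == ex["label"])

print("accuracy on sample:", correct / len(val))
```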
+ ## Environmental Impact
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). The hardware type is reported based on the [associated paper](https://arxiv.org/pdf/2105.09680.pdf).
+
+ **Hardware Type:** 1 NVIDIA Tesla V100 GPU
+
+ **Hours used:** Unknown
+
+ **Cloud Provider:** AWS EC2 P3
+
+ **Compute Region:** Unknown
+
+ **Carbon Emitted** (power consumption × time × carbon produced based on location of power grid): Unknown
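Purely as an illustration of the formula above, with placeholder numbers (none of these values are measurements for this model):

```python
# Illustrative only: all three inputs below are placeholders, not measured values.
gpu_power_kw = 0.3            # approx. 300 W board power for one V100 (assumption)
hours_used = 10.0             # unknown for this model
grid_carbon_kg_per_kwh = 0.4  # depends on the compute region's power grid

energy_kwh = gpu_power_kw * hours_used
carbon_kg = energy_kwh * grid_carbon_kg_per_kwh
print(f"{carbon_kg:.1f} kg CO2eq (placeholder estimate)")
```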