prajjwal1 committed on
Commit
1a107d9
1 Parent(s): c670d7a

updated meta data

Files changed (1)
  1. README.md +16 -13
README.md CHANGED
@@ -1,3 +1,19 @@
+ ---
+ language:
+ - en
+
+ license:
+ - mit
+
+ tags:
+ - BERT
+ - MNLI
+ - NLI
+ - transformer
+ - pre-training
+
+ ---
+
  The following model is a Pytorch pre-trained model obtained from converting Tensorflow checkpoint found in the [official Google BERT repository](https://github.com/google-research/bert).
 
  This is one of the smaller pre-trained BERT variants, together with [bert-mini](https://huggingface.co/prajjwal1/bert-mini), [bert-tiny](https://huggingface.co/prajjwal1/bert-tiny), [bert-small](https://huggingface.co/prajjwal1/bert-small) and [bert-medium](https://huggingface.co/prajjwal1/bert-medium). They were introduced in the study [Well-Read Students Learn Better: On the Importance of Pre-training Compact Models](https://arxiv.org/abs/1908.08962), and ported to HF for the study [Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics](https://arxiv.org/abs/2110.01518). These models are supposed to be trained on a downstream task.
@@ -43,16 +59,3 @@ Original Implementation and more info can be found in [this Github repository](h
 
  Twitter: [@prajjwal_1](https://twitter.com/prajjwal_1)
 
-
- ---
- language:
- - en
-
- tags:
- - BERT
- - MNLI
- - NLI
- - transformer
- - pre-training
-
- ---
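
The Hub reads model card metadata from a YAML block delimited by `---` at the very top of README.md, which is presumably why this commit moves the block from the end of the file to the start. The sketch below is an illustration only and not part of the commit: `split_front_matter` and `top_level_keys` are hypothetical helper names, the README bodies are shortened, and the key scan is a naive line check, not a real YAML parser.

```python
def split_front_matter(readme_text):
    """Return (metadata_lines, body) if the text opens with a '---' block,
    otherwise (None, readme_text)."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, readme_text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    return None, readme_text  # opening delimiter was never closed


def top_level_keys(metadata_lines):
    """Collect unindented 'key:' names. Naive scan, not a YAML parser."""
    keys = []
    for line in metadata_lines:
        if line and not line[0].isspace() and not line.startswith("-") and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


# README layout after this commit (body shortened for illustration).
NEW_README = """---
language:
- en

license:
- mit

tags:
- BERT
- MNLI
- NLI
- transformer
- pre-training

---

The following model is a Pytorch pre-trained model.
"""

# README layout before this commit: the block sat at the end of the file.
OLD_README = """The following model is a Pytorch pre-trained model.

---
language:
- en
---
"""

new_meta, new_body = split_front_matter(NEW_README)
old_meta, _ = split_front_matter(OLD_README)
new_keys = top_level_keys(new_meta)
print(new_keys)   # keys found in the leading metadata block
print(old_meta)   # None: the old layout has no leading block
```

With the old layout the helper finds no leading block at all, while the new layout exposes the `language`, `license` and `tags` keys.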