---
license: mit
language:
- en
---

# BERT-Small (uncased)

This is one of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) released by [google-research/bert](https://github.com/google-research/bert). These BERT models were originally released as TensorFlow checkpoints; this is the version converted to PyTorch. More information can be found in [google-research/bert](https://github.com/google-research/bert) or [lyeoni/convert-tf-to-pytorch](https://github.com/lyeoni/convert-tf-to-pytorch).

## Evaluation

Here are the evaluation scores (F1/Accuracy) for the MRPC task.

|Model|MRPC|
|-|:-:|
|BERT-Tiny|81.22/68.38|
|BERT-Mini|81.43/69.36|
|BERT-Small|81.41/70.34|
|BERT-Medium|83.33/73.53|
|BERT-Base|85.62/78.19|

### References

```
@article{turc2019,
  title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
  author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1908.08962v2},
  year={2019}
}
```
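## Usage

Because this checkpoint has been converted to PyTorch, it can be loaded with the Hugging Face `transformers` library. The snippet below is a minimal sketch; `MODEL_ID` is a placeholder for wherever this converted checkpoint is actually hosted, not a confirmed repository name.

```python
from transformers import AutoTokenizer, AutoModel

# Placeholder: replace with the actual repository name or local path
# of this converted BERT-Small checkpoint.
MODEL_ID = "path/to/bert-small-uncased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

# Tokenize a sentence and run a forward pass to get contextual embeddings.
inputs = tokenizer("Hello, BERT-Small!", return_tensors="pt")
outputs = model(**inputs)

# last_hidden_state has shape (batch_size, sequence_length, hidden_size);
# BERT-Small uses 4 layers with a hidden size of 512.
print(outputs.last_hidden_state.shape)
```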