BERT L6-H256 (uncased)

Mini BERT models from https://arxiv.org/abs/1908.08962 that the Hugging Face team did not convert. These checkpoints were converted using the original conversion script.

See the original Google repo: google-research/bert

Note: it's not clear if these checkpoints have undergone knowledge distillation.

Model variants

Usage

See other BERT model cards, e.g. https://huggingface.co/bert-base-uncased
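
As a rough starting point, the snippet below sketches how this checkpoint might be loaded with the `transformers` Auto classes, following the pattern used in other BERT model cards. It assumes that `transformers` and `torch` are installed and that the `gaunernst/bert-L6-H256-uncased` repository ships tokenizer files alongside the weights; the helper name `encode_sentence` is made up for illustration and is not part of any library.

```python
MODEL_ID = "gaunernst/bert-L6-H256-uncased"


def encode_sentence(sentence, model_id=MODEL_ID):
    """Return the last hidden states for one sentence (hypothetical helper)."""
    # Imported here so that defining the helper does not require the
    # heavy dependencies; calling it needs `torch` and `transformers`.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # Assumption: the repository provides both tokenizer files and weights.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for feature extraction
        outputs = model(**inputs)

    # Shape: (batch, sequence_length, hidden_size)
    return outputs.last_hidden_state


if __name__ == "__main__":
    hidden = encode_sentence("Hello, my dog is cute.")
    print(hidden.shape)
```

If the name follows the paper's convention (L = number of layers, H = hidden size), the last dimension of the returned tensor should be 256. For fine-tuning on a downstream task, a task-specific class such as `AutoModelForSequenceClassification` could be swapped in for `AutoModel`.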

Citation

@article{turc2019,
  title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
  author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1908.08962v2},
  year={2019}
}