nreimers

BERT-Small-L-4_H-512_A-8

This is a port of the BERT-Small model to PyTorch. It uses 4 layers, a hidden size of 512, and 8 attention heads.
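To make the shape in the model name concrete, here is a minimal sketch in plain Python that derives the per-head dimension and a rough parameter count from those three numbers. Only the layer count, hidden size, and head count come from this card. The vocabulary size (30522), maximum position count (512), token-type count (2), the 4x feed-forward width, and the presence of a pooler layer are assumptions based on the usual BERT defaults, and `BertShape`, `head_dim`, and `count_parameters` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BertShape:
    # Values stated in the model name: L-4, H-512, A-8.
    num_layers: int = 4
    hidden_size: int = 512
    num_heads: int = 8
    # Assumed values (common BERT defaults, not stated on this card).
    vocab_size: int = 30522
    max_positions: int = 512
    type_vocab_size: int = 2
    ffn_multiplier: int = 4


def head_dim(cfg: BertShape) -> int:
    """Width of a single attention head: hidden size split evenly across heads."""
    if cfg.hidden_size % cfg.num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    return cfg.hidden_size // cfg.num_heads


def count_parameters(cfg: BertShape) -> int:
    """Approximate encoder parameter count under the assumptions above."""
    h = cfg.hidden_size
    ffn = cfg.ffn_multiplier * h
    layer_norm = 2 * h  # scale and bias vectors

    # Word, position, and token-type embeddings, followed by a LayerNorm.
    embeddings = (
        cfg.vocab_size * h
        + cfg.max_positions * h
        + cfg.type_vocab_size * h
        + layer_norm
    )

    # Query, key, value, and output projections, each with a bias.
    attention = 4 * (h * h + h) + layer_norm
    # Two-layer feed-forward block, each linear layer with a bias.
    feed_forward = (h * ffn + ffn) + (ffn * h + h) + layer_norm
    per_layer = attention + feed_forward

    # Assumed pooler: one dense layer applied to the first token.
    pooler = h * h + h

    return embeddings + cfg.num_layers * per_layer + pooler


cfg = BertShape()
print("head dimension:", head_dim(cfg))
print("approx. parameters:", count_parameters(cfg))
```

Under these assumptions each head works on a 64-dimensional slice, and most of the parameter budget sits in the word embeddings rather than in the 4 transformer layers. The actual checkpoint may differ if its vocabulary or feed-forward width departs from the defaults assumed here.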