# TinyBERT_L-4_H-312_v2 ONNX Model

This repository provides an ONNX version of the TinyBERT_L-4_H-312_v2 model, originally developed by the team at Huawei Noah's Ark Lab and ported to Transformers by Nils Reimers.
The model is a compact version of BERT designed for efficient inference and a reduced memory footprint. The ONNX export includes mean pooling of the last hidden layer for convenient feature extraction.
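Mean pooling here means averaging the token embeddings of the last hidden layer to produce one fixed-size vector per sentence. As a rough sketch of the operation (whether the exported graph masks padding tokens exactly this way is an assumption; `mean_pool` is an illustrative name, not part of this repository):

```python
import numpy as np

def mean_pool(last_hidden_state: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    last_hidden_state: (batch, seq_len, hidden) float array
    attention_mask:    (batch, seq_len) 0/1 int array
    """
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)  # (batch, seq_len, 1)
    summed = (last_hidden_state * mask).sum(axis=1)                   # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                    # avoid division by zero
    return summed / counts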
## Model Overview
TinyBERT is a smaller version of BERT that maintains competitive performance while significantly reducing the number of parameters and computational cost. This makes it ideal for deployment in resource-constrained environments. The model is based on the work presented in the paper "TinyBERT: Distilling BERT for Natural Language Understanding".
## License
This model is distributed under the Apache 2.0 License. For more details, please refer to the license file in the original repository.
## Model Details
- Model: TinyBERT_L-4_H-312_v2
- Layers: 4
- Hidden Size: 312
- Pooling: Mean pooling of the last hidden layer
- Format: ONNX
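If you want to confirm these details against the exported graph itself, you can list its declared inputs and outputs with onnxruntime (the exact input/output names and shapes depend on how the model was exported, so treat the printed values as authoritative rather than this README):

```python
import onnxruntime as ort

sess = ort.InferenceSession("TinyBERT_L-4_H-312_v2-onnx/tinybert_mean_embeddings.onnx")

# Print each declared input and output with its symbolic shape and element type.
for inp in sess.get_inputs():
    print("input: ", inp.name, inp.shape, inp.type)
for out in sess.get_outputs():
    print("output:", out.name, out.shape, out.type)
```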
## Usage

To use this model, you will need onnxruntime and transformers installed. You can install both via pip:

```bash
pip install onnxruntime transformers
```
Below is a Python code snippet demonstrating how to run inference using this ONNX model:
```python
import onnxruntime as ort
from transformers import AutoTokenizer

# Directory containing the tokenizer files and the ONNX model.
model_path = "TinyBERT_L-4_H-312_v2-onnx/"
tokenizer = AutoTokenizer.from_pretrained(model_path)
ort_sess = ort.InferenceSession(model_path + "tinybert_mean_embeddings.onnx")

# Tokenize a batch of sentences; return_tensors="np" yields NumPy arrays for onnxruntime.
features = tokenizer(
    ["How many people live in Berlin?",
     "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
     "New York City is famous for the Metropolitan Museum of Art."],
    padding=True, truncation=True, return_tensors="np")

# The exported graph does not take token_type_ids, so drop them from the feed.
onnx_inputs = {k: v for k, v in features.items() if k != "token_type_ids"}

# The first output is the mean-pooled sentence embedding, one vector per input sentence.
mean_pooled_output = ort_sess.run(None, onnx_inputs)[0]
print("Mean pooled output:", mean_pooled_output)
```
Make sure to replace model_path with the actual path to the directory containing your ONNX model and tokenizer files.
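The resulting embeddings can be compared directly, for example with cosine similarity. A minimal follow-up sketch, assuming mean_pooled_output from the snippet above with shape (batch, 312):

```python
import numpy as np

# Normalize each embedding to unit length, then compare all pairs via dot products.
emb = mean_pooled_output / np.linalg.norm(mean_pooled_output, axis=1, keepdims=True)
similarities = emb @ emb.T  # (batch, batch) matrix of cosine similarities

print("question vs. Berlin answer:    ", similarities[0, 1])
print("question vs. New York sentence:", similarities[0, 2])
```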
## Training Details
For detailed information on the training process of TinyBERT, please refer to the original paper by Huawei Noah's Ark Lab.
## Acknowledgements
This model is based on the work by the team at Huawei Noah's Ark Lab and by Nils Reimers. Special thanks to the developers for providing the pre-trained model and making it accessible to the community.