## Typesense Public Embedding Models
This repo stores the embedding models that Typesense currently supports. You can also convert your own model to ONNX format and open a PR to add it to the list of supported models.
### Convert a model to ONNX format
#### Converting a Hugging Face Transformers Model
Follow the instructions in [this guide](https://huggingface.co/docs/transformers/serialization#export-to-onnx) to convert any Hugging Face model to ONNX format using ```optimum-cli```.
#### Converting a PyTorch Model
You can use the ```torch.onnx``` [APIs](https://pytorch.org/docs/stable/onnx.html) to convert PyTorch models to ONNX.
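
As a rough illustration of the ```torch.onnx``` route, the sketch below wraps ```torch.onnx.export``` in a small helper. The tensor names (```input_ids```, ```attention_mask```, ```last_hidden_state```), the output path ```model.onnx```, and the helper names are assumptions for illustration only; they are not taken from this repo, so adjust them to match your model.

```python
def dynamic_axes_for(names):
    """Mark dimension 0 (batch) and dimension 1 (sequence) as variable-length."""
    return {name: {0: "batch", 1: "sequence"} for name in names}


def export_to_onnx(model, example_inputs, output_path="model.onnx"):
    """Export a PyTorch module to ONNX.

    `example_inputs` is a tuple of tensors passed positionally to the model.
    """
    import torch  # imported lazily so the helper above works without PyTorch

    input_names = ["input_ids", "attention_mask"]  # assumed input names
    output_names = ["last_hidden_state"]           # assumed output name

    model.eval()  # switch off dropout and similar layers before export
    torch.onnx.export(
        model,
        example_inputs,
        output_path,
        input_names=input_names,
        output_names=output_names,
        dynamic_axes=dynamic_axes_for(input_names + output_names),
    )
    return output_path
```

Declaring the batch and sequence dimensions as dynamic lets the exported graph accept inputs of varying length instead of only the shape of the example tensors.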
#### Converting a TensorFlow Model
You can use the ```tf2onnx``` [tool](https://onnxruntime.ai/docs/tutorials/tf-get-started.html#getting-started-converting-tensorflow-to-onnx) to convert TensorFlow models to ONNX.
### Creating model config
Before creating a PR with your ONNX model, store the model file, the vocab file, and the model config file in a folder named after the model. The model config file must be named ```config.json``` and should contain the following keys:
| Key | Description | Optional |
|-----|-------------|----------|
|model_md5| MD5 checksum of model file as string| No |
|vocab_md5| MD5 checksum of vocab file as string| No |
|model_type| Model type (currently only ```bert``` and ```xlm_roberta``` are supported)| No |
|vocab_file_name| File name of vocab file| No |
|indexing_prefix| Prefix to be added before embedding documents| Yes |
|query_prefix| Prefix to be added before embedding queries | Yes |
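
The keys above can be filled in with a short script. The following is a minimal sketch, not an official tool: the file names ```model.onnx``` and ```vocab.txt```, the helper names, and the use of lowercase hex digests are assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_config(model_dir, model_file="model.onnx", vocab_file="vocab.txt",
                 model_type="bert", indexing_prefix=None, query_prefix=None):
    """Write config.json into model_dir and return the config dict."""
    model_dir = Path(model_dir)
    config = {
        "model_md5": md5_of_file(model_dir / model_file),
        "vocab_md5": md5_of_file(model_dir / vocab_file),
        "model_type": model_type,
        "vocab_file_name": vocab_file,
    }
    # Optional keys are written only when a prefix is supplied.
    if indexing_prefix is not None:
        config["indexing_prefix"] = indexing_prefix
    if query_prefix is not None:
        config["query_prefix"] = query_prefix
    (model_dir / "config.json").write_text(json.dumps(config, indent=2))
    return config
```

For example, ```build_config("my-model", model_type="xlm_roberta", query_prefix="query: ")``` would hash the two files in the hypothetical ```my-model``` folder and write ```config.json``` next to them.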