Commit e143fda by ozanarmagan
1 parent: 7bc7521

Create README.md

  1. README.md +22 -0
README.md ADDED
@@ -0,0 +1,22 @@
+ ## Typesense Public Embedding Models
+ This repo stores the embedding models we currently support. You can also convert your own model to ONNX format and open a PR to add it to the list of supported models.
+
+ ### Convert a model to ONNX format
+
+ #### Converting a Hugging Face Transformers Model
+ Follow [these instructions](https://huggingface.co/docs/transformers/serialization#export-to-onnx) to convert any Hugging Face model to ONNX format using ```optimum-cli```.
+ #### Converting a PyTorch Model
+ Use the ```torch.onnx``` [APIs](https://pytorch.org/docs/stable/onnx.html) to convert PyTorch models to ONNX.
+ #### Converting a TensorFlow Model
+ Use the ```tf2onnx``` [tool](https://onnxruntime.ai/docs/tutorials/tf-get-started.html#getting-started-converting-tensorflow-to-onnx) to convert TensorFlow models to ONNX.
+
+ ### Creating model config
+ Before creating a PR with your ONNX model, store the model file, vocab file and model config file in a folder named after the model. The model config file must be named ```config.json``` and should contain the following keys:
+ | Key | Description | Optional |
+ |-----|-------------|----------|
+ | model_md5 | MD5 checksum of the model file, as a string | No |
+ | vocab_md5 | MD5 checksum of the vocab file, as a string | No |
+ | model_type | Model type (currently only ```bert``` and ```xlm_roberta``` are supported) | No |
+ | vocab_file_name | File name of the vocab file | No |
+ | indexing_prefix | Prefix to be added before embedding documents | Yes |
+ | query_prefix | Prefix to be added before embedding queries | Yes |
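
The keys in the table above can be assembled with a short script. The following is a minimal sketch, assuming the model folder already contains the ONNX model file and the vocab file. The helper names (`md5_of_file`, `build_config`, `write_config`) and the file names in the usage note are hypothetical and are not part of any Typesense tooling.

```python
import hashlib
import json
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_config(model_path, vocab_path, model_type,
                 indexing_prefix=None, query_prefix=None):
    """Build a dict with the keys listed in the table above."""
    if model_type not in ("bert", "xlm_roberta"):
        raise ValueError(f"unsupported model_type: {model_type}")
    config = {
        "model_md5": md5_of_file(model_path),
        "vocab_md5": md5_of_file(vocab_path),
        "model_type": model_type,
        "vocab_file_name": Path(vocab_path).name,
    }
    # Optional keys are written only when a prefix is given.
    if indexing_prefix is not None:
        config["indexing_prefix"] = indexing_prefix
    if query_prefix is not None:
        config["query_prefix"] = query_prefix
    return config


def write_config(model_dir, model_file, vocab_file, model_type, **prefixes):
    """Write config.json next to the model and vocab files in model_dir."""
    model_dir = Path(model_dir)
    config = build_config(model_dir / model_file, model_dir / vocab_file,
                          model_type, **prefixes)
    (model_dir / "config.json").write_text(json.dumps(config, indent=2))
    return config
```

For example, `write_config("my-model", "model.onnx", "vocab.txt", "bert")` would write `my-model/config.json` with the four required keys, assuming those two files exist in that folder.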