mys/bert-base-turkish-cased-nli-mean-faq-mnr


mys committed on Nov 24, 2021

Commit 09a3a3a • 1 Parent(s): 8df1b27

Fix further bugs in styling

Files changed (1)

README.md +1 -1

@@ -3,7 +3,7 @@
 Google supported this work by providing Google Cloud credit. Thank you Google for supporting the open source! 🎉
 ## Model
-This is a finetuned version of [mys/bert-base-turkish-cased-nli-mean](https://huggingface.co/) for FAQ retrieval, which is itself a finetuned version of [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased) for NLI. It maps questions & answers to 768 dimensional vectors to be used for FAQ-style chatbots and answer retrieval in question-answering pipelines. It was trained on the Turkish subset of [clips/mqa](https://huggingface.co/datasets/clips/mqa) dataset after some cleaning/ filtering and with a Multiple Negatives Symmetric Ranking loss. Before finetuning, I added two special tokens to the tokenizer (i.e., <Q> for questions and <A> for answers) and resized the model embeddings, so you need to prepend the relevant tokens to the sequences before feeding them into the model. Please have a look at [my accompanying repo](https://github.com/monatis/trfaq) to see how it was finetuned and how it can be used in inference. The following code snippet is an excerpt from the inference at the repo.
+This is a finetuned version of [mys/bert-base-turkish-cased-nli-mean](https://huggingface.co/) for FAQ retrieval, which is itself a finetuned version of [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased) for NLI. It maps questions & answers to 768 dimensional vectors to be used for FAQ-style chatbots and answer retrieval in question-answering pipelines. It was trained on the Turkish subset of [clips/mqa](https://huggingface.co/datasets/clips/mqa) dataset after some cleaning/ filtering and with a Multiple Negatives Symmetric Ranking loss. Before finetuning, I added two special tokens to the tokenizer (i.e., `<Q>` for questions and `<A>` for answers) and resized the model embeddings, so you need to prepend the relevant tokens to the sequences before feeding them into the model. Please have a look at [my accompanying repo](https://github.com/monatis/trfaq) to see how it was finetuned and how it can be used in inference. The following code snippet is an excerpt from the inference at the repo.
 ## Usage
 ```python