Qdrant/bm25

Anush008 committed on Jul 10, 2024

Commit be9738f (verified) · 1 Parent(s): 7ae3a4f

Added a model card.

Files changed (1)

README.md CHANGED

@@ -1,3 +1,38 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+- en
+pipeline_tag: sentence-similarity
+---
+Repository with files to perform BM25 searches with [FastEmbed](https://github.com/qdrant/fastembed).
+[BM25 (Best Matching 25)](https://en.wikipedia.org/wiki/Okapi_BM25) is a ranking function used by search engines to estimate the relevance of documents to a given search query.
+### Usage
+Here's an example of BM25 with [FastEmbed](https://github.com/qdrant/fastembed).
+```py
+from fastembed import SparseTextEmbedding
+documents = [
+    "You should stay, study and sprint.",
+    "History can only prepare us to be surprised yet again.",
+]
+model = SparseTextEmbedding(model_name="Qdrant/bm25")
+embeddings = list(model.embed(documents))
+# [
+#     SparseEmbedding(
+#         values=array([1.67419738, 1.67419738, 1.67419738, 1.67419738]),
+#         indices=array([171321964, 1881538586, 150760872, 1932363795])),
+#     SparseEmbedding(
+#         values=array([1.66973021, 1.66973021, 1.66973021, 1.66973021, 1.66973021]),
+#         indices=array([578407224, 1849833631, 1008800696, 2090661150, 1117393019]))
+# ]
+```
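
For readers unfamiliar with the ranking function the model card links to, here is a minimal sketch of classic Okapi BM25 scoring in plain Python. It is not FastEmbed's implementation: the helper names `tokenize` and `bm25_scores`, the naive regex tokenizer, the defaults `k1=1.2` and `b=0.75`, and the smoothed IDF variant `log(1 + (N - n + 0.5) / (n + 0.5))` are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and keep alphanumeric runs only.
    # Real BM25 pipelines usually also remove stop words and apply stemming.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Return one classic Okapi BM25 score per document for the query."""
    if not documents:
        return []

    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n = doc_freq[term]
            # Smoothed IDF variant (an assumption); it is always positive.
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            # Term-frequency saturation with document-length normalization.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


documents = [
    "You should stay, study and sprint.",
    "History can only prepare us to be surprised yet again.",
]
print(bm25_scores("history surprised", documents))
```

With the two documents from the README example, only the second one contains the query terms, so it is the only one that receives a non-zero score.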