Files changed (1)
  1. README.md +81 -0
README.md CHANGED
@@ -2766,6 +2766,87 @@ print(similarities)
 
The API comes with native INT8 and binary quantization support! Check out the [docs](https://mixedbread.ai/docs) for more information.
 
+ ### Binary Matryoshka Representation Learning
+
+ We offer both MRL and binary quantization through our API, and both are also supported by Sentence Transformers. The examples below show how to install and use these features:
+
+ <CodeGroup title="Installation">
+ ```bash {{title: "Python (API)"}}
+ pip install -U mixedbread-ai
+ ```
+
+ ```bash {{title: "JavaScript (API)"}}
+ npm i @mixedbread-ai/sdk
+ ```
+
+ ```bash {{title: "Python"}}
+ pip install -U sentence-transformers
+ ```
+ </CodeGroup>
+
+ <CodeGroup title="Usage Example">
+ ```python {{title: "Python (API)"}}
+ from mixedbread_ai.client import MixedbreadAI
+
+ mxbai = MixedbreadAI(api_key="{MIXEDBREAD_API_KEY}")
+
+ res = mxbai.embeddings(
+     model='mixedbread-ai/mxbai-embed-large-v1',
+     input=[
+         'Who is german and likes bread?',
+         'Everybody in Germany.'
+     ],
+     normalized=True,  # must be True if you want to use the binary embeddings with faiss
+     encoding_format=['ubinary', 'float'],
+     dimensions=512
+ )
+
+ res.data[0].embedding.ubinary
+ res.data[0].embedding.float_
+ res.dimensions
+ ```
+
+ ```javascript {{title: "JavaScript (API)"}}
+ import { MixedbreadAIClient } from "@mixedbread-ai/sdk";
+
+ const mxbai = new MixedbreadAIClient({
+     apiKey: "{MIXEDBREAD_API_KEY}"
+ });
+
+ const res = await mxbai.embeddings({
+     model: 'mixedbread-ai/mxbai-embed-large-v1',
+     input: [
+         'Who is german and likes bread?',
+         'Everybody in Germany.'
+     ],
+     normalized: true, // must be true if you want to use the binary embeddings with faiss
+     encoding_format: ['ubinary', 'float'],
+     dimensions: 512
+ })
+
+ console.log(res.data[0].embedding.ubinary, res.data[0].embedding.float, res.dimensions)
+ ```
+
+ ```python {{title: "Python"}}
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.quantization import quantize_embeddings
+
+ # 1. Load an embedding model
+ model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
+
+ # 2. Encode some text and select MRL dimensions
+ mrl_embeddings = model.encode(
+     ["Who is german and likes bread?", "Everybody in Germany."], normalize_embeddings=True)[..., :512]
+
+ # 3. Apply binary quantization
+ binary_embeddings = quantize_embeddings(mrl_embeddings, precision="binary")
+ ```
+ </CodeGroup>
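
The `normalized=True` / `normalize_embeddings=True` flags above matter because the packed `ubinary` embeddings are intended to be searched with a binary faiss index. As a minimal illustrative sketch (not part of the official docs; it assumes `faiss-cpu` is installed and uses made-up corpus and query strings), one way to wire that up looks like this:

```python
import faiss
from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
dim = 512  # MRL truncation, matching the examples above

# Encode the corpus, truncate to 512 MRL dimensions, and pack to unsigned binary
# (uint8, 8 dimensions per byte -> 64 bytes per vector).
corpus = ["Everybody in Germany.", "Bread is a staple food."]
corpus_emb = model.encode(corpus, normalize_embeddings=True)[..., :dim]
corpus_bin = quantize_embeddings(corpus_emb, precision="ubinary")

# Index the packed vectors in a Hamming-distance (binary) faiss index.
index = faiss.IndexBinaryFlat(dim)
index.add(corpus_bin)

# Queries go through the same encode -> truncate -> quantize pipeline.
query_emb = model.encode(["Who is german and likes bread?"], normalize_embeddings=True)[..., :dim]
query_bin = quantize_embeddings(query_emb, precision="ubinary")
distances, ids = index.search(query_bin, 2)
print(distances, ids)
```

Hamming distance over the packed bits serves as the fast first stage here; a common pattern is to rescore the top candidates with the float (or int8) embeddings to recover most of the accuracy.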
+
+ ### Why is this relevant?
+
+ Binary MRL allows for significant savings in memory usage, and therefore cost, when using a vector database; a back-of-the-envelope sketch follows below. Read more about the technology and its advantages in our [blog post](https://www.mixedbread.ai/blog/binary-mrl).
+
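As a rough, illustrative calculation (our own arithmetic, assuming the model's native 1024 float32 dimensions versus the 512-dimension binary setup from the examples above):

```python
# Back-of-the-envelope storage estimate for a vector database.
# Illustrative assumptions: 100M vectors, 1024-dim float32 vs. 512-dim binary MRL.
num_vectors = 100_000_000
full_float32 = 1024 * 4   # 4096 bytes per vector
mrl_binary = 512 // 8     # 64 bytes per vector (1 bit per dimension)

print(f"float32, 1024 dims: {num_vectors * full_float32 / 1e9:.0f} GB")  # ~410 GB
print(f"binary,   512 dims: {num_vectors * mrl_binary / 1e9:.1f} GB")    # ~6.4 GB
print(f"reduction: {full_float32 / mrl_binary:.0f}x")                    # 64x
```
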
## Evaluation
As of March 2024, our model achieves SOTA performance for BERT-large sized models on the [MTEB](https://huggingface.co/spaces/mteb/leaderboard) leaderboard. It outperforms commercial models like OpenAI's text-embedding-3-large and matches the performance of models 20x its size, such as [echo-mistral-7b](https://huggingface.co/jspringer/echo-mistral-7b-instruct-lasttoken). Our model was trained with no overlap with the MTEB data, which indicates that it generalizes well across several domains, tasks, and text lengths. We are aware of some limitations with this model, which will be addressed in v2.
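
If you want to spot-check individual tasks yourself, a minimal sketch using the `mteb` package might look like the following (the task choice and output folder are arbitrary examples, not the full benchmark run):

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Load the model and run a single MTEB task; the leaderboard numbers cover the full suite.
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="results/mxbai-embed-large-v1")
print(results)
```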