Update README.md
README.md CHANGED
@@ -4,10 +4,10 @@ language:
 license: cc-by-nc-4.0
 ---
 ## Introduction
-We introduce
-
-Notably,
-
+We introduce MMEmbed, an extension of NV-Embed-v1 with multimodal retrieval capability.
+MMEmbed achieves state-of-the-art results on the [UniIR benchmark](https://huggingface.co/TIGER-Lab/UniIR), with an average score of 52.7 compared to 48.9 (the best result in the [UniIR benchmark paper](https://eccv.ecva.net/virtual/2024/poster/863)).
+Notably, MMEmbed also improves the text retrieval accuracy of NV-Embed-v1, from 59.36 to 60.3, across the 15 retrieval tasks of the Massive Text Embedding Benchmark ([MTEB](https://arxiv.org/abs/2210.07316)).
+MMEmbed introduces several new training strategies, including modality-aware hard negative mining to improve multimodal retrieval accuracy on UniIR, and a continual text-to-text fine-tuning method that further enhances text-to-text retrieval accuracy while maintaining multimodal retrieval accuracy.
 
 <!-- For more technical details, refer to our paper: [NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models](https://arxiv.org/pdf/2405.17428). -->
 
@@ -19,7 +19,7 @@ NV-MMEmbed-v1 presents several new training strategies, including modality-aware
 
 ## How to use
 
-Here are two examples of how to encode queries and passages using Huggingface-transformer. Please find the required package version [here](https://huggingface.co/nvidia/
+Here are two examples of how to encode queries and passages using HuggingFace Transformers. Please find the required package versions [here](https://huggingface.co/nvidia/MMEmbed#1-required-packages). See more instructions for various retrieval scenarios [here](https://huggingface.co/nvidia/MMEmbed/blob/main/instructions.json).
 
 ### Usage of Multimodal Retrieval (HuggingFace Transformers)
 ```python
@@ -135,6 +135,6 @@ pip install flash-attn==2.2.0
 pip install pillow
 ```
 
-#### 2. Access to model nvidia/
+#### 2. Access to model nvidia/MMEmbed is restricted. You must be authenticated to access it.
 
 Use your huggingface access [token](https://huggingface.co/settings/tokens) to execute *"huggingface-cli login"*.
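For scripted environments, the same authentication step can be performed in Python via the standard `huggingface_hub` API. This is generic Hugging Face tooling, not anything MMEmbed-specific:

```python
# Programmatic equivalent of `huggingface-cli login`, using the standard
# huggingface_hub API; nothing here is specific to nvidia/MMEmbed.
from huggingface_hub import login

# Paste an access token created at https://huggingface.co/settings/tokens
login(token="hf_...")

# Once authenticated, the gated checkpoint downloads like any other model:
# from transformers import AutoModel
# model = AutoModel.from_pretrained("nvidia/MMEmbed", trust_remote_code=True)
```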
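The usage snippet under `### Usage of Multimodal Retrieval` is cut off at the hunk boundary above. Purely as a placeholder, here is a minimal text-to-text sketch that assumes MMEmbed inherits NV-Embed-v1's `encode(texts, instruction=..., max_length=...)` interface through `trust_remote_code`; the method name, its arguments, and the instruction string are all assumptions, so defer to the full snippet in the README:

```python
# Hedged sketch: assumes an NV-Embed-style `encode` method exposed via
# trust_remote_code; the real MMEmbed interface may differ.
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "nvidia/MMEmbed", trust_remote_code=True, torch_dtype=torch.float16
)

# Task instruction for the query side (wording is an assumption).
instruction = "Given a question, retrieve passages that answer the question."
queries = ["what is multimodal retrieval?"]
passages = ["Multimodal retrieval matches a query against text, images, or interleaved documents."]

# NV-Embed convention: queries carry the instruction, passages do not.
query_embs = model.encode(queries, instruction=instruction, max_length=512)
passage_embs = model.encode(passages, instruction="", max_length=512)

# If embeddings are L2-normalized (as in NV-Embed-v1), this is cosine similarity.
scores = query_embs @ passage_embs.T
print(scores)
```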
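The introduction also credits modality-aware hard negative mining for the UniIR gains, without describing the procedure. Purely as an illustration of the general idea (the function, its arguments, and `top_k` are all made up here, not the authors' method), restricting mined negatives to the query's target modality might look like:

```python
# Illustrative sketch of modality-aware hard negative mining; every detail
# below is an assumption, not the procedure used to train MMEmbed.
import numpy as np

def mine_hard_negatives(query_emb, cand_embs, cand_modalities,
                        positive_idx, target_modality, top_k=5):
    """Return indices of the top_k highest-scoring negatives whose modality
    matches the query's target modality (e.g. "text", "image", "image,text")."""
    scores = cand_embs @ query_emb                # similarity to every candidate
    scores[positive_idx] = -np.inf                # never mine the positive itself
    keep = np.array([m == target_modality for m in cand_modalities])
    scores[~keep] = -np.inf                       # drop wrong-modality candidates
    return np.argsort(-scores)[:top_k]

# Toy usage: four candidates, candidate 0 is the positive.
rng = np.random.default_rng(0)
cand_embs = rng.normal(size=(4, 8))
query_emb = cand_embs[0] + 0.1 * rng.normal(size=8)
print(mine_hard_negatives(query_emb, cand_embs, ["text", "text", "image", "text"], 0, "text"))
```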