Update README.md
README.md CHANGED
@@ -4,10 +4,10 @@ language:
 license: cc-by-nc-4.0
 ---
 ## Introduction
-We introduce
-
-Notably,
-
+We introduce MMEmbed, an extension of NV-Embed-v1 with multimodal retrieval capability.
+MMEmbed achieves state-of-the-art results on the [UniIR benchmark](https://huggingface.co/TIGER-Lab/UniIR), with an average score of 52.7 compared to 48.9 (the best result in the [UniIR benchmark paper](https://eccv.ecva.net/virtual/2024/poster/863)).
+Notably, MMEmbed also improves the text retrieval accuracy of NV-Embed-v1, from 59.36 to 60.3, across the 15 retrieval tasks of the Massive Text Embedding Benchmark ([MTEB](https://arxiv.org/abs/2210.07316)).
+MMEmbed introduces several new training strategies, including modality-aware hard negative mining to improve multimodal retrieval accuracy on UniIR, and a continual text-to-text fine-tuning method that further enhances text-to-text retrieval accuracy while maintaining multimodal retrieval accuracy.
 
 <!-- For more technical details, refer to our paper: [NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models](https://arxiv.org/pdf/2405.17428). -->
 
@@ -19,7 +19,7 @@ NV-MMEmbed-v1 presents several new training strategies, including modality-aware
 
 ## How to use
 
-Here are two examples of how to encode queries and passages using Huggingface-transformer. Please find the required package version [here](https://huggingface.co/nvidia/
+Here are two examples of how to encode queries and passages using HuggingFace Transformers. Please find the required package versions [here](https://huggingface.co/nvidia/MMEmbed#1-required-packages). See more instructions for various retrieval scenarios [here](https://huggingface.co/nvidia/MMEmbed/blob/main/instructions.json).
 
 ### Usage of Multimodal Retrieval (HuggingFace Transformers)
 ```python
@@ -135,6 +135,6 @@ pip install flash-attn==2.2.0
 pip install pillow
 ```
 
-#### 2. Access to model nvidia/
+#### 2. Access to model nvidia/MMEmbed is restricted. You must be authenticated to access it.
 
 Use your huggingface access [token](https://huggingface.co/settings/tokens) to execute *"huggingface-cli login"*.
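For scripted environments, the same authentication step can be performed in Python via the standard `huggingface_hub` API. This is generic Hugging Face tooling, not anything MMEmbed-specific:

```python
# Programmatic equivalent of `huggingface-cli login`, using the standard
# huggingface_hub API; nothing here is specific to nvidia/MMEmbed.
from huggingface_hub import login

# Paste an access token created at https://huggingface.co/settings/tokens
login(token="hf_...")

# Once authenticated, the gated checkpoint downloads like any other model:
# from transformers import AutoModel
# model = AutoModel.from_pretrained("nvidia/MMEmbed", trust_remote_code=True)
```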
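The usage snippet under `### Usage of Multimodal Retrieval` is cut off at the hunk boundary above. Purely as a placeholder, here is a minimal text-to-text sketch that assumes MMEmbed inherits NV-Embed-v1's `encode(texts, instruction=..., max_length=...)` interface through `trust_remote_code`; the method name, its arguments, and the instruction string are all assumptions, so defer to the full snippet in the README:

```python
# Hedged sketch: assumes an NV-Embed-style `encode` method exposed via
# trust_remote_code; the real MMEmbed interface may differ.
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "nvidia/MMEmbed", trust_remote_code=True, torch_dtype=torch.float16
)

# Task instruction for the query side (wording is an assumption).
instruction = "Given a question, retrieve passages that answer the question."
queries = ["what is multimodal retrieval?"]
passages = ["Multimodal retrieval matches a query against text, images, or interleaved documents."]

# NV-Embed convention: queries carry the instruction, passages do not.
query_embs = model.encode(queries, instruction=instruction, max_length=512)
passage_embs = model.encode(passages, instruction="", max_length=512)

# If embeddings are L2-normalized (as in NV-Embed-v1), this is cosine similarity.
scores = query_embs @ passage_embs.T
print(scores)
```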
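The introduction also credits modality-aware hard negative mining for the UniIR gains, without describing the procedure. Purely as an illustration of the general idea (the function, its arguments, and `top_k` are all made up here, not the authors' method), restricting mined negatives to the query's target modality might look like:

```python
# Illustrative sketch of modality-aware hard negative mining; every detail
# below is an assumption, not the procedure used to train MMEmbed.
import numpy as np

def mine_hard_negatives(query_emb, cand_embs, cand_modalities,
                        positive_idx, target_modality, top_k=5):
    """Return indices of the top_k highest-scoring negatives whose modality
    matches the query's target modality (e.g. "text", "image", "image,text")."""
    scores = cand_embs @ query_emb                # similarity to every candidate
    scores[positive_idx] = -np.inf                # never mine the positive itself
    keep = np.array([m == target_modality for m in cand_modalities])
    scores[~keep] = -np.inf                       # drop wrong-modality candidates
    return np.argsort(-scores)[:top_k]

# Toy usage: four candidates, candidate 0 is the positive.
rng = np.random.default_rng(0)
cand_embs = rng.normal(size=(4, 8))
query_emb = cand_embs[0] + 0.1 * rng.normal(size=8)
print(mine_hard_negatives(query_emb, cand_embs, ["text", "text", "image", "text"], 0, "text"))
```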