Commit 2ab36c4 · fix: lint
Parent(s): 3b25ef5
medrag_multi_modal/retrieval/colpali_retrieval.py

@@ -49,7 +49,7 @@ class CalPaliRetriever(weave.Model):
             if metadata_dataset_name
             else None
         )
-
+
     def index(self, data_artifact_name: str, weave_dataset_name: str, index_name: str):
         """
         Indexes a dataset of documents and saves the index as a Weave artifact.
@@ -62,7 +62,7 @@ class CalPaliRetriever(weave.Model):
         If a Weave run is active, the method creates a new Weave artifact with the specified
         index name and type "colpali-index". It adds the local index directory to the artifact
         and saves it to Weave, including metadata with the provided Weave dataset name.
-
+
         !!! example "Indexing Data"
             First you need to install `Byaldi` library by Answer.ai.

@@ -84,11 +84,11 @@ class CalPaliRetriever(weave.Model):
                 index_name="grays-anatomy",
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for ColPali by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
@@ -132,7 +132,7 @@ class CalPaliRetriever(weave.Model):
         model from the index path within the index artifact directory. Finally, it returns
         an instance of the class initialized with the retrieved document retrieval model,
         metadata dataset name, and data artifact directory.
-
+
         !!! example "Retrieving Documents"
             First you need to install `Byaldi` library by Answer.ai.

@@ -155,11 +155,11 @@ class CalPaliRetriever(weave.Model):
                 data_artifact_name="ml-colabs/medrag-multi-modal/grays-anatomy-images:v1",
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for ColPali by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
@@ -195,7 +195,7 @@ class CalPaliRetriever(weave.Model):
         This function uses the document retrieval model to search for the most relevant
         documents based on the provided query. It returns a list of dictionaries, each
         containing the document image, document ID, and the relevance score.
-
+
         !!! example "Retrieving Documents"
             First you need to install `Byaldi` library by Answer.ai.

@@ -222,11 +222,11 @@ class CalPaliRetriever(weave.Model):
                 top_k=3,
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for ColPali by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
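The docstrings touched by these hunks describe an index-then-query workflow for `CalPaliRetriever`: `index()` builds a ColPali index over a Weave dataset of document images and saves it as a "colpali-index" artifact, and `predict()` returns a list of dictionaries containing the document image, document ID, and relevance score. A minimal sketch of that workflow is below; the `index()` signature is taken from the diff, while the project, dataset, and artifact names, the `predict()` keyword arguments, and the result keys are assumptions, since the full class body is not shown here. The docstrings also note that the `Byaldi` library by Answer.ai must be installed first.

```python
# Minimal usage sketch assembled from the docstring excerpts in this diff.
# Assumptions: the weave project name, the dataset/artifact names, the
# predict() keyword arguments, and the result keys; only the index()
# signature appears verbatim in the hunk above.
import weave

from medrag_multi_modal.retrieval.colpali_retrieval import CalPaliRetriever

weave.init(project_name="ml-colabs/medrag-multi-modal")  # assumed project name

retriever = CalPaliRetriever()

# Build a ColPali index over a Weave dataset of document images and save it
# to Weave as an artifact of type "colpali-index".
retriever.index(
    data_artifact_name="grays-anatomy-images",  # assumed artifact name
    weave_dataset_name="grays-anatomy-images",  # assumed dataset name
    index_name="grays-anatomy",                 # name used in the example above
)

# Query the index; per the docstring, each result carries the document image,
# document ID, and relevance score.
results = retriever.predict(query="What are Ribosomes?", top_k=3)
for result in results:
    print(result["doc_id"], result["score"])    # keys assumed from the description
```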
medrag_multi_modal/retrieval/nv_embed_2.py

@@ -83,11 +83,11 @@ class NVEmbed2Retriever(weave.Model):
                 index_name="grays-anatomy-nvembed2",
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for NV-Embed-v2 by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
@@ -144,11 +144,11 @@ class NVEmbed2Retriever(weave.Model):
                 index_artifact_address="ml-colabs/medrag-multi-modal/grays-anatomy-nvembed2:v0",
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for NV-Embed-v2 by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
@@ -258,11 +258,11 @@ class NVEmbed2Retriever(weave.Model):
             )
             retriever.predict(query="What are Ribosomes?")
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for NV-Embed-v2 by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
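The NV-Embed-v2 retriever is touched in the same way. From the fragments visible above, `NVEmbed2Retriever` follows a similar pattern: an index is built under a name such as "grays-anatomy-nvembed2", later reloaded from an index artifact address, and queried via `predict()`. The sketch below is a rough reconstruction; the loader method name and every signature other than the `predict()` call shown in the excerpt are assumptions.

```python
# Rough sketch of the NVEmbed2Retriever workflow implied by the excerpts above.
# Assumptions: the loader classmethod name (from_index_artifact) and its
# signature; only the predict() call and the index artifact address appear
# verbatim in the diff.
from medrag_multi_modal.retrieval.nv_embed_2 import NVEmbed2Retriever

# Reload a previously built NV-Embed-v2 index from its Weave artifact.
retriever = NVEmbed2Retriever.from_index_artifact(  # hypothetical method name
    index_artifact_address="ml-colabs/medrag-multi-modal/grays-anatomy-nvembed2:v0",
)

# Query the loaded index, as in the docstring example.
retriever.predict(query="What are Ribosomes?")
```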