geekyrakshit committed on
Commit 2ab36c4 · 1 Parent(s): 3b25ef5
medrag_multi_modal/retrieval/colpali_retrieval.py CHANGED
@@ -49,7 +49,7 @@ class CalPaliRetriever(weave.Model):
             if metadata_dataset_name
             else None
         )
-
+
     def index(self, data_artifact_name: str, weave_dataset_name: str, index_name: str):
         """
         Indexes a dataset of documents and saves the index as a Weave artifact.
@@ -62,7 +62,7 @@ class CalPaliRetriever(weave.Model):
         If a Weave run is active, the method creates a new Weave artifact with the specified
         index name and type "colpali-index". It adds the local index directory to the artifact
         and saves it to Weave, including metadata with the provided Weave dataset name.
-
+
         !!! example "Indexing Data"
             First you need to install `Byaldi` library by Answer.ai.
 
@@ -84,11 +84,11 @@ class CalPaliRetriever(weave.Model):
                 index_name="grays-anatomy",
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for ColPali by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
@@ -132,7 +132,7 @@ class CalPaliRetriever(weave.Model):
         model from the index path within the index artifact directory. Finally, it returns
         an instance of the class initialized with the retrieved document retrieval model,
         metadata dataset name, and data artifact directory.
-
+
         !!! example "Retrieving Documents"
             First you need to install `Byaldi` library by Answer.ai.
 
@@ -155,11 +155,11 @@ class CalPaliRetriever(weave.Model):
                 data_artifact_name="ml-colabs/medrag-multi-modal/grays-anatomy-images:v1",
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for ColPali by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
@@ -195,7 +195,7 @@ class CalPaliRetriever(weave.Model):
         This function uses the document retrieval model to search for the most relevant
         documents based on the provided query. It returns a list of dictionaries, each
         containing the document image, document ID, and the relevance score.
-
+
         !!! example "Retrieving Documents"
             First you need to install `Byaldi` library by Answer.ai.
 
@@ -222,11 +222,11 @@ class CalPaliRetriever(weave.Model):
                 top_k=3,
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for ColPali by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
 
medrag_multi_modal/retrieval/nv_embed_2.py CHANGED
@@ -83,11 +83,11 @@ class NVEmbed2Retriever(weave.Model):
                 index_name="grays-anatomy-nvembed2",
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for NV-Embed-v2 by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
@@ -144,11 +144,11 @@ class NVEmbed2Retriever(weave.Model):
                 index_artifact_address="ml-colabs/medrag-multi-modal/grays-anatomy-nvembed2:v0",
             )
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for NV-Embed-v2 by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
@@ -258,11 +258,11 @@ class NVEmbed2Retriever(weave.Model):
             )
             retriever.predict(query="What are Ribosomes?")
             ```
-
+
         ??? note "Optional Speedup using Flash Attention"
             If you have a GPU with Flash Attention support, you can enable it for NV-Embed-v2 by simply
             installing the `flash-attn` package.
-
+
             ```bash
             uv pip install flash-attn --no-build-isolation
             ```
 
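The docstrings touched by this commit repeat the same setup steps in several places. Consolidated into one sketch, and assuming the "Byaldi library by Answer.ai" they refer to is published on PyPI as `byaldi` (the diff itself names only the library, not the package):

```shell
# Install Byaldi for the ColPali retriever (PyPI package name assumed: byaldi).
uv pip install byaldi

# Optional speedup: on a GPU with Flash Attention support, install flash-attn.
# This command is taken verbatim from the docstrings in the diff above.
uv pip install flash-attn --no-build-isolation
```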