davidheineman committed
Commit d825967
1 Parent(s): d72d082
update readme
- README.md +2 -6
- db_search.py +1 -2
README.md CHANGED
@@ -65,12 +65,8 @@ http://localhost:8893/api/search?k=25&query=How to extend context windows?
 To see an example of search, visit:
 [colab.research.google.com/drive/1-b90_8YSAK17KQ6C7nqKRYbCWEXQ9FGs](https://colab.research.google.com/drive/1-b90_8YSAK17KQ6C7nqKRYbCWEXQ9FGs?usp=sharing)
 
-## Notes
-- It's possible to update the index without re-computing the whole dataset. Basically the IVF table is updated, but the centroids are not re-computed. This requires a large dataset to already exist (in our case it does).
-- We'll need someone to manage the storage/saving of the index, so it can be updated in real-time.
+<!-- ## Notes
 - See:
 - https://github.com/stanford-futuredata/ColBERT/blob/main/colbert/index_updater.py
 - https://github.com/stanford-futuredata/ColBERT/issues/111
-
-- We may be able to offload the centroids calculation to a vector DB (check on this)
-- Should have 2 people on UI, 1 on MySQL, 1 on VectorDB, 1 on ColBERT
+-->
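The note removed from the README describes updating the IVF table while leaving the centroids fixed. A toy sketch of that idea is below. It is not ColBERT's `IndexUpdater` code; the function names, the dictionary layout of the IVF table, and the 2-D vectors are all assumptions made for illustration.

```python
import math

def nearest_centroid(vec, centroids):
    """Index of the existing centroid closest to vec (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))

def add_to_ivf(ivf, centroids, new_vectors, next_id):
    """Append new vector ids to the inverted lists; centroids are left untouched.

    ivf maps centroid index -> list of vector ids (hypothetical layout).
    Returns the next unused vector id.
    """
    for offset, vec in enumerate(new_vectors):
        cid = nearest_centroid(vec, centroids)
        ivf.setdefault(cid, []).append(next_id + offset)
    return next_id + len(new_vectors)

# Two centroids computed earlier from a large dataset (assumed values).
centroids = [(0.0, 0.0), (10.0, 10.0)]
ivf = {0: [0], 1: [1]}

next_id = add_to_ivf(ivf, centroids, [(1.0, 0.5), (9.0, 11.0)], next_id=2)
print(ivf)
```

Because the centroids never move, this only stays accurate while new data resembles the data the centroids were fit on, which is presumably why the note says a large dataset must already exist.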
db_search.py CHANGED
@@ -20,8 +20,7 @@ def complete_request(colbert_response, year):
     pids_str = ', '.join(['%s'] * len(pids))
     query = PAPER_QUERY.format(query_arg_str=pids_str, year=year)
 
-    print(
-    print(pids)
+    print(PAPER_QUERY.format(query_arg_str=', '.join([str(p) for p in pids]), year=year))
 
     cursor.execute(query, pids)
     results = cursor.fetchall()
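The hunk above keeps two renderings of the same query: one with `%s` placeholders that is passed to `cursor.execute`, and one with the ids substituted in that is only printed for debugging. A minimal sketch of that pattern follows. The `PAPER_QUERY` template and the `build_queries` helper here are hypothetical stand-ins; the real template is defined elsewhere in `db_search.py` and is not shown in this diff.

```python
# Hypothetical template standing in for the real PAPER_QUERY in db_search.py.
PAPER_QUERY = "SELECT * FROM paper WHERE pid IN ({query_arg_str}) AND year >= {year}"

def build_queries(pids, year):
    """Return (parameterized query for execute(), readable query for logging)."""
    # One %s placeholder per id; the driver fills these in safely at execute time.
    placeholders = ', '.join(['%s'] * len(pids))
    query = PAPER_QUERY.format(query_arg_str=placeholders, year=year)

    # Same template with the ids written out, for debugging output only.
    debug_query = PAPER_QUERY.format(
        query_arg_str=', '.join(str(p) for p in pids), year=year
    )
    return query, debug_query

query, debug_query = build_queries([3, 14, 15], 2020)
print(query)        # what would be passed to cursor.execute(query, pids)
print(debug_query)  # what the new print statement would show
```

Only the placeholder version should reach the database. Note that `year` is interpolated with `str.format` in both versions, so it bypasses parameter binding; passing it as a bound parameter as well may be worth considering if it can come from user input.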