mg98 committed on
Commit
bd8cdb3
1 Parent(s): e2c091b

Update README.md

Files changed (1)
  1. README.md +19 -0
README.md CHANGED
@@ -3,4 +3,23 @@ license: mit
  inference:
    parameters:
      max_length: 60
+ tags:
+ - fine-tuned
+ - information-retrieval
  ---
+
+ # DSI Search on Toy Dataset
+
+ [![DOI](https://zenodo.org/badge/DOI/10.1145/3642970.3655837.svg)](https://doi.org/10.1145/3642970.3655837)
+
+ This is a simplified demonstration of the search engine presented in _De-DSI: Decentralised Differentiable Search Index_.
+
+ For this example, we fine-tuned the T5-small model on a [dataset](https://huggingface.co/tribler/dsi-search-on-toy-dataset/blob/main/dataset.csv) comprising 526 distinct documents, including:
+ - URLs to YouTube videos featuring movie trailers
+ - Magnet links for accessing CC-licensed music
+ - Bitcoin wallet addresses belonging to various artists
+
+ The training data consisted solely of the documents' titles (i.e., the model never saw ambiguous queries),
+ so the model does not perform nearly as well as we think is generally possible.
+
+ For demonstration purposes, however, this model can be tested with queries like _"spider man"_, _"oceans 13"_, _"sister staarlightt"_, or _"xileno bitcoin address"_.
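
Reviewer note: the README suggests example queries but does not show how to run them. Below is a minimal sketch of one way to do it with the `transformers` library. Several things here are assumptions, not confirmed by the commit: the model ID `tribler/dsi-search-on-toy-dataset` is inferred from the dataset URL, the repository is assumed to ship a compatible tokenizer, and beam search is used only as one common way to get several candidates. The helper names (`generation_kwargs`, `load_model`, `search`) are hypothetical. The `max_length` default of 60 mirrors the front matter in this diff.

```python
from typing import Any, Dict, List, Tuple

# Assumption: model ID inferred from the dataset URL in the README.
MODEL_ID = "tribler/dsi-search-on-toy-dataset"

# Example queries quoted in the README (the misspelling is intentional).
EXAMPLE_QUERIES = [
    "spider man",
    "oceans 13",
    "sister staarlightt",
    "xileno bitcoin address",
]


def generation_kwargs(num_results: int, max_length: int = 60) -> Dict[str, int]:
    """Build generate() arguments that return `num_results` beam candidates."""
    if num_results < 1:
        raise ValueError("num_results must be at least 1")
    return {
        "max_length": max_length,             # matches the README front matter
        "num_beams": num_results,             # beam width
        "num_return_sequences": num_results,  # must not exceed num_beams
    }


def load_model() -> Tuple[Any, Any]:
    """Download the tokenizer and model (needs `transformers` and `torch`)."""
    # Imported lazily so the helpers above work without these packages.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    return tokenizer, model


def search(query: str, tokenizer: Any, model: Any, num_results: int = 3) -> List[str]:
    """Generate candidate document strings for a query."""
    inputs = tokenizer(query, return_tensors="pt")
    outputs = model.generate(**inputs, **generation_kwargs(num_results))
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the packages installed and network access, something like `tokenizer, model = load_model()` followed by `search("spider man", tokenizer, model)` should return up to three generated strings. Loading the model once and reusing it across queries avoids repeated downloads.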