import gradio as gr
from transformers import pipeline

# Load the toy DSI model once at startup; it generates an identifier for a text query.
model_pipeline = pipeline("text2text-generation", model="tribler/dsi-search-on-toy-dataset")


def process_query(query):
    results = model_pipeline(query, max_length=60)
    result_text = results[0]['generated_text'].strip()

    if result_text.startswith("http"):
        # YouTube URL: embed the video player.
        youtube_id = result_text.split('watch?v=')[-1]
        iframe = f'<iframe width="560" height="315" src="https://www.youtube.com/embed/{youtube_id}" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>'
        return gr.HTML(iframe)
    elif result_text.startswith("magnet"):
        # Magnet link: render it as a clickable link.
        return gr.HTML(f'<a href="{result_text}" target="_blank">{result_text}</a>')
    else:
        # Anything else is treated as a Bitcoin address and shown next to a Bitcoin logo.
        # The output component is HTML, so return gr.HTML rather than gr.Textbox.
        bitcoin_logo_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Bitcoin.svg/800px-Bitcoin.svg.png"
        return gr.HTML(f'<div style="display:flex;align-items:center;"><img src="{bitcoin_logo_url}" alt="Bitcoin Logo" style="width:20px;height:20px;margin-right:5px;"><span>{result_text}</span></div>')


interface = gr.Interface(
    fn=process_query,
    inputs=gr.Textbox(label="Query"),
    outputs="html",
    title="Search Interface",
    submit_btn="Find",
    description="""
### Search for movie trailers, music torrents, and bitcoin wallet addresses!
This toy example knows about 500 URLs after merely a few hours of training on a single GPU.
([View the dataset](https://huggingface.co/tribler/dsi-search-on-toy-dataset/blob/main/dataset.csv), read the [scientific article](https://arxiv.org/pdf/2404.12237.pdf) from EuroMLSys, or browse the [model](https://huggingface.co/tribler/dsi-search-on-toy-dataset) and [all code](https://github.com/Tribler/De-DSI).)
""",
    article="""
## De-DSI
De-DSI is a proof-of-principle of fully decentralised search engines.
We show a possible approach to connect millions or even billions of devices to form a decentralised search engine. We hope this is a step towards a "[global brain](https://dl.acm.org/doi/pdf/10.1145/2160718.2160731)" for humanity.
Generative AI is increasingly influencing fields such as content discovery, relevance ranking, and financial transactions, showcasing its potential to disrupt various industries.
The novel end-to-end generative architectures could pave the way for fully decentralised alternatives in social media, the movie industry, search engines, and financial sectors, mirroring the decentralisation levels of Bitcoin and BitTorrent.
This shift could significantly empower ordinary Internet users.
Explore the scientific foundation of this transformation in our paper presented at EuroMLSys 2024.
The paper is available [here](https://huggingface.co/papers/2404.12237).
We invite you to contribute to and engage with our community at the International Workshop on [Distributed Infrastructure for Common Good](https://dicg-workshop.github.io/) (DICG).
### Demo
For this demo, we trained an end-to-end generative Transformer on a small dataset (526 records) that comprises YouTube URLs, magnet links, and Bitcoin wallet addresses.
Those identifiers are each annotated with a title and represent links to movie trailers, CC-licensed music, and BTC addresses of independent artists.
With this demo, we present a proof of concept of the DSI's ability to retrieve arbitrary identifiers (URLs/hashes) in response to natural-language user queries.
The model is available under a permissive license and can be accessed [here](https://huggingface.co/tribler/dsi-search-on-toy-dataset).
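
As a rough sketch of what this demo page does with the identifier the model generates, the snippet below routes a string by its prefix: YouTube URLs, magnet links, and everything else (treated as a Bitcoin address). The helper name `classify_identifier` and the sample identifiers are illustrative assumptions, not part of the published app or dataset.

```python
def classify_identifier(identifier):
    # Hypothetical helper: mirrors the prefix checks the demo applies to the model output.
    text = identifier.strip()
    if text.startswith('http'):
        # Keep only the part after 'watch?v=' as the YouTube video ID.
        return ('youtube', text.split('watch?v=')[-1])
    if text.startswith('magnet'):
        return ('magnet', text)
    # Anything else is treated as a Bitcoin address.
    return ('bitcoin', text)


# Placeholder identifiers, made up for illustration only.
print(classify_identifier('https://www.youtube.com/watch?v=abc123'))
print(classify_identifier('magnet:?xt=urn:btih:0123456789abcdef'))
```

The check is purely prefix-based, so any output that is neither a URL nor a magnet link falls through to the Bitcoin branch.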
### Decentralisation background
Why is decentralisation of AI a milestone? The Internet itself was conceived in December 1960 with the report ["is decentralized communication possible?"](https://doi.org/10.7249/RM2632). A fully decentralised form of money called Bitcoin disrupted the highly regulated financial industry. BitTorrent disrupted the monopolies around broadcasting by making it fully decentralised.
The elements that have enabled humanity to shape the world are not strength, not speed, but intelligence, money, and collaboration.
Our Tribler lab is focused on advancing these topics and ensuring they benefit ordinary citizens.
Our [entire research portfolio](https://scholar.google.com/citations?hl=en&user=pprQKjUAAAAJ&view_op=list_works&sortby=pubdate) is driven by idealism. We aim to remove power from companies, governments, and AI in order to shift all this power to self-sovereign citizens.
For instance, our "[unstoppable DAO](https://dl.acm.org/doi/pdf/10.1145/3565383.3566112)" technology creates a limited form of collective money with democratic control. We pioneered [decentralised trust](https://arxiv.org/pdf/2207.09950) with [deployment](https://research.tudelft.nl/files/89353583/1_s2.0_S1389128621001705_main.pdf). Our educational master's program teaches students to engineer [collective decision](https://github.com/Tribler/tribler/issues/7691) mechanisms. The [goal of the Tribler lab](https://github.com/Tribler/tribler/issues/7064) is to prototype the first global brain by 2040.
Before 2000, we worked with [visionary collaborators](http://web.archive.org/web/20020618081554/http://www.freeamp.org/pipermail/mm/2000-December/000003.html) on our first [deployments](http://www.usenix.org/publications/library/proceedings/usenix2000/freenix/full_papers/pouwelse/pouwelse.pdf) and communities with democratic control of information (pre-Wikipedia era).
### Tribler
![image/svg](https://img.shields.io/github/issues-closed/tribler/tribler.svg?style=flat)
Tribler is the name of our peer-to-peer BitTorrent search engine and download client. The "Tribler Lab" is the research team at Delft University of Technology that has been developing this open-source client since April 2005.
Over the years, Tribler has received 2.3 million unique downloads. More than 100 master's students and [267 software developers](https://github.com/orgs/Tribler/people) have contributed code to Tribler.
We also started supporting mobile-to-mobile networking on Android and real-time machine learning using K-means:
![image/gif](https://huggingface.co/spaces/tribler/de-dsi/resolve/1a8c77245f4905b7594cd6bddbf2e06bd77902f8/Decentralised_AI__superapp_Youtube_search.gif)
The [demo APK](https://github.com/Tribler/tribler/issues/7254#issuecomment-2074490469) is available and can play YouTube videos.
### Disclaimer: demo of work-in-progress
This project represents both a groundbreaking advance and a preliminary exploration into decentralised systems.
Fuzzy search or trivial lookup does not need the super-heavy LLM approach. We are painfully aware of that. Support for non-trivial queries is still lacking.
As a preliminary model, the project showcases a toy example rather than its full potential.
It serves as a proof of concept that invites further development and imagination. AI can be as decentralised as Bitcoin and BitTorrent; that's all.
""",
    examples=[["spider man"], ["oceans 13"], ["sister starlight"], ["bitcoin address of xileno"]],
    concurrency_limit=50,
)


if __name__ == "__main__":
    interface.launch()