Spaces:
Runtime error
Runtime error
title: Lex Fridman Podcast Semantic Search | |
emoji: π‘ | |
colorFrom: red | |
colorTo: yellow | |
sdk: gradio | |
sdk_version: 3.28.3 | |
app_file: app.py | |
pinned: false | |
# lex-semantic-search | |
Gradio application for performing semantic search on Lex Fridman podcast transcripts. | |
## Dataset | |
The Gradio application is pre-loaded with chunks (chunk size is 25 contiguous entries) and embeddings for dataset [nmac/lex_fridman_podcast](https://huggingface.co/datasets/nmac/lex_fridman_podcast). | |
## Usage | |
1. Set up virtual environment with the required dependencies: | |
```bash | |
python -m venv lex-semantic-search | |
source lex-semantic-search/bin/activate | |
pip install -r requirements.txt # for GPU | |
pip install -r requirements_cpu.txt # for CPU | |
``` | |
2. Run the application locally using the following command: | |
```bash | |
python app.py | |
``` | |
3. Access the application by opening your web browser and navigating to http://localhost:7860. | |
4. In the application interface, adjust the input settings according to your needs: | |
- **Query:** Enter a query to search for relevant podcast transcript chunks related to it. | |
- **Chunk Size:** Adjust the chunk size. *(Fixed to 25)* | |
- **Embeddings Generator:** Select the embeddings generator to use. *(Fixed to `sentence-transformers/multi-qa-mpnet-base-dot-v1`)* | |
- **Retriever Method:** Select the retriever method. *(Fixed to `FAISS`)* | |
- **Number of Chunks to Retrieve:** Set the number of chunks to retrieve. | |
5. Click the "Submit" button to retrieve the chunks that match your settings and query. The results will be displayed in a table. | |