Nuno Machado committed
Commit 74a0e88 • 1 Parent(s): ae5a5f0

Update README

Files changed (1)
  1. README.md +30 -1
README.md CHANGED
@@ -1,14 +1,43 @@
  # lex-semantic-search
- Semantic search for Lex Fridman podcast

  ## Dataset

  ## Usage

  ```bash
  python -m venv lex-semantic-search
  source lex-semantic-search/bin/activate
  pip install -r requirements_cpu.txt # for CPU
  pip install -r requirements_gpu.txt # for GPU
  ```

+ ---
+ title: Lex Fridman Podcast Semantic Search
+ emoji: 💡
+ colorFrom: red
+ colorTo: yellow
+ sdk: gradio
+ sdk_version: 3.28.3
+ app_file: app.py
+ pinned: false
+ ---
  # lex-semantic-search

+ Gradio application for performing semantic search on Lex Fridman podcast transcripts.

  ## Dataset

+ The Gradio application is pre-loaded with chunks (each chunk is 25 contiguous entries) and embeddings for the dataset [nmac/lex_fridman_podcast](https://huggingface.co/datasets/nmac/lex_fridman_podcast).
+
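The chunking described above can be sketched as follows. This is a minimal illustration, not the application's actual preprocessing code: the function name `chunk_entries`, the non-overlapping windows, and joining entries with a space are all assumptions, since the README only states that a chunk is 25 contiguous entries.

```python
from typing import List, Sequence


def chunk_entries(entries: Sequence[str], chunk_size: int = 25) -> List[str]:
    """Join every `chunk_size` contiguous entries into one text chunk.

    Assumption: windows do not overlap, and the final chunk may be shorter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        " ".join(entries[start:start + chunk_size])
        for start in range(0, len(entries), chunk_size)
    ]


# Hypothetical transcript entries standing in for rows of the dataset.
entries = [f"entry {i}" for i in range(60)]
chunks = chunk_entries(entries)
print(len(chunks))  # 60 entries split into chunks of 25, 25 and 10
```

Each resulting chunk would then be embedded once, so that a query is compared against chunk-level vectors rather than individual transcript entries.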
  ## Usage

+ 1. Set up a virtual environment with the required dependencies:
  ```bash
  python -m venv lex-semantic-search
  source lex-semantic-search/bin/activate
  pip install -r requirements_cpu.txt # for CPU
  pip install -r requirements_gpu.txt # for GPU
  ```
+
+ 2. Run the application locally:
+ ```bash
+ python app.py
+ ```
+
+ 3. Open your web browser and navigate to http://localhost:7860.
+
+ 4. In the application interface, adjust the input settings:
+ - **Query:** Enter a query to search for relevant podcast transcript chunks.
+ - **Chunk Size:** Adjust the chunk size. *(Fixed to 25)*
+ - **Embeddings Generator:** Select the embeddings generator to use. *(Fixed to `sentence-transformers/multi-qa-mpnet-base-dot-v1`)*
+ - **Retriever Method:** Select the retriever method. *(Fixed to `FAISS`)*
+ - **Number of Chunks to Retrieve:** Set the number of chunks to retrieve.
+
+ 5. Click the "Submit" button to retrieve the chunks that best match your query under the chosen settings. The results are displayed in a table.
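
The retrieval step behind the "Submit" button can be approximated by the sketch below. It is a pure-Python stand-in, not the application's code: it ranks chunks by dot product with the query vector, which is what an exhaustive inner-product search computes. The README does not say which FAISS index type is used, so the exhaustive search is an assumption. The names `dot` and `retrieve_top_k` and the toy 3-dimensional vectors are hypothetical; real vectors would come from the selected embeddings generator.

```python
import heapq
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k highest dot-product scores."""
    scored = ((i, dot(query_vec, vec)) for i, vec in enumerate(chunk_vecs))
    # nlargest returns the pairs sorted from highest to lowest score.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy vectors (assumption): one per pre-computed chunk embedding.
chunk_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vec = [1.0, 0.2, 0.0]

for index, score in retrieve_top_k(query_vec, chunk_vecs, k=2):
    print(index, round(score, 2))
```

Here `k` plays the role of the "Number of Chunks to Retrieve" setting; with the toy data, chunks 0 and 2 rank highest.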