---
title: Raccoon
emoji: 🦝
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.2.0
python_version: 3.9
app_file: app.py
pinned: false
license: mit
---

# Raccoon

## Installation

It is recommended to use a virtual environment created with [`venv`](https://docs.python.org/3/library/venv.html).

The following steps set up the project:

 - If you are on Apple Silicon, install Rust with `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh` and CMake with `brew install cmake`.
 - Create the virtual environment: `python3 -m venv .venv`
 - Activate the virtual environment: `source .venv/bin/activate`
   - To deactivate the virtual environment, run `deactivate` inside it.
 - Install the required packages: `.venv/bin/pip install -r requirements.txt`
 - Install the project itself in editable mode: `.venv/bin/pip install -e .`
 - [Create a custom search engine in Google](https://programmablesearchengine.google.com/controlpanel/all).
 - Create an API key for the custom search engine.
 - Add the API key and the custom search engine ID to `.streamlit/secrets.toml`.
```toml
google_search_api_key = "api-key"
google_search_engine_id = "search-engine-id"
```
 - To start the interface: `streamlit run app.py`
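
As a rough illustration of how the two values from `.streamlit/secrets.toml` might be used, the sketch below builds a request URL for Google's Custom Search JSON API. The helper name `build_search_url` and its `num` parameter are hypothetical and not taken from this repository; in the Streamlit app the two values would typically be read through `st.secrets`.

```python
from urllib.parse import urlencode

# Endpoint of Google's Custom Search JSON API.
SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key: str, engine_id: str, query: str, num: int = 10) -> str:
    """Hypothetical helper: build a Custom Search JSON API request URL.

    In the app, api_key and engine_id would likely come from
    st.secrets["google_search_api_key"] and
    st.secrets["google_search_engine_id"] (an assumption, not confirmed
    by this README).
    """
    params = {"key": api_key, "cx": engine_id, "q": query, "num": num}
    # urlencode percent-encodes the values and joins them with "&".
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("api-key", "search-engine-id", "health care"))
```

The function only assembles the URL, so it can be checked without network access or real credentials.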

### Todo
- [ ] Improve fetched content.
  - [x] Fix issue of duplicate content extracted by beautifulsoup.
  - [x] Exclude code from content
  - [x] Find sentences that contain the search keywords.
  - [ ] Find sentences that contain the search keywords, taking into account different spellings (e.g. "health care" vs. "healthcare").
  - [ ] Get some content from every search result.
  - [ ] Handle divs that mix text and tags: extract the text from the tags, then decompose the tags, keeping the content order and avoiding duplicates.
- [ ] Summarization currently requires truncation. Find a solution that does not need it.
- [ ] Support German content with language switcher.
- [ ] Improve queries to include more keywords (expand abbreviations and define context).
- [ ] Control the number of results from the UI.
- [ ] Control summary length via settings: https://docs.streamlit.io/library/advanced-features/session-state
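
One possible approach to the open item on spelling variants is sketched below: both the keyword and each sentence are reduced to lowercase letters and digits before comparison, so "health care", "health-care" and "healthcare" all match. The function names `normalize` and `sentences_with_keyword` are hypothetical, and the sentence splitting is deliberately naive; this is not the repository's implementation.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def sentences_with_keyword(text: str, keyword: str) -> list[str]:
    """Return the sentences whose normalized form contains the normalized keyword."""
    needle = normalize(keyword)
    if not needle:
        return []
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if needle in normalize(s)]


if __name__ == "__main__":
    sample = "Health care costs are rising. The weather is nice. Healthcare reform passed!"
    for sentence in sentences_with_keyword(sample, "health care"):
        print(sentence)
```

Because spaces are removed before matching, this can occasionally match across word boundaries; a stricter variant would compare token by token.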