---
title: Gistillery
emoji: 🏭
colorFrom: purple
colorTo: gray
sdk: docker
app_port: 7860
---

# Dump your knowledge, let AI refine it

## Installation

Create a Python environment with Python 3.10+. Install the requirements and the package:

```sh
python -m pip install -r requirements.txt
python -m pip install .
```

For development, instead do:

```sh
python -m pip install -r requirements.txt
python -m pip install -r requirements-dev.txt
python -m pip install -e .
```

## Starting

### Preparing environment

Set an environment variable called `HF_HUB_TOKEN` to your Hugging Face token, or create a `.env` file that defines it.

### Running the app

Run the `start.sh` script. This may require a `chmod +x start.sh` if the script is not already executable.

```sh
./start.sh
```

Alternatively, you can run each part individually. In one terminal, start the background worker:

```sh
python src/gistillery/worker.py
```

In another terminal, start the web server:

```sh
uvicorn src.gistillery.webservice:app --reload --port 8080
```

For example requests, check `requests.org`.

A very simple web interface is available via gradio. To start it, run:

```sh
python demo.py
```

and navigate to the indicated URL (usually http://127.0.0.1:7860).

## Docker

To run everything with Docker, first build the image:

```sh
docker build -t gistillery:latest .
```

Next, run the container:

```sh
docker run -p 7860:7860 -p 8080:8080 -e GRADIO_SERVER_NAME=0.0.0.0 -v $HOME/.cache/huggingface/hub:/home/user/.cache/huggingface/hub gistillery:latest
```

Note that the Hugging Face cache folder is mounted as a Docker volume, so that a local model cache, if available, is used instead of downloading the transformers models each time the container is started. To avoid mounting the local cache, remove the `-v ...` parameter.

The database used for storing the results is ephemeral and will be deleted when the Docker container is stopped. The backend server is also exposed directly via port 8080 to enable DB backups (see below).
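Because the database does not survive a container stop, it can be handy to save a backup from a script rather than from the browser. The following is a minimal sketch, not part of the project: it assumes that the `/backup` endpoint described in the Backup section returns the database file as the response body, and the function names `backup_url` and `download_backup` are hypothetical helpers chosen for illustration.

```python
import shutil
import urllib.request
from pathlib import Path


def backup_url(host: str = "localhost", port: int = 8080) -> str:
    """Build the URL of the backup endpoint (host, port and path as given in this README)."""
    return f"http://{host}:{port}/backup"


def download_backup(url: str, target: Path) -> Path:
    """Stream the response body of `url` into `target` and return the target path.

    Assumption: the endpoint answers with the raw database file.
    """
    with urllib.request.urlopen(url) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

With the container running, calling `download_backup(backup_url(), Path("gistillery-backup.db"))` would write the response to a local file; the file name here is only an example.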
### Backup

To download a backup of the backend DB, visit `localhost:8080/backup`. If you wish to start the app based on a backup, set the `DB_FILE_NAME` environment variable to the name of the backup file.

## Checks

### Running tests

```sh
python -m pytest tests/
```

### Other

```sh
mypy src/
black src/ && black tests/
ruff src/ && ruff tests/
```

## TODOs

### Tools

- Reading PDFs in general
- Reading arXiv
- Generating text from YouTube videos using Whisper

### Deployment

- Make the DB location configurable and mountable when running in Docker (otherwise, it will be deleted each time the container is stopped).