---
title: Llm Arch
emoji: π
colorFrom: green
colorTo: indigo
sdk: streamlit
sdk_version: 1.28.2
app_file: Home.py
pinned: false
license: cc-by-sa-4.0
---
# LLM Arch
This project is a demonstration playground for LLM-enabled architectures, built as a submission for the Online MSc in Artificial Intelligence at the University of Bath. The purpose of the project is to explore "LLM-enabled architectures", in which an LLM is used in conjunction with some store of private data. The goal is to provide decision-support information to technical managers on the _how_ of using LLMs with their organisational data, specifically by comparing technical architectures and assessing the organisational implications of those technical choices.
# File Structure
<pre>
llm-arch
├── config
│   └── architectures.json <i>(configuration for the architectures under test and displayed in the UI)</i>
├── data
│   ├── fine_tuning <i>(data and scripts related to fine-tuning LLMs)</i>
│   ├── json <i>(raw json files containing the synthetic private data built for the project)</i>
│   ├── sqlite
│   │   ├── 01_all_products_dataset.db <i>(sqlite db containing all products generated)</i>
│   │   ├── 02_baseline_dataset.db <i>(sqlite db containing the subset of data selected to be the baseline)</i>
│   │   └── test_records.db <i>(sqlite database containing the persisted test results)</i>
│   └── vector_stores <i>(chromadb files containing the document embeddings for the RAG architectures)</i>
├── img
├── pages
├── src
│   ├── data_synthesis <i>(python code related to generating, selecting and loading the private dataset used for the project)</i>
│   ├── training <i>(python code related to training the architectures - not used at runtime)</i>
│   ├── architectures.py <i>(the core architecture pipeline code, including components and trace)</i>
│   ├── common.py <i>(utilities for common functions, e.g. security token access, data type manipulations)</i>
│   ├── datatypes.py <i>(object-oriented representation of the test data and single point for runtime access of the product DB)</i>
│   ├── st_helpers.py <i>(helpers specific to streamlit)</i>
│   └── testing.py <i>(functionality relating to running, recording and reporting on batches of tests)</i>
├── Home.py <i>(main entry point for streamlit - first page in the streamlit app)</i>
├── local_env.yml <i>(conda environment for running the project locally)</i>
├── README.md <i>(readme - this file)</i>
└── requirements.txt <i>(requirements file for additional dependencies in the HF Spaces environment - do not use for local running of the project)</i>
</pre>
# Demonstration environment
The project is available as a demonstration running [here on Hugging Face Spaces](https://huggingface.co/spaces/alfraser/llm-arch). This is the preferred way to interact with the project.
# Building and running the project
The project is split into two separate elements - training and inference - both of which are within this repo. The training elements run offline: they generate the private data and train the models or vector stores for each of the architectures. These elements are hosted under `data_synthesis` and `training` respectively.
For training, it is recommended that you review the training scripts. Each architecture is trained by querying a product set database and converting that data appropriately into a trained architecture.
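As a purely illustrative sketch of that pattern, the conversion from product rows to fine-tuning examples might look like the following. The table and column names (`products`, `name`, `description`) and the prompt template are assumptions for illustration, not the project's real schema - check the training scripts and the sqlite databases for the actual details.

```python
import sqlite3


def products_to_examples(conn: sqlite3.Connection) -> list[dict]:
    """Turn product rows into prompt/completion pairs for fine-tuning.

    The table and column names here are illustrative assumptions,
    not the project's real schema.
    """
    cur = conn.execute("SELECT name, description FROM products")
    return [
        {
            "prompt": f"Tell me about the {name}.",
            "completion": description,
        }
        for name, description in cur
    ]
```

Each architecture would then consume this data differently: a fine-tuned model trains on the pairs, while a RAG architecture would instead embed the descriptions into its vector store.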
To build and run the demonstration project for inference, follow these steps:
## 1 - build the environment
* Check out the code repository locally.
* Build an environment to run the project. A conda environment export (`local_env.yml`) is provided, which also includes the required pip dependencies.
## 2 - set up hugging face endpoints
* As described in the config file `architectures.json`, different architectures can be configured. Each of these has a Llama 2 based LLM at its core. These LLMs are designed to be accessed as Hugging Face inference endpoints, so you should set up an inference endpoint for each of the models you wish to use. The default project uses both a vanilla llama2-7b-chat-hf model and the same model fine-tuned on product-related queries.
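A minimal sketch of calling such an endpoint is shown below. The payload and response shapes assume the standard Hugging Face text-generation endpoint API (`{"inputs": ...}` in, `[{"generated_text": ...}]` out); the project's own client code in `architectures.py` may differ.

```python
import json
from urllib import request


def build_request(token: str, prompt: str, max_new_tokens: int = 256):
    """Build the headers and JSON payload for a text-generation call."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return headers, payload


def query_endpoint(url: str, token: str, prompt: str, timeout: float = 30.0) -> str:
    """POST a prompt to a dedicated inference endpoint and return the text."""
    headers, payload = build_request(token, prompt)
    req = request.Request(url, data=json.dumps(payload).encode("utf-8"), headers=headers)
    with request.urlopen(req, timeout=timeout) as resp:
        # Text-generation endpoints respond with [{"generated_text": "..."}]
        return json.loads(resp.read())[0]["generated_text"]
```

The endpoint URL is the dedicated URL shown in the Hugging Face console for each endpoint, and the token is the same `hf_token` configured in the next step.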
## 3 - configure streamlit secrets
* Within streamlit secrets (either a `.streamlit/secrets.toml` file in the root directory of the local repository, or see the documentation for your chosen hosting solution), you need to configure the following settings (not all are required in every set-up):
1. `hf_token` - the Hugging Face token you will need to generate from your Hugging Face account to access inference.
2. `hf_user` - the user account under which the inference endpoints are running.
3. `endpoints` - a comma-separated list of endpoints to control from the System Status page of the application. This should mirror the names of the models, within your user account, which you plan to use. If it is left blank the System Status page will have no functionality.
4. `running_local` - no value is required; this is just a flag. Setting it disables user logins and allows all users access to all pages.
5. `app_password` - if security is being used, login is required to access all pages except the home page. Passwords are stored in this secrets entry in a simple comma-separated format such as '`password(username),password2(username2)`'. The username is only for display purposes and is not required to log in.
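For local running, the settings above might look like this in `.streamlit/secrets.toml` - all values here are illustrative placeholders, and you would typically set either `running_local` or `app_password` rather than both:

```toml
hf_token = "hf_xxxxxxxxxxxxxxxx"   # placeholder - use your own token
hf_user = "your-hf-username"
endpoints = "llama2-7b-chat-hf,llama2-7b-product-tuned"   # placeholder endpoint names
running_local = "true"   # presence of the key is what matters, not the value
```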
## 4 - run the demo project
* From the root of the local code repository, run `streamlit run Home.py`.