import os
import streamlit as st
from src.st_helpers import st_setup
# The test runner now executes in parallel, so flag this explicitly to the tokenizers library
os.environ['TOKENIZERS_PARALLELISM'] = 'true'
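# (Setting TOKENIZERS_PARALLELISM to either 'true' or 'false' silences the
# "The current process just got forked, after parallelism has already been used"
# warning from the Hugging Face tokenizers library; 'true' keeps tokenizer
# parallelism enabled rather than disabling it.)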
if st_setup("LLM Architecture Assessment", skip_login=True):
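    # st_setup is a project helper (src/st_helpers.py) which appears to
    # configure the page and return True once setup succeeds; skip_login
    # leaves this home page viewable without authentication.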
    st.write("""
# LLM Architecture Assessment
This application is an interactive element of the LLM Architecture Assessment project prepared by [Alisdair Fraser](https://www.linkedin.com/in/alisdairfraser/) (alisdairfraser (at) gmail (dot) com), submitted as the final research project for the [Online MSc in Artificial Intelligence](https://info.online.bath.ac.uk/msai/) at the University of Bath. It allows users to browse a synthetic set of "private data" and to interact with systems built to represent different architectural prototypes.
The goal of the project is to assess architectural patterns for deploying LLMs in conjunction with private data stores. The target audience is IT management, and the aim is to provide key considerations for choosing one architecture over another.
All the source code for this application and the associated tooling and data can be found in the [project repo on Hugging Face](https://huggingface.co/spaces/alfraser/llm-arch/tree/main).
## Tools
This web application serves as the management console to run different elements required to test the architectures. Specifically:
- **LLM Architectures**: The LLM models are wrapped in "architectures", which are the systems under test. This area allows users to see those configurations and chat manually with an architecture, as opposed to directly with the underlying model.
- **Data Browser**: Underlying the assessment is a synthetic "closed" data set, generated offline to simulate a closed, enterprise-style dataset for testing purposes. The data browser allows users to view that data directly.
- **Test Runner**: This tool allows users to select a number of questions and a set of architectures. The same questions are then sent to each architecture and the results logged for analysis.
- **Test Reporter**: As interactions take place with the architectures under test, the results are logged for analysis. This area allows users to view those log records and see some simple results.
- **System Status**: This area provides some basic system controls. It allows the test logs to be wiped clean, and also shows the status of the LLM endpoints the demo uses, with the option to pause/resume them as applicable.
## Credits
- This project predominantly uses [Llama 2](https://ai.meta.com/llama/) and derivative models for language inference. Models are made available under the [Meta Llama license](https://ai.meta.com/llama/license/).
- This application is built on [Streamlit](https://streamlit.io).
""")
    # Display the access message if security is not disabled
    running_local = 'running_local'
    if running_local not in st.secrets:
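        # st.secrets is read from .streamlit/secrets.toml locally, or from the
        # secrets configured on the Space when deployed. Only the presence of
        # the 'running_local' key is checked here, so marking a local run would
        # need just a line like (illustrative value):
        #   running_local = true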
        st.write("""
## Access
Some elements of this application are password protected. If you would like access to the project, please contact me via LinkedIn or the email address above, explaining why you want access and for how long, and I will set that up for you. Thanks for your understanding :-)
""")
        st.info("**NOTE:** The inference API endpoints used in the demonstrations of this application are scheduled to run Monday to Friday, 09:00-17:00 UK time, to manage costs. Please contact me if you would like them running outside these hours.")