import os

import streamlit as st

from src.st_helpers import st_setup

# The test runner now runs in parallel, so flag this to the tokenizers library
os.environ['TOKENIZERS_PARALLELISM'] = 'true'

if st_setup("LLM Architecture Assessment", skip_login=True):
    st.write("""
# LLM Architecture Assessment

This application is an interactive element of the LLM Architecture Assessment project prepared by
[Alisdair Fraser](https://www.linkedin.com/in/alisdairfraser/) (alisdairfraser (at) gmail (dot) com),
in submission for the final research project for the
[Online MSc in Artificial Intelligence](https://info.online.bath.ac.uk/msai/) with the University of Bath.

This application allows users to browse a synthetic set of "private data" and to interact with systems built
to represent different architectural prototypes. The goal of the project is to assess architectural patterns
for deploying LLMs in conjunction with private data stores. The target audience is technology leaders, and the
aim is to provide key considerations for choosing one architecture over another.

All the source code for this application and the associated tooling and data can be found in the
[project GitHub repo on Hugging Face](https://huggingface.co/spaces/alfraser/llm-arch/tree/main).
""")

    # Centre the video by surrounding it with columns, as a workaround for not being able to specify its size
    left, center, right = st.columns([2, 3, 2])
    try:
        with center:
            with open('img/overview_presentation.m4v', 'rb') as f:
                video_bytes = f.read()
            st.video(video_bytes)
    except OSError:
        st.info("Overview presentation video not available")

    st.write("""
## Tools

This web application serves as the management console to run the different elements required to test the
architectures. Specifically:

- **LLM Architectures**: "Architectures" are wrapped around the LLM models; these are the systems under test
and being assessed. This area allows users to see those configurations and chat manually with an architecture,
as opposed to directly with the model.
- **Data Browser**: Underlying this architectural assessment is a synthetic "closed" data set, generated
offline to simulate a closed, enterprise-style dataset for testing purposes. The data browser allows users to
view that data directly.
- **Test Runner**: This tool allows you to select a number of questions and a set of architectures. The same
questions are then sent to each of the architectures and the results logged for analysis.
- **Test Reporter**: As interactions take place with the architectures under test, the results are logged for
analysis. This area allows users to view those log records and see some simple results.
- **System Status**: This area lets the user perform some basic system controls. It allows the test logs to be
wiped clean, and also allows users to see the status of the LLM endpoints the demo uses and pause/resume them
as applicable.

## Credits

- This project predominantly uses [Llama 2](https://ai.meta.com/llama/) and derivative models for language
inference. Models are made available under the [Meta Llama license](https://ai.meta.com/llama/license/).
- This application is built on [Streamlit](https://streamlit.io).
""")
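The `TOKENIZERS_PARALLELISM` flag at the top of the file works because Hugging Face's tokenizers library reads it from the environment, so it must be set before any parallel work starts. A minimal sketch (stdlib only, no tokenizers dependency) showing that an environment variable set in the parent is visible inside worker threads spawned afterwards:

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Set the flag before any parallel work starts, mirroring the
# top-of-file assignment in the app (the value is illustrative).
os.environ['TOKENIZERS_PARALLELISM'] = 'true'

def read_flag(_):
    # Worker threads share the parent process's environment.
    return os.environ.get('TOKENIZERS_PARALLELISM')

with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(read_flag, range(2)))

print(results)  # ['true', 'true']
```

Setting the variable after the workers have begun tokenizing can still trigger the library's fork warning, which is why the assignment sits at module import time here.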