Mikiko Bazeley committed
Commit · eeb7bb5
Parent(s): 3be47f7

Copied over LLM-as-judge starter project
Browse files
- .env.template +6 -0
- README.md +125 -14
- home.py +85 -0
- img/ash.png +0 -0
- img/bulbasaur.png +0 -0
- img/charmander.png +0 -0
- img/fireworksai_logo.png +0 -0
- img/home_page_1.png +0 -0
- img/home_page_2.png +0 -0
- img/page_1_a.png +0 -0
- img/page_1_b.png +0 -0
- img/page_1_c.png +0 -0
- img/page_1_empty.png +0 -0
- img/page_2_a.png +0 -0
- img/page_2_b.png +0 -0
- img/page_2_c.png +0 -0
- img/page_2_empty.png +0 -0
- img/squirtel.png +0 -0
- pages/1_Comparing_LLMs.py +185 -0
- pages/2_Parameter_Exploration_for_LLMs.py +293 -0
- requirements.txt +246 -0
.env.template
ADDED
@@ -0,0 +1,6 @@
# Fireworks AI API key
FIREWORKS_API_KEY=your_fireworks_api_key_here


# Debug mode
DEBUG=False
README.md
CHANGED
@@ -1,14 +1,125 @@
## Project: Fireworks Model Comparison App

### Overview
The **Fireworks Model Comparison App** is an interactive tool built using **Streamlit** that allows users to compare various Large Language Models (LLMs) hosted on **Fireworks AI**. Users can adjust key model parameters, provide custom prompts, and generate model outputs to compare their behavior and responses. Additionally, an LLM-as-a-Judge feature is available to evaluate the generated outputs and provide feedback on their quality.


### Objectives
- **Compare Models**: Select different models from the Fireworks platform and compare their outputs based on a shared prompt.
- **Modify Parameters**: Fine-tune parameters such as **Max Tokens**, **Temperature**, **Top-p**, and **Top-k** to observe how they influence model behavior.
- **Evaluate Using LLM-as-a-Judge**: After generating responses, use a separate model to act as a judge and evaluate the outputs from the selected models.


![Home Page Screenshot](img/home_page_1.png)
![Home Page Screenshot](img/home_page_2.png)


### Features
- **Streamlit UI**: A simple and intuitive interface where users can select models, input prompts, and adjust model parameters.
- **LLM Comparison**: Select up to three different models, run a query with the same prompt, and view side-by-side responses.
- **Parameter Exploration**: Explore and modify different parameters such as Max Tokens, Temperature, Top-p, and more to see how they affect the model's response.
- **LLM-as-a-Judge**: Let another LLM compare the generated responses from the models and provide a comparison.

### App Structure
The app consists of two main pages:
1. **Comparing LLMs**:
   - Compare the outputs of three selected LLMs from Fireworks AI by providing a prompt.
   - View the responses side-by-side for easy comparison.
   - A selected LLM acts as a judge to evaluate the generated responses.


![Home Page Screenshot](img/page_1_empty.png)
![Home Page Screenshot](img/page_1_a.png)
![Home Page Screenshot](img/page_1_b.png)
![Home Page Screenshot](img/page_1_c.png)


2. **Parameter Exploration**:
   - Modify various parameters for the LLMs (e.g., Max Tokens, Temperature, Top-p) and observe how they affect the outputs.
   - Compare three different outputs generated with varying parameter configurations.
   - Use LLM-as-a-Judge to provide a final evaluation of the outputs.


![Home Page Screenshot](img/page_2_empty.png)
![Home Page Screenshot](img/page_2_a.png)
![Home Page Screenshot](img/page_2_b.png)
![Home Page Screenshot](img/page_2_c.png)

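On both pages, the judge step is a single chat-completion call that feeds the three candidate outputs back to the judging model. A minimal sketch of that pattern, mirroring `compare_responses` in `pages/1_Comparing_LLMs.py` (the hard-coded key and the `judge` helper name here are illustrative only; the app reads the key from `env/.env`):

```python
import fireworks.client

fireworks.client.api_key = "your_fireworks_api_key"  # illustrative; the app loads this from env/.env

def judge(responses, judge_model):
    # Number the candidate outputs and pack them into one comparison prompt.
    numbered = "\n\n".join(f"Response {i + 1}: {r}" for i, r in enumerate(responses))
    comparison_prompt = f"Compare the following three responses:\n\n{numbered}\n\nProvide a succinct comparison."
    result = fireworks.client.ChatCompletion.create(
        model=judge_model,  # e.g. "accounts/fireworks/models/mixtral-8x7b-instruct"
        messages=[{"role": "user", "content": comparison_prompt}],
    )
    return result.choices[0].message.content
```
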
### Setup and Installation

#### Prerequisites
- **Python 3.x** installed on your machine.
- A **Fireworks AI** API key, which you can obtain by signing up at [Fireworks AI](https://fireworks.ai).
- Install **Streamlit** and the **Fireworks Python Client**.

#### Step-by-Step Setup
##### 1. Clone the Repository:
First, clone the repository from GitHub:

```bash
git clone https://github.com/fw-ai/examples.git
```

##### 2. Navigate to the Specific Project Sub-directory:
After cloning the repository, navigate to the `project_llm-as-a-judge-streamlit-dashboard` sub-directory:

```bash
cd examples/learn/inference/project_llm-as-a-judge-streamlit-dashboard
```

##### 3. Set up a Virtual Environment (Optional but Recommended):
Create and activate a Python virtual environment:

```bash
python3 -m venv venv
source venv/bin/activate  # On macOS/Linux
.\venv\Scripts\activate   # On Windows
```

##### 4. Install Required Dependencies:
Install the necessary Python dependencies using `pip3`:

```bash
pip3 install -r requirements.txt
```

##### 5. Configure the `.env` File:
Copy the `.env.template` file into an `env/` sub-directory and rename it to `.env`:

```bash
mkdir env/
cp .env.template env/.env
```

Open the `.env` file and add your **FIREWORKS_API_KEY**:

```bash
FIREWORKS_API_KEY=<your_fireworks_api_key>
```

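Both pages resolve the key from `<project_root>/env/.env` via `python-dotenv`. Before launching the app, you can sanity-check that the key loads; a small, illustrative snippet (run from the project root):

```python
import os
from dotenv import load_dotenv

# The pages build this path relative to pages/, i.e. <project_root>/env/.env.
load_dotenv(os.path.join("env", ".env"))

assert os.getenv("FIREWORKS_API_KEY"), "FIREWORKS_API_KEY missing from env/.env"
print("Fireworks API key loaded.")
```
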
##### 6. Run the Streamlit App:
Finally, run the Streamlit app:

```bash
streamlit run home.py
```


##### 7. Explore the App:
- Open the app in your browser via the URL provided by Streamlit (typically `http://localhost:8501`).
- Navigate between the pages to compare models and adjust parameters.

### Example Prompts
Here are some example prompts you can try in the app:
- **Prompt 1**: "Describe the future of AI in 500 words."
- **Prompt 2**: "Write a short story about a time traveler who visits ancient Rome."
- **Prompt 3**: "Explain quantum computing in simple terms."
- **Prompt 4**: "Generate a recipe for a healthy vegan dinner."

### Fireworks API Documentation
To learn more about how to query models and interact with the Fireworks API, visit the [Fireworks API Documentation](https://docs.fireworks.ai/api-reference/post-chatcompletions).

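For reference, the call pattern this project uses with the pinned `fireworks-ai==0.15.3` client looks like the sketch below (the model ID is one entry from the app's `model_map`; the prompt and key handling are illustrative):

```python
import fireworks.client

fireworks.client.api_key = "your_fireworks_api_key"  # in this app, read from env/.env

# One chat-completion request against a serverless Fireworks model.
response = fireworks.client.ChatCompletion.create(
    model="accounts/fireworks/models/llama-v3p2-3b-instruct",
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms."}],
)
print(response.choices[0].message.content)
```
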
### Contributing
We welcome contributions to improve this app! To contribute, fork the repository, make your changes, and submit a pull request.

### License
This project is licensed under the MIT License.
home.py
ADDED
@@ -0,0 +1,85 @@
import streamlit as st
from PIL import Image

# Load images
logo_image = Image.open("img/fireworksai_logo.png")
bulbasaur_image = Image.open("img/bulbasaur.png")
charmander_image = Image.open("img/charmander.png")
squirtel_image = Image.open("img/squirtel.png")
ash_image = Image.open("img/ash.png")

# Set page configuration
st.set_page_config(page_title="Fireworks Model Comparison App", page_icon="🎇")

# Fireworks logo at the top
st.image(logo_image)

# Home page title and description
st.title("Fireworks Model Comparison App")

# Introduction with Pokémon image (Ash)
st.markdown("""
### Welcome to the Fireworks Model Comparison App!""")

st.image(ash_image, width=100)

st.markdown(""" This app allows you to interact with and compare various Large Language Models (LLMs) hosted on **Fireworks AI**. You can select from a range of models, adjust key model parameters, and run comparisons between their outputs. The app also enables you to evaluate results using an **LLM-as-a-judge** to provide an unbiased comparison of responses.""")

# API documentation link
st.markdown("""
[Explore Fireworks API Documentation](https://docs.fireworks.ai/api-reference/post-chatcompletions)
""")

# Objectives of the app

st.markdown("""
---
### Objectives of the App:
- **Compare Different Models**: Select models from Fireworks AI’s hosted collection and compare their outputs.
- **Modify Parameters**: Adjust settings like **Max Tokens**, **Temperature**, and **Sampling** methods to explore how different configurations affect outputs.
- **Evaluate Using LLM-as-a-Judge**: Generate responses and use another LLM to evaluate and provide a comparison.
- **Simple Interface**: The app uses **Streamlit**, making it easy to use, even for those without coding experience.
""")

# How to use the app, with Pokémon image (Squirtle)
st.image(squirtel_image, width=100)
st.markdown("""
---
### How to Use the App:
1. **Select a Model**: Use the dropdown menus to choose models for comparison.
2. **Provide a Prompt**: Enter a prompt that the models will use to generate a response.
3. **Adjust Parameters**: Fine-tune the settings for each model to explore how different configurations affect the results.
4. **Generate and Compare**: View the responses from multiple models side-by-side.
5. **Evaluate with LLM-as-a-Judge**: Use another model to compare and judge the outputs.
""")

# Explanation of the other pages, with Pokémon image (Bulbasaur)
st.image(bulbasaur_image, width=100)
st.markdown("""
---
### App Sections:
This Streamlit app consists of two key pages that help you interact with the Fireworks AI platform and perform model comparisons.

- **Page 1: Comparing LLMs**
    - On this page, you can compare the outputs of three selected LLMs from Fireworks AI by providing a single prompt.
    - The outputs are displayed side-by-side for easy comparison, and a selected LLM can act as a judge to evaluate the responses.

- **Page 2: Parameter Exploration for LLMs**
    - This page allows you to adjust various parameters like **Max Tokens**, **Temperature**, and **Sampling Methods** for LLMs.
    - You can provide a prompt and see how different parameter configurations affect the output for each model.
    - The LLM-as-a-Judge is also used to compare and evaluate the generated responses.
""")
st.image(charmander_image, width=100)

# Background information about Fireworks models

st.markdown("""
---
### Fireworks AI Models:
Fireworks AI provides access to a variety of Large Language Models (LLMs) that you can query and experiment with, including:

- **Text Models**: These models are designed for tasks such as text generation, completion, and Q&A.
- **Model Parameters**: By adjusting parameters such as temperature, top-p, and top-k, you can influence the behavior of the models and the creativity or focus of their outputs.

For more information, check out the [Fireworks API Documentation](https://docs.fireworks.ai/api-reference/post-chatcompletions) and learn how to query different models using Fireworks' Python Client.
""")
img/ash.png
ADDED
img/bulbasaur.png
ADDED
img/charmander.png
ADDED
img/fireworksai_logo.png
ADDED
img/home_page_1.png
ADDED
img/home_page_2.png
ADDED
img/page_1_a.png
ADDED
img/page_1_b.png
ADDED
img/page_1_c.png
ADDED
img/page_1_empty.png
ADDED
img/page_2_a.png
ADDED
img/page_2_b.png
ADDED
img/page_2_c.png
ADDED
img/page_2_empty.png
ADDED
img/squirtel.png
ADDED
pages/1_Comparing_LLMs.py
ADDED
@@ -0,0 +1,185 @@
from dotenv import load_dotenv
import os
from PIL import Image
import streamlit as st
import fireworks.client

st.set_page_config(page_title="LLM Comparison Tool", page_icon="🎇")
st.title("LLM-as-a-judge: Comparing LLMs using Fireworks")
st.write("A light introduction to how easy it is to swap LLMs and how to use the Fireworks Python client")

# Clear the cache before starting
st.cache_data.clear()

# Specify the path to the .env file in the env/ directory
dotenv_path = os.path.join(os.path.dirname(__file__), '..', 'env', '.env')

# Load the .env file from the specified path
load_dotenv(dotenv_path)

# Get the Fireworks API key from the environment variable
fireworks_api_key = os.getenv("FIREWORKS_API_KEY")

if not fireworks_api_key:
    raise ValueError("No API key found in the .env file. Please add your FIREWORKS_API_KEY to the .env file.")

# Load the images
logo_image = Image.open("img/fireworksai_logo.png")
ash_image = Image.open("img/ash.png")
bulbasaur_image = Image.open("img/bulbasaur.png")
squirtel_image = Image.open("img/squirtel.png")
charmander_image = Image.open("img/charmander.png")

st.divider()
# Streamlit app
st.subheader("Fireworks Playground")

st.write("Fireworks AI is a platform that offers serverless and scalable AI models.")
st.write("👉 Learn more here: [Fireworks Serverless Models](https://fireworks.ai/models?show=Serverless)")
st.divider()

# Model choices shared by all four selectboxes
model_options = [
    "Text: Meta Llama 3.1 Instruct - 70B",
    "Text: Meta Llama 3.1 Instruct - 8B",
    "Text: Meta Llama 3.2 Instruct - 3B",
    "Text: Gemma 2 Instruct - 9B",
    "Text: Mixtral MoE Instruct - 8x22B",
    "Text: Mixtral MoE Instruct - 8x7B",
    "Text: MythoMax L2 - 13B"
]

# Map display names to their Fireworks model identifiers
model_map = {
    "Text: Meta Llama 3.1 Instruct - 70B": "accounts/fireworks/models/llama-v3p1-70b-instruct",
    "Text: Meta Llama 3.1 Instruct - 8B": "accounts/fireworks/models/llama-v3p1-8b-instruct",
    "Text: Meta Llama 3.2 Instruct - 3B": "accounts/fireworks/models/llama-v3p2-3b-instruct",
    "Text: Gemma 2 Instruct - 9B": "accounts/fireworks/models/gemma2-9b-it",
    "Text: Mixtral MoE Instruct - 8x22B": "accounts/fireworks/models/mixtral-8x22b-instruct",
    "Text: Mixtral MoE Instruct - 8x7B": "accounts/fireworks/models/mixtral-8x7b-instruct",
    "Text: MythoMax L2 - 13B": "accounts/fireworks/models/mythomax-l2-13b"
}

# Sidebar for selecting models
with st.sidebar:
    st.image(logo_image)

    st.write("Select three models to compare their outputs:")

    st.image(bulbasaur_image, width=80)
    option_1 = st.selectbox("Select Model 1", model_options, index=2)  # Default to Meta Llama 3.2 Instruct - 3B

    st.image(charmander_image, width=80)
    option_2 = st.selectbox("Select Model 2", model_options, index=5)  # Default to Mixtral MoE Instruct - 8x7B

    st.image(squirtel_image, width=80)
    option_3 = st.selectbox("Select Model 3", model_options, index=0)  # Default to Meta Llama 3.1 Instruct - 70B

    # Dropdown to select the LLM that will perform the comparison
    st.image(ash_image, width=80)
    comparison_llm = st.selectbox("Select Comparison Model", model_options, index=5)  # Default to Mixtral MoE Instruct - 8x7B

os.environ["FIREWORKS_API_KEY"] = fireworks_api_key

# Helper text for the prompt
st.markdown("### Enter your prompt below to generate responses:")

prompt = st.text_input("Prompt", label_visibility="collapsed")
st.divider()

# Function to generate a response from a text model
def generate_text_response(model_name, prompt):
    return fireworks.client.ChatCompletion.create(
        model=model_name,
        messages=[{
            "role": "user",
            "content": prompt,
        }]
    )

# Function to compare the three responses using the selected LLM
def compare_responses(response_1, response_2, response_3, comparison_model):
    comparison_prompt = f"Compare the following three responses:\n\nResponse 1: {response_1}\n\nResponse 2: {response_2}\n\nResponse 3: {response_3}\n\nProvide a succinct comparison."

    comparison_response = fireworks.client.ChatCompletion.create(
        model=comparison_model,  # Use the selected LLM for comparison
        messages=[{
            "role": "user",
            "content": comparison_prompt,
        }]
    )

    return comparison_response.choices[0].message.content


# If Generate button is clicked
if st.button("Generate"):
    if not fireworks_api_key.strip() or not prompt.strip():
        st.error("Please provide the missing fields.")
    else:
        try:
            with st.spinner("Please wait..."):
                fireworks.client.api_key = fireworks_api_key

                # Create three columns for side-by-side comparison
                col1, col2, col3 = st.columns(3)

                # Model 1
                with col1:
                    st.subheader(f"Model 1: {option_1}")
                    st.image(bulbasaur_image)
                    response_1 = generate_text_response(model_map[option_1], prompt)
                    st.success(response_1.choices[0].message.content)

                # Model 2
                with col2:
                    st.subheader(f"Model 2: {option_2}")
                    st.image(charmander_image)
                    response_2 = generate_text_response(model_map[option_2], prompt)
                    st.success(response_2.choices[0].message.content)

                # Model 3
                with col3:
                    st.subheader(f"Model 3: {option_3}")
                    st.image(squirtel_image)
                    response_3 = generate_text_response(model_map[option_3], prompt)
                    st.success(response_3.choices[0].message.content)

                # Visual divider between model responses and comparison
                st.divider()

                # Generate a comparison of the three responses using the selected LLM
                comparison = compare_responses(
                    response_1.choices[0].message.content,
                    response_2.choices[0].message.content,
                    response_3.choices[0].message.content,
                    model_map[comparison_llm]
                )

                # Display the comparison
                st.subheader("Comparison of the Three Responses:")
                st.image(ash_image)
                st.write(comparison)

        except Exception as e:
            st.exception(f"Exception: {e}")
pages/2_Parameter_Exploration_for_LLMs.py
ADDED
@@ -0,0 +1,293 @@
from dotenv import load_dotenv
import os
from PIL import Image
import random
import streamlit as st
import fireworks.client

# Set page configuration
st.set_page_config(page_title="LLM Parameters Comparison", page_icon="🎇")
st.title("Understanding the Completions Chat API parameters")
st.write("Compare LLM responses with different sets of parameters and evaluate the results using an LLM-as-a-judge.")
st.markdown("Check out our [Chat Completions API Documentation](https://docs.fireworks.ai/api-reference/post-chatcompletions) for more information on the parameters.")

# Add expandable section for parameter descriptions
with st.expander("Parameter Descriptions", expanded=False):
    st.markdown("""
    **Max Tokens**: Maximum number of tokens the model can generate.<br>
    **Prompt Truncate Length**: Number of tokens from the input prompt considered.<br>
    **Temperature**: Controls randomness of the output.<br>
    **Top-p (Nucleus Sampling)**: Cumulative probability of token selection.<br>
    **Top-k**: Limits the number of tokens sampled.<br>
    **Frequency Penalty**: Discourages repeated words or phrases.<br>
    **Presence Penalty**: Encourages new topics.<br>
    **Stop Sequence**: Defines when to stop generating tokens.
    """, unsafe_allow_html=True)

# Load environment variables
dotenv_path = os.path.join(os.path.dirname(__file__), '..', 'env', '.env')
load_dotenv(dotenv_path)

# Get the Fireworks API key from environment variables
fireworks_api_key = os.getenv("FIREWORKS_API_KEY")
if not fireworks_api_key:
    raise ValueError("No API key found in the .env file. Please add your FIREWORKS_API_KEY to the .env file.")

os.environ["FIREWORKS_API_KEY"] = fireworks_api_key

# Load the images
logo_image = Image.open("img/fireworksai_logo.png")
bulbasaur_image = Image.open("img/bulbasaur.png")
charmander_image = Image.open("img/charmander.png")
squirtel_image = Image.open("img/squirtel.png")
ash_image = Image.open("img/ash.png")

# Map models to their respective identifiers
model_map = {
    "Text: Meta Llama 3.1 Instruct - 70B": "accounts/fireworks/models/llama-v3p1-70b-instruct",
    "Text: Meta Llama 3.1 Instruct - 8B": "accounts/fireworks/models/llama-v3p1-8b-instruct",
    "Text: Meta Llama 3.2 Instruct - 3B": "accounts/fireworks/models/llama-v3p2-3b-instruct",
    "Text: Gemma 2 Instruct - 9B": "accounts/fireworks/models/gemma2-9b-it",
    "Text: Mixtral MoE Instruct - 8x22B": "accounts/fireworks/models/mixtral-8x22b-instruct",
    "Text: Mixtral MoE Instruct - 8x7B": "accounts/fireworks/models/mixtral-8x7b-instruct",
    "Text: MythoMax L2 - 13B": "accounts/fireworks/models/mythomax-l2-13b"
}

# Function to generate a response from a text model with parameters
def generate_text_response(model_name, prompt, params):
    return fireworks.client.ChatCompletion.create(
        model=model_name,
        messages=[{
            "role": "user",
            "content": prompt,
        }],
        max_tokens=params["max_tokens"],
        temperature=params["temperature"],
        top_p=params["top_p"],
        top_k=params["top_k"],
        frequency_penalty=params["frequency_penalty"],
        presence_penalty=params["presence_penalty"],
        stop=params["stop"]
    )

# Function to compare the three responses using the selected LLM
def compare_responses(response_1, response_2, response_3, comparison_model):
    comparison_prompt = f"Compare the following three responses:\n\nResponse 1: {response_1}\n\nResponse 2: {response_2}\n\nResponse 3: {response_3}\n\nProvide a succinct comparison."

    comparison_response = fireworks.client.ChatCompletion.create(
        model=comparison_model,
        messages=[{
            "role": "user",
            "content": comparison_prompt,
        }]
    )

    return comparison_response.choices[0].message.content

# Slightly randomize parameters for sets 2 and 3
def randomize_params():
    return {
        "max_tokens": random.randint(100, 200),
        "prompt_truncate_len": random.randint(100, 200),
        "temperature": round(random.uniform(0.7, 1.3), 2),
        "top_p": round(random.uniform(0.8, 1.0), 2),
        "top_k": random.randint(30, 70),
        "frequency_penalty": round(random.uniform(0, 1), 2),
        "presence_penalty": round(random.uniform(0, 1), 2),
        "n": 1,
        "stop": None
    }

# Sidebar for LLM selection, prompt, and judge LLM
with st.sidebar:
    st.image(logo_image)

    # Select the model for generating responses
    st.subheader("Select LLM for Generating Responses")
    model = st.selectbox("Select a model for generating responses:", list(model_map.keys()), index=2)

    # Placeholder prompts
    suggested_prompts = [
        "Prompt 1: Describe the future of AI.",
        "Prompt 2: Write a short story about a cat who becomes the mayor of a small town",
        "Prompt 3: Write a step-by-step guide to making pancakes from scratch.",
        "Prompt 4: Generate a grocery list and meal plan for a vegetarian family of four for one week.",
        "Prompt 5: Generate a story in which a time traveler goes back to Ancient Greece, accidentally introduces modern memes to philosophers like Socrates and Plato, and causes chaos in the philosophical discourse.",
        "Prompt 6: Create a timeline where dinosaurs never went extinct and developed their own civilizations, and describe their technology and cultural achievements in the year 2024.",
        "Prompt 7: Explain the concept of Gödel’s incompleteness theorems in the form of a Dr. Seuss poem, using at least 10 distinct rhyme schemes."
    ]

    # Selectbox for suggested prompts
    selected_prompt = st.selectbox("Choose a suggested prompt:", suggested_prompts)

    # Input box where the user can edit the selected prompt or enter a custom one
    prompt = st.text_input("Prompt", value=selected_prompt)

    # Select the LLM for judging the responses
    st.subheader("Select LLM for Judge")
    judge_llm = st.selectbox("Select a model to act as the judge:", list(model_map.keys()), index=2)

# Create three columns for parameter sets side-by-side
col1, col2, col3 = st.columns(3)

# Parameters for Output 1 (Bulbasaur image)
with col1:
    st.subheader("Parameter Set #1")
    st.image(bulbasaur_image, width=100)  # Bulbasaur image
    max_tokens_1 = st.slider("Max Tokens", 50, 1000, 123)
    prompt_truncate_len_1 = st.slider("Prompt Truncate Length", 50, 200, 123)
    temperature_1 = st.slider("Temperature", 0.1, 2.0, 1.0)
    top_p_1 = st.slider("Top-p", 0.0, 1.0, 1.0)
    top_k_1 = st.slider("Top-k", 0, 100, 50)
    frequency_penalty_1 = st.slider("Frequency Penalty", 0.0, 2.0, 0.0)
    presence_penalty_1 = st.slider("Presence Penalty", 0.0, 2.0, 0.0)
    stop_1 = st.text_input("Stop Sequence", "")

    params_1 = {
        "max_tokens": max_tokens_1,
        "prompt_truncate_len": prompt_truncate_len_1,
        "temperature": temperature_1,
        "top_p": top_p_1,
        "top_k": top_k_1,
        "frequency_penalty": frequency_penalty_1,
        "presence_penalty": presence_penalty_1,
        "n": 1,
        "stop": stop_1 if stop_1 else None
    }

# Parameters for Output 2 (Charmander image)
with col2:
    st.subheader("Parameter Set #2")
    st.image(charmander_image, width=100)  # Charmander image
    use_random_2 = st.checkbox("Randomize parameters for Output 2", value=True)
    if use_random_2:
        params_2 = randomize_params()
        st.write("**Random Parameters for Output 2:**")
        st.json(params_2)  # Display random params
    else:
        max_tokens_2 = st.slider("Max Tokens (Output 2)", 50, 1000, 150)
        prompt_truncate_len_2 = st.slider("Prompt Truncate Length (Output 2)", 50, 200, 150)
        temperature_2 = st.slider("Temperature (Output 2)", 0.1, 2.0, 0.9)
        top_p_2 = st.slider("Top-p (Output 2)", 0.0, 1.0, 0.95)
        top_k_2 = st.slider("Top-k (Output 2)", 0, 100, 45)
        frequency_penalty_2 = st.slider("Frequency Penalty (Output 2)", 0.0, 2.0, 0.1)
        presence_penalty_2 = st.slider("Presence Penalty (Output 2)", 0.0, 2.0, 0.1)
        stop_2 = st.text_input("Stop Sequence (Output 2)", "")

        params_2 = {
            "max_tokens": max_tokens_2,
            "prompt_truncate_len": prompt_truncate_len_2,
            "temperature": temperature_2,
            "top_p": top_p_2,
            "top_k": top_k_2,
            "frequency_penalty": frequency_penalty_2,
            "presence_penalty": presence_penalty_2,
            "n": 1,
            "stop": stop_2 if stop_2 else None
        }

# Parameters for Output 3 (Squirtle image)
with col3:
    st.subheader("Parameter Set #3")
    st.image(squirtel_image, width=100)  # Squirtle image
    use_random_3 = st.checkbox("Randomize parameters for Output 3", value=True)
    if use_random_3:
        params_3 = randomize_params()
        st.write("**Random Parameters for Output 3:**")
        st.json(params_3)  # Display random params
    else:
        max_tokens_3 = st.slider("Max Tokens (Output 3)", 50, 1000, 180)
        prompt_truncate_len_3 = st.slider("Prompt Truncate Length (Output 3)", 50, 200, 140)
        temperature_3 = st.slider("Temperature (Output 3)", 0.1, 2.0, 1.1)
        top_p_3 = st.slider("Top-p (Output 3)", 0.0, 1.0, 0.85)
        top_k_3 = st.slider("Top-k (Output 3)", 0, 100, 60)
        frequency_penalty_3 = st.slider("Frequency Penalty (Output 3)", 0.0, 2.0, 0.05)
        presence_penalty_3 = st.slider("Presence Penalty (Output 3)", 0.0, 2.0, 0.2)
        stop_3 = st.text_input("Stop Sequence (Output 3)", "")

        params_3 = {
            "max_tokens": max_tokens_3,
            "prompt_truncate_len": prompt_truncate_len_3,
            "temperature": temperature_3,
            "top_p": top_p_3,
            "top_k": top_k_3,
            "frequency_penalty": frequency_penalty_3,
            "presence_penalty": presence_penalty_3,
            "n": 1,
            "stop": stop_3 if stop_3 else None
        }

# Divider above generate button
st.divider()

# Generate button and logic
st.subheader("Just hit play")
st.write("See the effect of selecting parameters on the responses.")


if st.button("Generate"):
    if not fireworks_api_key.strip() or not prompt.strip():
        st.error("Please provide the missing fields.")
    else:
        try:
            with st.spinner("Please wait..."):
                fireworks.client.api_key = fireworks_api_key

                # Generate responses for each set of parameters
                response_1 = generate_text_response(model_map[model], prompt, params_1)
                response_2 = generate_text_response(model_map[model], prompt, params_2)
                response_3 = generate_text_response(model_map[model], prompt, params_3)

                # Display results in the main section
                col1, col2, col3 = st.columns(3)

                with col1:
                    st.subheader("Response 1")
                    st.image(bulbasaur_image, width=100)
                    st.success(response_1.choices[0].message.content)

                with col2:
                    st.subheader("Response 2")
                    st.image(charmander_image, width=100)
                    st.success(response_2.choices[0].message.content)

                with col3:
                    st.subheader("Response 3")
                    st.image(squirtel_image, width=100)
                    st.success(response_3.choices[0].message.content)

                st.divider()

                # Use the selected LLM as the judge and display Ash image
                st.subheader("LLM-as-a-Judge Comparison")
                st.image(ash_image, width=100)
                comparison = compare_responses(
                    response_1.choices[0].message.content,
                    response_2.choices[0].message.content,
                    response_3.choices[0].message.content,
                    model_map[judge_llm]
                )

                st.write(comparison)

        except Exception as e:
            st.exception(f"Exception: {e}")

# Divider below generate button
st.divider()
requirements.txt
ADDED
@@ -0,0 +1,246 @@
aiohappyeyeballs==2.4.0
aiohttp==3.10.5
aiosignal==1.3.1
alembic==1.13.2
altair==5.4.1
annotated-types==0.7.0
anthropic==0.29.0
anyio==4.6.0
appdirs==1.4.4
appnope==0.1.4
asgiref==3.8.1
asttokens==2.4.1
attrs==24.2.0
backoff==2.2.1
bcrypt==4.2.0
beautifulsoup4==4.12.3
blinker==1.8.2
boto3==1.35.24
botocore==1.35.24
bs4==0.0.2
build==1.2.2
cachetools==5.5.0
certifi==2024.8.30
cffi==1.17.1
chardet==5.2.0
charset-normalizer==3.3.2
chroma-hnswlib==0.7.3
chromadb==0.4.24
click==8.1.7
cohere==5.9.4
coloredlogs==15.0.1
comm==0.2.2
crewai==0.32.2
cryptography==43.0.1
dataclasses-json==0.6.7
debugpy==1.8.5
decorator==5.1.1
deepdiff==8.0.1
defusedxml==0.7.1
Deprecated==1.2.14
dirtyjson==1.0.8
distro==1.9.0
docstring_parser==0.16
durationpy==0.7
embedchain==0.1.109
emoji==2.13.0
executing==2.1.0
faiss-cpu==1.8.0.post1
fastapi==0.115.0
fastavro==1.9.7
filelock==3.16.1
filetype==1.2.0
fireworks-ai==0.15.3
flatbuffers==24.3.25
frozendict==2.4.4
frozenlist==1.4.1
fsspec==2024.9.0
gitdb==4.0.11
GitPython==3.1.43
google-api-core==2.20.0
google-auth==2.35.0
google-cloud-aiplatform==1.67.1
google-cloud-bigquery==3.25.0
google-cloud-core==2.4.1
google-cloud-resource-manager==1.12.5
google-cloud-storage==2.18.2
google-crc32c==1.6.0
google-resumable-media==2.7.2
google_search_results==2.4.2
googleapis-common-protos==1.65.0
gptcache==0.1.44
greenlet==3.1.1
grpc-google-iam-v1==0.13.1
grpcio==1.66.1
grpcio-status==1.62.3
h11==0.14.0
html5lib==1.1
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.2
httpx-sse==0.4.0
huggingface-hub==0.25.0
humanfriendly==10.0
idna==3.10
importlib_metadata==8.4.0
importlib_resources==6.4.5
instructor==1.3.3
ipykernel==6.29.5
ipython==8.27.0
jedi==0.19.1
Jinja2==3.1.4
jiter==0.4.2
jmespath==1.0.1
joblib==1.4.2
jsonpatch==1.33
jsonpath-python==1.0.6
jsonpointer==3.0.0
jsonref==1.1.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter_client==8.6.3
jupyter_core==5.7.2
kubernetes==31.0.0
langchain==0.1.20
langchain-anthropic==0.1.10
langchain-cohere==0.1.5
langchain-community==0.0.38
langchain-core==0.1.52
langchain-openai==0.1.7
langchain-text-splitters==0.0.2
langdetect==1.0.9
langsmith==0.1.125
llama-cloud==0.0.6
llama-index-core==0.10.50.post1
llama-index-readers-file==0.1.25
llama-parse==0.4.4
lxml==5.3.0
Mako==1.3.5
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.22.0
matplotlib-inline==0.1.7
mdurl==0.1.2
mmh3==5.0.1
monotonic==1.6
mpmath==1.3.0
multidict==6.1.0
multitasking==0.0.11
mypy-extensions==1.0.0
narwhals==1.8.2
nest-asyncio==1.6.0
networkx==3.3
nltk==3.9.1
numpy==1.26.4
oauthlib==3.2.2
onnxruntime==1.19.2
openai==1.47.0
opentelemetry-api==1.27.0
opentelemetry-exporter-otlp-proto-common==1.27.0
opentelemetry-exporter-otlp-proto-grpc==1.27.0
opentelemetry-exporter-otlp-proto-http==1.27.0
opentelemetry-instrumentation==0.48b0
opentelemetry-instrumentation-asgi==0.48b0
opentelemetry-instrumentation-fastapi==0.48b0
opentelemetry-proto==1.27.0
opentelemetry-sdk==1.27.0
opentelemetry-semantic-conventions==0.48b0
opentelemetry-util-http==0.48b0
orderly-set==5.2.2
orjson==3.10.7
overrides==7.7.0
packaging==23.2
pandas==2.2.3
parameterized==0.9.0
parso==0.8.4
peewee==3.17.6
pexpect==4.9.0
pillow==10.4.0
platformdirs==4.3.6
posthog==3.6.6
prettytable==3.11.0
prompt_toolkit==3.0.47
proto-plus==1.24.0
protobuf==4.25.5
psutil==6.0.0
ptyprocess==0.7.0
pulsar-client==3.5.0
pure_eval==0.2.3
pyarrow==17.0.0
pyasn1==0.6.1
pyasn1_modules==0.4.1
pycparser==2.22
pydantic==2.9.2
pydantic_core==2.23.4
pydeck==0.9.1
Pygments==2.18.0
pypdf==4.3.1
PyPika==0.48.9
pyproject_hooks==1.1.0
pysbd==0.3.4
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-iso639==2024.4.27
python-magic==0.4.27
pytz==2024.2
PyYAML==6.0.2
pyzmq==26.2.0
rapidfuzz==3.9.7
referencing==0.35.1
regex==2023.12.25
requests==2.32.3
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
rich==13.8.1
rpds-py==0.20.0
rsa==4.9
s3transfer==0.10.2
safetensors==0.4.5
schema==0.7.7
scikit-learn==1.5.2
scipy==1.14.1
sec-api==1.0.18
sentence-transformers==3.1.1
setuptools==75.1.0
shapely==2.0.6
shellingham==1.5.4
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.6
SQLAlchemy==2.0.35
stack-data==0.6.3
starlette==0.38.6
streamlit==1.38.0
striprtf==0.0.26
sympy==1.13.3
tabulate==0.9.0
tenacity==8.5.0
threadpoolctl==3.5.0
tiktoken==0.7.0
tokenizers==0.19.1
toml==0.10.2
torch==2.4.1
tornado==6.4.1
tqdm==4.66.5
traitlets==5.14.3
transformers==4.44.2
typer==0.12.5
types-requests==2.32.0.20240914
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
unstructured==0.14.8
unstructured-client==0.25.9
urllib3==2.2.3
uvicorn==0.30.6
uvloop==0.20.0
watchfiles==0.24.0
wcwidth==0.2.13
webencodings==0.5.1
websocket-client==1.8.0
websockets==13.1
wrapt==1.16.0
yarl==1.11.1
yfinance==0.2.40
zipp==3.20.2