Spaces: elbasri/llm-eval-lab

elbasri committed 30 days ago

Commit d69c5d7 (1 parent: 3211b80)

Initial streamlit app - fikran

Files changed (1)

app.py +47 -0

app.py ADDED

	@@ -0,0 +1,47 @@

+import streamlit as st
+st.set_page_config(page_title="LLM Evaluation Lab – by Fikran", layout="wide")
+st.title("🔬 LLM Evaluation Lab – Real-world Testing by Fikran")
+st.markdown(
+    """
+    Welcome to the interactive showcase for evaluating **Large Language Models** (LLMs) using real-world user interactions!
+
+    🚀 **What this is**: A companion demo to [Fikran](https://fikran.com), a multilingual platform where people interact with AI agents in a natural, dynamic environment.
+
+    ✅ We offer model testing, benchmarking, and prompt refinement in live contexts, not just on synthetic benchmarks.
+
+    ---
+    """
+)
+st.header("🎯 What You Can Do Here")
+st.markdown(
+    """
+    - ✅ Understand the **real-world performance** of your LLMs.
+    - 📊 Track how they behave across different user queries and scenarios.
+    - 🔧 Apply **prompt engineering**, **LoRA-based customization**, and **dialogue tuning**.
+    - 📂 Export insights as PDF reports or structured logs.
+
+    This system is perfect for:
+
+    - Researchers evaluating fine-tuned models
+    - Product teams testing chatbot behavior before deployment
+    - Prompt engineers experimenting with multi-agent setups
+    """
+)
+st.header("🛠️ Try It or Order a Full Evaluation")
+col1, col2 = st.columns(2)
+with col1:
+    if st.button("🌐 Try Fikran Now"):
+        st.markdown("[Click to explore Fikran](https://www.fikran.com/terms/about-us?lang=english)")
+with col2:
+    if st.button("📦 Order a Full Evaluation"):
+        st.markdown("[See the Service on Upwork](https://www.upwork.com/services/product/development-it-a-real-world-evaluation-of-your-llm-in-a-dynamic-interactive-environment-1909379987479305454)")
+st.markdown("---")
+st.info("This app is maintained by [Abdennacer Elbasri](https://huggingface.co/elbasri), founder of Fikran.")
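
Review note: in the committed code each outbound link appears only after its `st.button` is clicked, so visitors need two clicks, and the link is dropped on the next rerun. A possible alternative, not part of this commit, is to render the links directly with `st.link_button`, which recent Streamlit releases provide. The sketch below is an assumption about how that might look. `LINKS` and `render_links` are hypothetical names, and the Streamlit module is passed in as a parameter so the helper can be exercised without a running app.

```python
# Hypothetical refactor (not in the commit): keep the outbound links in one
# place and render each as a direct link button, one per column.
LINKS = [
    ("🌐 Try Fikran Now",
     "https://www.fikran.com/terms/about-us?lang=english"),
    ("📦 Order a Full Evaluation",
     "https://www.upwork.com/services/product/development-it-a-real-world-"
     "evaluation-of-your-llm-in-a-dynamic-interactive-environment-"
     "1909379987479305454"),
]


def render_links(st, links=LINKS):
    """Render one link button per column.

    `st` is expected to be the imported `streamlit` module (or any object
    exposing `columns(n)` whose items expose `link_button(label, url)`).
    """
    columns = st.columns(len(links))
    for column, (label, url) in zip(columns, links):
        # Opens the URL directly; no intermediate button click is needed.
        column.link_button(label, url)
```

In `app.py` this would replace the two `with col…:` blocks with `render_links(st)`, assuming the installed Streamlit version includes `st.link_button`.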