llm-eval-lab / app.py

elbasri

Initial streamlit app - fikran

d69c5d7 3 months ago


	import streamlit as st

	st.set_page_config(page_title="LLM Evaluation Lab – by Fikran", layout="wide")

	st.title("🔬 LLM Evaluation Lab – Real-world Testing by Fikran")
	st.markdown(
	"""
	Welcome to the interactive showcase for evaluating Large Language Models (LLMs) on real-world user interactions.

	🚀 What this is: A companion demo to [Fikran](https://fikran.com), a multilingual platform where people interact with AI agents in a natural, dynamic environment.

	✅ What we offer: Model testing, benchmarking, and prompt refinement in live contexts, not only on synthetic benchmarks.

	---
	"""
	)

	st.header("🎯 What You Can Do Here")
	st.markdown(
	"""
	- ✅ Understand the real-world performance of your LLMs.
	- 📊 Track how they behave across different user queries and scenarios.
	- 🔧 Apply prompt engineering, LoRA-based customization, and dialogue tuning.
	- 📂 Export insights as PDF reports or structured logs.

	This system is perfect for:
	- Researchers evaluating fine-tuned models
	- Product teams testing chatbot behavior before deployment
	- Prompt engineers experimenting with multi-agent setups
	"""
	)

	st.header("🛠️ Try It or Order a Full Evaluation")

	col1, col2 = st.columns(2)

	with col1:
	    if st.button("🌐 Try Fikran Now"):
	        st.markdown("[Click to explore Fikran](https://www.fikran.com/terms/about-us?lang=english)", unsafe_allow_html=True)

	with col2:
	    if st.button("📦 Order a Full Evaluation"):
	        st.markdown("[See the Service on Upwork](https://www.upwork.com/services/product/development-it-a-real-world-evaluation-of-your-llm-in-a-dynamic-interactive-environment-1909379987479305454)", unsafe_allow_html=True)

	st.markdown("---")

	st.info("This app is maintained by [Abdennacer Elbasri](https://huggingface.co/elbasri), founder of Fikran.")