import streamlit as st

st.set_page_config(page_title="LLM Evaluation Lab – by Fikran", layout="wide")

st.title("🔬 LLM Evaluation Lab – Real-world Testing by Fikran")
st.markdown(
    """
    Welcome to the interactive showcase for evaluating **Large Language Models** (LLMs) using real-world user interactions!

    🚀 **What this is**: A companion demo to [Fikran](https://fikran.com), a multilingual platform where people interact with AI agents in a natural, dynamic environment.  
    ✅ We offer model testing, benchmarking, and prompt refinement in live contexts, not just synthetic benchmarks.

    ---
    """
)

st.header("🎯 What You Can Do Here")
st.markdown(
    """
    - ✅ Understand the **real-world performance** of your LLMs.
    - 📊 Track how they behave across different user queries and scenarios.
    - 🔧 Apply **prompt engineering**, **LoRA-based customization**, and **dialogue tuning**.
    - 📂 Export insights as PDF reports or structured logs.

    This system is perfect for:
    - Researchers evaluating fine-tuned models  
    - Product teams testing chatbot behavior before deployment  
    - Prompt engineers experimenting with multi-agent setups  
    """
)

st.header("🛠️ Try It or Order a Full Evaluation")

col1, col2 = st.columns(2)

with col1:
    # Reveal the link after the button is clicked; a plain Markdown link needs no raw HTML.
    if st.button("🌐 Try Fikran Now"):
        st.markdown("[Click to explore Fikran](https://www.fikran.com/terms/about-us?lang=english)")

with col2:
    # Same pattern: the Upwork listing link appears only after the button is clicked.
    if st.button("📦 Order a Full Evaluation"):
        st.markdown("[See the Service on Upwork](https://www.upwork.com/services/product/development-it-a-real-world-evaluation-of-your-llm-in-a-dynamic-interactive-environment-1909379987479305454)")

st.markdown("---")

st.info("This app is maintained by [Abdennacer Elbasri](https://huggingface.co/elbasri), founder of Fikran.")