import streamlit as st
st.set_page_config(
    page_title="JobFair: Fairness Benchmark",
    page_icon="👋",
)
st.title('JobFair: A Benchmark for Fairness in LLM Employment Decision-Making')
st.write("Welcome to JobFair! This benchmark is designed to evaluate the fairness of language models in employment decision-making. Our goal is to provide a comprehensive tool for analyzing potential biases in how language models score resumes and make hiring recommendations.")
st.markdown(
"""
## About JobFair
The JobFair benchmark enables users to:
- **Upload and process** resumes to be evaluated by language models.
- **Analyze fairness** through various statistical tests, correlations, and divergences.
- **Download detailed evaluation results** for further review and reporting.
### Key Features
- **Fairness Analysis**: Perform a variety of statistical tests to uncover potential biases in language model evaluations.
- **Comprehensive Reporting**: Generate detailed reports on the fairness of LLMs, including visualizations and downloadable data.
- **User-Friendly Interface**: Easily upload data, run analyses, and download results through an intuitive web interface.
### How to Use
1. **Upload Data**: Start by uploading a CSV file containing the resumes and their respective scores.
2. **Run Evaluations**: Use the provided tools to perform statistical analyses and visualize the results.
3. **Download Results**: Export the analysis results for further examination and reporting.

We hope JobFair helps you make better-informed and fairer employment decisions when using language models.
"""
)
# Sidebar content
st.sidebar.title("Demos")
st.sidebar.subheader("Injection Demo")
st.sidebar.markdown(
"""
In this demo, you can upload a dataset of resumes and use our language models to process and score them based on various parameters.
- **Model Settings**: Configure your model settings by selecting the type of agent (GPTAgent or AzureAgent), and specifying the API key, endpoint URL, model name, temperature, and max tokens.
- **Data Upload**: Choose to upload your own CSV file or use an example dataset.
- **Process Data**: Enter the relevant details such as occupation, group name, privilege label, and protect label. Specify the number of runs and process the data to get the model's scores.
- **Download Results**: After processing, download the generated results as a CSV file.
"""
)
st.sidebar.subheader("Evaluation Demo")
st.sidebar.markdown(
"""
In this demo, you can evaluate the fairness of the scores generated by the language models.
- **Upload Results**: Upload the CSV file containing the processed results from the injection demo.
- **Statistical Tests**: Perform a variety of statistical tests to evaluate potential biases in the scores.
- **Correlations and Divergences**: Calculate correlations and divergences to further analyze the fairness of the results.
- **Download Evaluation**: Download the comprehensive evaluation results for further analysis.
"""
)
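
# Illustrative sketch only, not part of the original app: one simple way the
# "Statistical Tests" step described in the sidebar could start comparing
# scores between the privileged and protected groups. The column names
# "group" and "score", the helper name mean_score_gap, and the choice of a
# plain difference in means are all assumptions; the Evaluation Demo may use
# different columns and different tests. The function is only defined here,
# never called, so it does not change what this page renders.
import csv
import io
from statistics import mean


def mean_score_gap(csv_text, privileged_label, protected_label,
                   group_col="group", score_col="score"):
    """Return mean(privileged scores) - mean(protected scores) from CSV text."""
    scores = {privileged_label: [], protected_label: []}
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row[group_col]
        if label in scores:
            # Rows belonging to any other group are ignored.
            scores[label].append(float(row[score_col]))
    if not scores[privileged_label] or not scores[protected_label]:
        raise ValueError("both groups need at least one scored row")
    return mean(scores[privileged_label]) - mean(scores[protected_label])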