import streamlit as st

st.set_page_config(
    page_title="JobFair: Fairness Benchmark",
    page_icon="👋",
)

st.title('JobFair: A Benchmark for Fairness in LLM Employment Decision-Making')

st.write(
    "Welcome to JobFair! This benchmark is designed to evaluate the fairness of "
    "language models in employment decision-making. Our goal is to provide a "
    "comprehensive tool for analyzing potential biases in how language models "
    "score resumes and make hiring recommendations."
)

# Main page content: overview, key features, and usage steps
st.markdown(
    """
## About JobFair

The JobFair benchmark enables users to:

- **Upload and process** resumes to be evaluated by language models.
- **Analyze fairness** through various statistical tests, correlations, and divergences.
- **Download detailed evaluation results** for further review and reporting.

### Key Features

- **Fairness Analysis**: Perform a variety of statistical tests to uncover potential biases in language model evaluations.
- **Comprehensive Reporting**: Generate detailed reports on the fairness of LLMs, including visualizations and downloadable data.
- **User-Friendly Interface**: Easily upload data, run analyses, and download results through an intuitive web interface.

### How to Use

1. **Upload Data**: Start by uploading a CSV file containing the resumes and their respective scores.
2. **Run Evaluations**: Use the provided tools to perform statistical analyses and visualize the results.
3. **Download Results**: Export the analysis results for further examination and reporting.

We hope JobFair helps you in making more informed and fair employment decisions using language models.
"""
)

# Sidebar content
st.sidebar.title("Demos")

st.sidebar.subheader("Injection Demo")
st.sidebar.markdown(
    """
In this demo, you can upload a dataset of resumes and use our language models to process and score them based on various parameters.

- **Model Settings**: Configure your model settings by selecting the type of agent (GPTAgent or AzureAgent), and specifying the API key, endpoint URL, model name, temperature, and max tokens.
- **Data Upload**: Choose to upload your own CSV file or use an example dataset.
- **Process Data**: Enter the relevant details such as occupation, group name, privilege label, and protect label. Specify the number of runs and process the data to get the model's scores.
- **Download Results**: After processing, download the generated results as a CSV file.
"""
)

st.sidebar.subheader("Evaluation Demo")
st.sidebar.markdown(
    """
In this demo, you can evaluate the fairness of the scores generated by the language models.

- **Upload Results**: Upload the CSV file containing the processed results from the injection demo.
- **Statistical Tests**: Perform a variety of statistical tests to evaluate potential biases in the scores.
- **Correlations and Divergences**: Calculate correlations and divergences to further analyze the fairness of the results.
- **Download Evaluation**: Download the comprehensive evaluation results for further analysis.
"""
)