# Automatic Model Cards for Large Language Models
_Blair Yang, Scott Cui, Silviu Pitis, Michael R Zhang, Keiran Paster, Pashootan Vaezipoor, Sheila McIlraith, Jimmy Ba_
Welcome to the _"guessing game"_ evaluation for the paper _Automatic Model Cards for Large Language Models_. This interactive platform lets you assess how well our LLM-written model cards predict a model's behavior.
To use the system, follow these steps:
- **Select a Dataset and Topic**: Choose from the available list to set the context for your question.
- **Review the Evaluation Card**: Read the card describing the LLM's capabilities on your chosen topic.
- **Evaluate the Question**: Using the information in the Evaluation Card, decide whether you believe the LLM can answer the displayed question correctly.
- **Make Your Prediction**: Indicate your guess ('Correct' or 'Incorrect') and click "Submit".
- **Optional Explanation**: You may provide reasoning for your guess, but it is not required.
- **Check Ground Truth**: After you submit, the ground truth (whether the LLM actually answered correctly) is shown so you can compare it with your guess.
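
The steps above can be sketched as a small scoring routine. This is only an illustration, not the code behind this Space: `GuessRecord`, `guess_matches`, and `accuracy` are hypothetical names, and the dataset and topic strings are placeholders. It assumes each round stores the user's guess alongside the ground truth and that a card's usefulness is summarized as the fraction of guesses that match.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GuessRecord:
    """One round of the guessing game (hypothetical structure)."""
    dataset: str
    topic: str
    guess_correct: bool            # user's prediction: True means 'Correct'
    model_was_correct: bool        # ground truth revealed after submitting
    explanation: Optional[str] = None  # optional free-text reasoning

    @property
    def guess_matches(self) -> bool:
        # The guess is right when it agrees with the ground truth.
        return self.guess_correct == self.model_was_correct


def accuracy(records: List[GuessRecord]) -> float:
    """Fraction of rounds where the guess matched the ground truth."""
    if not records:
        return 0.0
    return sum(r.guess_matches for r in records) / len(records)


# Placeholder rounds: one matching guess, one mismatch.
records = [
    GuessRecord("example_dataset", "example_topic", True, True),
    GuessRecord("example_dataset", "example_topic", False, True,
                explanation="The card lists this topic as a weakness."),
]
print(accuracy(records))
```

A higher fraction of matching guesses would suggest that the Evaluation Card gives readers a usable picture of what the LLM can and cannot do on that topic.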