metadata

title: Huggingface Multi Inference Rank Eval
emoji: 🤔
colorFrom: yellow
colorTo: purple
sdk: streamlit
sdk_version: 1.10.0
app_file: ./app/main_page.py
pinned: false
license: cc

huggingface_multi_inference_rank_eval

This app lets the users play around with Huggingface models on a prommpted multiple choice QA inference.

Recenet researches like GPT-3, FLAN, and T0 showed a promising direction in zero-shot generalization via prompting. Rathern than pretraining on a huge dataset and finetuning a model on downstream task, we can directly ask a question to a model by prompting a model on a task!

This app lets users interact with various models hosted on Huggingface models and ask a multiple choice question. Models rank the choices with their log probabilities and pick the choice with highest log probability. We currently supoort CausalLM and Seq2SeqLM.

e.g.

Question: Huggingface is awesome. True or False?
Answer choices: True, False
Prediction: True

You can access the hosted version.

Setup

Clone the repository, install the required packages and run:

streamlit run ./app/main_page.py

Reference

A chunk of codes used for this projects is taken and/or insipred from the following works and their related repository:

@inproceedings{sanh2022multitask,
    title={Multitask Prompted Training Enables Zero-Shot Task Generalization},
    author={Victor Sanh and Albert Webson and Colin Raffel and Stephen Bach and Lintang Sutawika and Zaid Alyafeai and Antoine Chaffin and Arnaud Stiegler and Arun Raja and Manan Dey and M Saiful Bari and Canwen Xu and Urmish Thakker and Shanya Sharma Sharma and Eliza Szczechla and Taewoon Kim and Gunjan Chhablani and Nihal Nayak and Debajyoti Datta and Jonathan Chang and Mike Tian-Jian Jiang and Han Wang and Matteo Manica and Sheng Shen and Zheng Xin Yong and Harshit Pandey and Rachel Bawden and Thomas Wang and Trishala Neeraj and Jos Rozen and Abheesht Sharma and Andrea Santilli and Thibault Fevry and Jason Alan Fries and Ryan Teehan and Teven Le Scao and Stella Biderman and Leo Gao and Thomas Wolf and Alexander M Rush},
    booktitle={International Conference on Learning Representations},
    year={2022}
    url={https://openreview.net/forum?id=9Vrb9D0WI4}

@software{eval-harness,
  author       = {Gao, Leo and
                  Tow, Jonathan and
                  Biderman, Stella and
                  Black, Sid and
                  DiPofi, Anthony and
                  Foster, Charles and
                  Golding, Laurence and
                  Hsu, Jeffrey and
                  McDonell, Kyle and
                  Muennighoff, Niklas and
                  Phang, Jason and
                  Reynolds, Laria and
                  Tang, Eric and
                  Thite, Anish and
                  Wang, Ben and
                  Wang, Kevin and
                  Zou, Andy},
  title        = {A framework for few-shot language model evaluation},
  month        = sep,
  year         = 2021,
  publisher    = {Zenodo},
  version      = {v0.0.1},
  doi          = {10.5281/zenodo.5371628},
  url          = {https://doi.org/10.5281/zenodo.5371628}
}

For style,

https://fossheim.io/writing/posts/css-text-gradient/
https://css-tricks.com/css-hover-effects-background-masks-3d/