Zero-shot comparison with InstructGPT-3?

#19
by nishanthcmesh - opened

Really impressed with this model and the open source effort here. I can see that the paper has some zero-shot metrics on held-out tasks. Is there a comparison to OpenAI's InstructGPT-3 available anywhere? It would be a game changer for the AI community if this model performs comparably to it.

BigScience Workshop org

The only dataset I found that both InstructGPT and BLOOMZ are evaluated on is RTE. BLOOMZ looks better zero-shot on RTE (BLOOMZ: >80, InstructGPT: ~70; see screenshots), though a single data point may not be very meaningful.

Table 14 InstructGPT:
Screenshot 2022-11-09 at 20.36.19.png

Table 7 BLOOMZ & fam; RTE highlighted in blue:
Screenshot 2022-11-09 at 20.38.14.png

Looks like that's the only data point there is then. Thanks for going through the tables and finding that out!

cakiki changed discussion status to closed
