How does this model compare to e.g. gpt2?

#1
by julien-c HF staff - opened

In terms of generation quality, that is.

The MBZUAI team have a nice repo where they showcase and evaluate performance across a set of architectures:

image.png

As their project name ("LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions") suggests, they train these models on outputs generated by gpt-3.5-turbo (see the data section of the README).

The original (Python) model can be found here.
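
If you'd rather eyeball the difference against gpt2 yourself, one rough way is to feed the same prompts to both models and read the outputs side by side. Below is a minimal sketch of that idea, not anything taken from the LaMini-LM repo: `compare_generations`, `fake_gpt2` and `fake_distilled` are hypothetical names, and the two stand-in generators are placeholders so the snippet runs offline. In practice you would presumably swap them for callables that wrap real models, for example a `transformers` text-generation pipeline for `gpt2`.

```python
from typing import Callable, Dict, List

# A "generator" here is any function that maps a prompt to generated text.
Generator = Callable[[str], str]


def compare_generations(
    prompts: List[str], models: Dict[str, Generator]
) -> List[Dict[str, str]]:
    """Run every prompt through every model and collect the outputs per prompt."""
    rows = []
    for prompt in prompts:
        row = {"prompt": prompt}
        for name, generate in models.items():
            row[name] = generate(prompt)
        rows.append(row)
    return rows


# Hypothetical stand-ins (assumptions, not real models): replace these with
# callables that wrap the actual models you want to compare.
def fake_gpt2(prompt: str) -> str:
    return prompt + " [gpt2 continuation placeholder]"


def fake_distilled(prompt: str) -> str:
    return "[distilled model answer placeholder for: " + prompt + "]"


if __name__ == "__main__":
    results = compare_generations(
        ["Explain what a distilled model is."],
        {"gpt2": fake_gpt2, "distilled": fake_distilled},
    )
    for row in results:
        for key, value in row.items():
            print(f"{key}: {value}")
        print("-" * 40)
```

Keeping the models behind plain callables means the same loop works whether the text comes from a local pipeline, a converted model, or a remote API. Keep in mind that gpt2 is a plain language model that continues text, while the LaMini models are instruction-tuned, so the prompts you pick will shape how fair the comparison looks.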
