How does this model compare to e.g. gpt2?

by julien-c HF staff - opened Sep 19, 2023

Discussion

julien-c

Sep 19, 2023

In terms of generation capabilities

Xenova

Owner Sep 19, 2023 • edited Sep 19, 2023

The MBZUAI team have a nice repo where they showcase and evaluate performance across a set of architectures:

As their project name ("LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions") suggests, they train these models on outputs from gpt-3.5-turbo (see the data section of the README).

The original (Python) model can be found here.
