Large as in what number?
#1, opened by Delcos
Does this use the same naming scheme as DialoGPT? If not, what does "large" stand for?
Yeah actually they're both called "large" but they differ a bit in terms of hyperparameters.
BioGPT-large has 48 layers, and uses a hidden size of 1600. You can check the config for all details.
DialoGPT-large, on the other hand, has 36 layers and uses a hidden size of 1280. See its config for all details.
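To put a rough number on "large", here is a minimal sketch that estimates the size of the transformer blocks from the layer count and hidden size quoted above. It assumes a standard GPT-style block with a 4x feed-forward expansion (about 12 * hidden_size^2 weights per layer) and ignores embeddings, biases, and layer norms, so the real parameter counts are somewhat higher. The helper name `approx_block_params` is made up for this illustration and is not part of any library.

```python
def approx_block_params(num_layers: int, hidden_size: int) -> int:
    """Rough parameter count of the transformer blocks only.

    Assumption: each block has about 4*d^2 attention weights and
    8*d^2 feed-forward weights (4x expansion). Embeddings, biases,
    and layer norms are not counted.
    """
    return 12 * num_layers * hidden_size ** 2


# (layers, hidden size) as quoted in this thread
MODELS = {
    "BioGPT-large": (48, 1600),
    "DialoGPT-large": (36, 1280),
}

for name, (layers, hidden) in MODELS.items():
    billions = approx_block_params(layers, hidden) / 1e9
    print(f"{name}: ~{billions:.2f}B parameters in the blocks")
```

Under these assumptions the estimate comes out at about 1.47B for BioGPT-large and about 0.71B for DialoGPT-large, so the two "large" labels refer to models that differ in size by roughly a factor of two.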
Awesome thanks :) .
Delcos changed discussion status to closed