The foundation of this model is the RoBERTa-style model deepset/gbert-large.
Following Gururangan et al. (2020), we gathered a corpus of narrative fiction and continued the model's pre-training on it.
Training ran for 10 epochs over 2.3 GB of text with a learning rate of 0.0001 (linearly decayed) and a batch size of 512.
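
A minimal sketch of this kind of continued masked-language-model pre-training with the Hugging Face `transformers` Trainer is shown below. The corpus file name (`fiction_corpus.txt`), output directory, per-device batch size, gradient-accumulation setting, and the default masking probability of 0.15 are illustrative assumptions, not the exact configuration used for this model.

```python
# Sketch: continue MLM pre-training of deepset/gbert-large on a domain corpus.
# Paths and batch settings are placeholders; the effective batch size of 512
# would be reached via per-device batch size x gradient accumulation (x devices).
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("deepset/gbert-large")
model = AutoModelForMaskedLM.from_pretrained("deepset/gbert-large")

# Hypothetical plain-text dump of the fiction corpus (one passage per line).
raw = load_dataset("text", data_files={"train": "fiction_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking with the standard 15% masking probability (library default).
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

args = TrainingArguments(
    output_dir="gbert-large-fiction",
    num_train_epochs=10,
    learning_rate=1e-4,
    lr_scheduler_type="linear",       # linear decay, as described above
    per_device_train_batch_size=16,   # placeholder value
    gradient_accumulation_steps=32,   # 16 x 32 = effective batch size of 512
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```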
