Edit model card

Model Overview

ELECTRA model is a pretraining approach for language models published by Google. Two transformer models are trained, a generator and a discriminator. The generator replaces tokens in a sequence and is trained as a masked language model. The discriminator is trained to discern what tokens have been replaced. This method of pretraining is more efficient than comparable methods like masked language modeling, especially for small models.

Weights are released under the MIT License. Keras model code is released under the Apache 2 License.

Links

Installation

Keras and KerasHub can be installed with:

pip install -U -q keras-hub
pip install -U -q keras>=3

Jax, TensorFlow, and Torch come preinstalled in Kaggle Notebooks. For instruction on installing them in another environment see the Keras Getting Started page.

Presets

The following model checkpoints are provided by the Keras team. Full code examples for each are available below.

Preset name Parameters Description
electra_small_discriminator_uncased_en 13.55M 12-layer small ELECTRA discriminator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus.
electra_small_generator_uncased_en 13.55M 12-layer small ELECTRA generator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus.
electra_base_discriminator_uncased_en 109.48M 12-layer base ELECTRA discriminator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus.
electra_base_generator_uncased_en 33.58M 12-layer base ELECTRA generator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus.
electra_large_discriminator_uncased_en 335.14M 24-layer large ELECTRA discriminator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus.
electra_large_generator_uncased_en 51.07M 24-layer large ELECTRA generator model. All inputs are lowercased. Trained on English Wikipedia + BooksCorpus.
Downloads last month
23
Inference Examples
Inference API (serverless) does not yet support keras-hub models for this pipeline type.

Collection including keras/electra_large_discriminator_uncased_en