Model: codegram/calbert-base-uncased

How to use this model directly from the 🤗/transformers library:

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("codegram/calbert-base-uncased")
model = AutoModel.from_pretrained("codegram/calbert-base-uncased")
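As a minimal sketch of what can be done once the tokenizer and model are loaded, the snippet below turns Catalan sentences into fixed-size vectors. The helper name `embed_sentences` and the masked mean-pooling step are illustrative assumptions, not something the model card prescribes. It also assumes that `torch`, `transformers`, and the `sentencepiece` package used by ALBERT tokenizers are installed.

```python
MODEL_ID = "codegram/calbert-base-uncased"


def embed_sentences(sentences, model_id=MODEL_ID):
    """Return one mean-pooled vector per sentence (hypothetical helper)."""
    # Imported inside the function so that merely importing this module
    # does not require torch or trigger a model download.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    # Pad to the longest sentence so the batch forms a rectangular tensor.
    inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (batch, tokens, hidden)

    # Assumption: average the token vectors, ignoring padding positions.
    mask = inputs["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    summed = (hidden * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1)
    return summed / counts
```

Calling `embed_sentences(["Bon dia a tothom!", "M'agrada llegir."])` should return a tensor with one row per sentence. Mean pooling is only one way to summarize the token vectors; other strategies may suit a given downstream task better.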

CALBERT: a Catalan Language Model


CALBERT is an open-source language model for Catalan based on the ALBERT architecture.

It is now available on Hugging Face in its base-uncased version, and was pretrained on the OSCAR dataset.

For further information or requests, please go to the GitHub repository.

Pre-trained models

| Model | Arch. | Training data |
|---|---|---|
| codegram/calbert-base-uncased | Base (uncased) | OSCAR (4.3 GB of text) |


CALBERT was trained and evaluated by Txus Bach, as part of Codegram's applied research.