GPT2 Catalan small model Version 2 (Uncased)

Prerequisites

transformers==4.19.2

Model architecture

This model uses the GPT2 base model settings, except that the embedding dimension is half that of the base model.

Tokenizer

The model uses a BPE tokenizer with a vocabulary size of 50,000.
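
To give a rough sense of what halving the embedding dimension means for model size, here is a minimal sketch that estimates the parameter count of a GPT-2 style decoder. It assumes the standard GPT2 base settings (12 layers, 1024 positions, embedding dimension 768), an embedding dimension of 384 for this model (half of 768), tied input/output embeddings, and a vocabulary of exactly 50,000. None of these numbers besides the vocabulary size are stated in this card, so treat the result as an estimate rather than the model's actual size. The helper `gpt2_param_count` is hypothetical and not part of any library.

```python
def gpt2_param_count(vocab_size, n_embd, n_layer=12, n_positions=1024):
    """Estimate parameters of a GPT-2 style decoder with tied embeddings."""
    token_emb = vocab_size * n_embd      # token embedding matrix
    pos_emb = n_positions * n_embd       # learned position embeddings
    # Per block: two LayerNorms (4d), attention (4d^2 + 4d), MLP (8d^2 + 5d)
    per_block = 12 * n_embd ** 2 + 13 * n_embd
    final_ln = 2 * n_embd                # final LayerNorm weight and bias
    return token_emb + pos_emb + n_layer * per_block + final_ln


# Reference point: the original GPT2 base configuration.
reference = gpt2_param_count(vocab_size=50257, n_embd=768)

# Assumed configuration for this model: half the embedding size, 50,000 tokens.
assumed = gpt2_param_count(vocab_size=50000, n_embd=384)

print(f"GPT2 base:          {reference:,} parameters")
print(f"Assumed this model: {assumed:,} parameters")
```

Because the per-block cost grows with the square of the embedding dimension, halving it shrinks the transformer blocks to roughly a quarter of their size, while the embedding tables only shrink by half.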

Training Data

wiki40b/ca (Catalan Wikipedia)
Subset of oscar
Subset of CC-100/ca : Monolingual Datasets from Web Crawl Data

Usage

from transformers import pipeline

# GPT-2 is a causal language model, so it runs under the 'text-generation' task
generator = pipeline('text-generation', model='ClassCat/gpt2-small-catalan-v2')
generator("Ell està una mica")
