This model is pre-trained on blog articles from AWS Blogs.
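
The model can be queried as a fill-mask pipeline with the `<mask>` token. Below is a minimal usage sketch; the Hub repository id is a placeholder, since the card does not state it.

```python
from transformers import pipeline

# Placeholder repository id: replace with this model's actual Hub path.
fill_mask = pipeline("fill-mask", model="<this-model-repo>")

# The mask token for this model is <mask>.
print(fill_mask("Amazon <mask> provides scalable object storage."))
```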

Pre-training corpora

The input text comprises around 3,000 blog articles from the AWS Blogs website, covering technical subject matter including AWS products, tools, and tutorials.

Pre-training details

I picked a RoBERTa architecture for masked language modeling (6 layers, 768 hidden dimensions, 12 attention heads, 82M parameters) together with its corresponding byte-level BPE tokenization strategy, and followed HuggingFace's Transformers blog post on training a language model from scratch. The training set-up was 28k training steps with batches of 64 sequences of length 512 and an initial learning rate of 5e-5. Over 10 epochs, the model achieved a training loss of 3.6 on the MLM task.
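
Below is a minimal sketch of that set-up using the Transformers `Trainer` API, in the spirit of the blog post. The corpus file path, tokenizer vocabulary size, and masking probability are assumptions not stated in this card.

```python
from tokenizers import ByteLevelBPETokenizer
from transformers import (
    RobertaConfig,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# Train a byte-level BPE tokenizer on the raw blog text
# (file path and vocab size are hypothetical).
bpe = ByteLevelBPETokenizer()
bpe.train(
    files=["aws_blogs.txt"],
    vocab_size=52_000,
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)
bpe.save_model("aws-blog-roberta")

tokenizer = RobertaTokenizerFast.from_pretrained("aws-blog-roberta", max_len=512)

# 6-layer, 768-hidden, 12-head RoBERTa (~82M parameters).
config = RobertaConfig(
    vocab_size=52_000,
    max_position_embeddings=514,
    num_hidden_layers=6,
    hidden_size=768,
    num_attention_heads=12,
)
model = RobertaForMaskedLM(config=config)

# Tokenize the corpus into sequences of up to 512 tokens.
dataset = load_dataset("text", data_files={"train": "aws_blogs.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# Dynamic masking for the MLM objective (15% masking is an assumption).
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

training_args = TrainingArguments(
    output_dir="aws-blog-roberta",
    per_device_train_batch_size=64,
    learning_rate=5e-5,
    num_train_epochs=10,  # roughly 28k training steps in total
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
)
trainer.train()
```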