# gbert-large-autopart
This model is a fine-tuned version of deepset/gbert-large on a dataset of German company website sentences (see the model description below). It achieves the following results on the evaluation set:
- Loss: 0.3832
## Model description
This model is a domain adaptation of deepset/gbert-large, trained on a dataset of 54,000 sample sentences drawn from the websites of the top 30 German DAX companies.
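As a quick check of the domain adaptation, the checkpoint can be queried through the standard fill-mask pipeline. This is a minimal sketch; the German example sentence is made up, and any text containing a `[MASK]` token will do.

```python
from transformers import pipeline

# Load the domain-adapted checkpoint for masked-token prediction.
fill_mask = pipeline("fill-mask", model="luciore95/gbert-large-autopart")

# Hypothetical example sentence; replace with your own German text.
print(fill_mask("Wir liefern Ersatzteile für die [MASK] unserer Kunden."))
```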
## Intended uses & limitations
The model is intended for use in classification problems where the samples come from German company websites.
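Below is a minimal sketch of how the checkpoint could serve as the backbone of such a classifier. The label count is a placeholder for your own task, and the classification head is newly initialised, so it still has to be fine-tuned on labelled data.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("luciore95/gbert-large-autopart")

# The encoder weights come from this checkpoint; the classification head
# is randomly initialised and must be trained on a labelled dataset.
model = AutoModelForSequenceClassification.from_pretrained(
    "luciore95/gbert-large-autopart",
    num_labels=3,  # hypothetical label count, adjust for your task
)
```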
## Training and evaluation data
80 percent of the available samples were used for training; evaluation was performed on the remaining 20 percent.
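The corpus itself is not published with this card, so the snippet below is only a sketch of how such an 80/20 split could be produced with the `datasets` library; the file name is hypothetical.

```python
from datasets import load_dataset

# Hypothetical text file with one sample sentence per line.
dataset = load_dataset("text", data_files={"train": "company_sentences.txt"})["train"]

# 80/20 split, using the seed reported in the hyperparameters below.
split = dataset.train_test_split(test_size=0.2, seed=42)
train_ds, eval_ds = split["train"], split["test"]
```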
## Training procedure
Masked language modelling using AutoModelForMaskedLM (see the sketch after the hyperparameter list below).
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 96
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 16
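A minimal training sketch consistent with these settings, assuming the tokenised 80/20 split from above. The masking probability (the collator's default of 15%) and the maximum sequence length are assumptions, as the card does not report them.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("deepset/gbert-large")
model = AutoModelForMaskedLM.from_pretrained("deepset/gbert-large")

# Tokenise the raw sentences (train_ds / eval_ds as in the split sketch above).
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)  # assumed max length

train_tok = train_ds.map(tokenize, batched=True, remove_columns=["text"])
eval_tok = eval_ds.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking for the MLM objective (default 15% masking probability).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True)

args = TrainingArguments(
    output_dir="gbert-large-autopart",
    learning_rate=5e-5,
    per_device_train_batch_size=96,
    per_device_eval_batch_size=8,
    num_train_epochs=16,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_tok,
    eval_dataset=eval_tok,
    data_collator=collator,
)
trainer.train()
```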
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.5949        | 1.0   | 445  | 0.5025          |
| 0.5211        | 2.0   | 890  | 0.4729          |
| 0.5036        | 3.0   | 1335 | 0.4893          |
| 0.4916        | 4.0   | 1780 | 0.4647          |
| 0.4464        | 5.0   | 2225 | 0.4401          |
| 0.425         | 6.0   | 2670 | 0.4246          |
| 0.4076        | 7.0   | 3115 | 0.4169          |
| 0.3962        | 8.0   | 3560 | 0.4140          |
| 0.3829        | 9.0   | 4005 | 0.4220          |
| 0.3702        | 10.0  | 4450 | 0.4119          |
| 0.3566        | 11.0  | 4895 | 0.3993          |
| 0.3442        | 12.0  | 5340 | 0.3924          |
| 0.3365        | 13.0  | 5785 | 0.3880          |
| 0.3316        | 14.0  | 6230 | 0.3900          |
| 0.3213        | 15.0  | 6675 | 0.3800          |
| 0.316         | 16.0  | 7120 | 0.3832          |
### Framework versions
- Transformers 4.32.1
- Pytorch 2.0.1+cu118
- Datasets 2.14.4
- Tokenizers 0.13.3