---
tags:
- generated_from_trainer
model-index:
- name: BERT_word2vec
  results: []
---

# BERT_word2vec

This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 2.2223

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 256
- eval_batch_size: 128
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 2048
- total_eval_batch_size: 1024
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20

### Training results

| Training Loss | Epoch | Step   | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 3.4036        | 1.0   | 43283  | 3.2746          |
| 3.0571        | 2.0   | 86566  | 2.9331          |
| 2.8967        | 3.0   | 129849 | 2.7814          |
| 2.7919        | 4.0   | 173132 | 2.6885          |
| 2.7072        | 5.0   | 216415 | 2.6176          |
| 2.6512        | 6.0   | 259698 | 2.5633          |
| 2.6091        | 7.0   | 302981 | 2.5193          |
| 2.5596        | 8.0   | 346264 | 2.4826          |
| 2.5291        | 9.0   | 389547 | 2.4491          |
| 2.4972        | 10.0  | 432830 | 2.4219          |
| 2.4697        | 11.0  | 476113 | 2.3943          |
| 2.4311        | 12.0  | 519396 | 2.3714          |
| 2.4199        | 13.0  | 562679 | 2.3438          |
| 2.3847        | 14.0  | 605962 | 2.3223          |
| 2.3508        | 15.0  | 649245 | 2.3042          |
| 2.3333        | 16.0  | 692528 | 2.2818          |
| 2.3113        | 17.0  | 735811 | 2.2633          |
| 2.281         | 18.0  | 779094 | 2.2447          |
| 2.2749        | 19.0  | 822377 | 2.2316          |
| 2.2541        | 20.0  | 865660 | 2.2223          |

### Framework versions

- Transformers 4.39.3
- Pytorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
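
### Derived quantities (illustrative sketch)

The hyperparameters and results table above imply a few derived numbers that may help when reading this card. The sketch below recomputes them with the standard library only. It rests on assumptions the card does not confirm: no gradient accumulation, no warmup (none is listed), and an evaluation loss that is a mean cross-entropy in nats, which is what makes the perplexity figure meaningful. The helper names `total_batch_size` and `linear_lr` are hypothetical and are not part of the training code.

```python
import math

# Values copied from the hyperparameter list and results table in this card.
TRAIN_BATCH_SIZE = 256    # per-device train batch size
NUM_DEVICES = 8
LEARNING_RATE = 1e-4
NUM_EPOCHS = 20
STEPS_PER_EPOCH = 43283   # optimizer step count reported at epoch 1.0
FINAL_EVAL_LOSS = 2.2223


def total_batch_size(per_device, num_devices, grad_accum=1):
    """Effective batch size per optimizer step (grad_accum=1 is an assumption)."""
    return per_device * num_devices * grad_accum


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear decay to zero with optional linear warmup.

    warmup_steps=0 is an assumption: the card lists no warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
effective_batch = total_batch_size(TRAIN_BATCH_SIZE, NUM_DEVICES)

# Upper bound on examples seen per epoch; the last batch may be partial.
approx_examples = STEPS_PER_EPOCH * effective_batch

# Only meaningful if the loss is a mean cross-entropy in nats (assumption).
perplexity = math.exp(FINAL_EVAL_LOSS)

print("total optimizer steps:", total_steps)
print("effective train batch size:", effective_batch)
print("examples per epoch (upper bound):", approx_examples)
print("learning rate at the midpoint:", linear_lr(total_steps // 2, total_steps, LEARNING_RATE))
print("perplexity implied by final eval loss:", round(perplexity, 2))
```

The step total and effective batch size agree with the `Step` column and `total_train_batch_size` reported above, which suggests the no-accumulation assumption is consistent with the card. The examples-per-epoch figure is only an upper bound, since the dataset itself is not documented.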