---
tags:
- generated_from_trainer
model-index:
- name: legal-xlm-roberta-base
  results: []
---

# legal-xlm-roberta-base

This model was trained from scratch on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5484

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: tpu
- num_devices: 8
- gradient_accumulation_steps: 4
- total_train_batch_size: 512
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- training_steps: 1000000

### Training results

| Training Loss | Epoch | Step    | Validation Loss |
|:-------------:|:-----:|:-------:|:---------------:|
| 1.2285        | 0.05  | 50000   | 0.9298          |
| 1.0417        | 0.1   | 100000  | 0.7723          |
| 0.9525        | 0.15  | 150000  | 0.7258          |
| 0.9668        | 0.2   | 200000  | 0.6884          |
| 0.8949        | 0.25  | 250000  | 0.6714          |
| 0.921         | 0.3   | 300000  | 0.6617          |
| 0.8324        | 0.35  | 350000  | 0.6423          |
| 0.8406        | 0.4   | 400000  | 0.6259          |
| 0.8136        | 0.45  | 450000  | 0.6147          |
| 0.8247        | 0.5   | 500000  | 0.6095          |
| 0.8649        | 0.55  | 550000  | 0.5985          |
| 0.8119        | 0.6   | 600000  | 0.5973          |
| 0.8422        | 0.65  | 650000  | 0.5813          |
| 0.8006        | 0.7   | 700000  | 0.5701          |
| 0.8072        | 0.75  | 750000  | 0.5662          |
| 0.8154        | 0.8   | 800000  | 0.5514          |
| 0.7794        | 0.85  | 850000  | 0.5562          |
| 0.7924        | 0.9   | 900000  | 0.5558          |
| 0.8207        | 0.95  | 950000  | 0.5587          |
| 0.8279        | 1.0   | 1000000 | 0.5484          |

### Framework versions

- Transformers 4.20.1
- PyTorch 1.12.0+cu102
- Datasets 2.8.0
- Tokenizers 0.12.0
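
### Hyperparameters as `TrainingArguments` (illustrative)

The hyperparameters listed above are reported by the trainer. As a reading aid, the sketch below shows one way they could be expressed with `transformers.TrainingArguments`; it is not the original training script, and the output path and TPU launch setup are placeholders. The effective batch sizes follow from the per-device values: 16 per device × 8 TPU cores × 4 gradient-accumulation steps = 512 for training, and 16 × 8 = 128 for evaluation.

```python
# Illustrative sketch only: one way to express the reported hyperparameters.
# NOT the original training script; output_dir is a placeholder, and the
# 8-core TPU setup would normally be handled by the launcher (e.g. xla_spawn).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="legal-xlm-roberta-base",  # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=16,       # 16 x 8 devices x 4 accumulation = 512 effective
    per_device_eval_batch_size=16,        # 16 x 8 devices = 128 effective
    gradient_accumulation_steps=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    max_steps=1_000_000,
)
```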
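
## Example usage (sketch)

The checkpoint name suggests an XLM-RoBERTa-base model trained with a masked-language-modelling objective, so it can presumably be loaded with the standard `fill-mask` pipeline. The snippet below is a minimal sketch: the repository id (`<namespace>/legal-xlm-roberta-base`) and the example sentence are placeholders, not taken from this card.

```python
# Minimal usage sketch, assuming the checkpoint is a masked-language model
# (as the XLM-RoBERTa naming suggests). Replace the placeholder repo id with
# the actual id on the Hub.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="<namespace>/legal-xlm-roberta-base")

# Use the tokenizer's own mask token rather than hard-coding "<mask>".
mask = fill_mask.tokenizer.mask_token
print(fill_mask(f"The court dismissed the {mask}."))
```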