SaraPiscitelli/roberta-base-qa-v1

Question Answering


SaraPiscitelli committed on Jan 6, 2024

Commit 30575fe · 1 Parent(s): 91e8d4a

Update README.md

Files changed (1)

README.md +4 -5


@@ -151,21 +151,20 @@ This was necessary due to the maximum input token limit accepted by the RoBERTa-
 - **Training regime:** fp32
 - **base_model_name_or_path:** roberta-base
 - **max_tokens_length:** 512
-- **weighted_loss** true
 - **training_arguments:** TrainingArguments(
     output_dir=results_dir,
     num_train_epochs=5,
     per_device_train_batch_size=8,
     per_device_eval_batch_size=8,
     gradient_accumulation_steps=1,
-    learning_rate=0.0001,
+    learning_rate=0.00001,
     lr_scheduler_type="linear",
     optim="adamw_torch",
     eval_accumulation_steps=1,
     evaluation_strategy="steps",
-    eval_steps=0.01,
+    eval_steps=0.2,
     save_strategy="steps",
-    save_steps=0.01,
+    save_steps=0.2,
     logging_strategy="steps",
     logging_steps=1,
     report_to="tensorboard",
@@ -173,7 +172,7 @@ This was necessary due to the maximum input token limit accepted by the RoBERTa-
     do_eval=True,
     max_grad_norm=0.3,
     warmup_ratio=0.03,
-    group_by_length=True,
+    #group_by_length=True,
     dataloader_drop_last=False,
     fp16=False,
     bf16=False
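
A rough sketch of what the changed `eval_steps` and `save_steps` values could mean in absolute steps. It assumes that a float below 1 is read as a fraction of the total training steps and rounded up, which is how recent `transformers` releases appear to behave; that assumption should be checked against the installed version. The training-set size is not stated in this commit, so `NUM_TRAIN_EXAMPLES` is a hypothetical placeholder, and the helper names are illustrative rather than part of any library.

```python
import math


def total_training_steps(num_examples, batch_size, grad_accum, epochs):
    """Optimizer updates over the whole run, assuming a single device."""
    # dataloader_drop_last=False, so the last partial batch still counts.
    batches_per_epoch = math.ceil(num_examples / batch_size)
    updates_per_epoch = max(batches_per_epoch // grad_accum, 1)
    return updates_per_epoch * epochs


def resolve_ratio_steps(value, max_steps):
    """Map a step setting to an absolute interval.

    Assumption: values below 1 are a fraction of max_steps, rounded up;
    values of 1 or more are already absolute step counts.
    """
    if value < 1:
        return math.ceil(max_steps * value)
    return int(value)


NUM_TRAIN_EXAMPLES = 10_000  # hypothetical; not given in the commit

# Values taken from the TrainingArguments shown in the diff.
max_steps = total_training_steps(
    NUM_TRAIN_EXAMPLES, batch_size=8, grad_accum=1, epochs=5
)
old_eval_every = resolve_ratio_steps(0.01, max_steps)  # before this commit
new_eval_every = resolve_ratio_steps(0.2, max_steps)   # after this commit
warmup_steps = math.ceil(max_steps * 0.03)             # warmup_ratio=0.03

print(f"total steps:         {max_steps}")
print(f"old eval/save every: {old_eval_every} steps")
print(f"new eval/save every: {new_eval_every} steps")
print(f"warmup steps:        {warmup_steps}")
```

Under these assumptions, moving from 0.01 to 0.2 cuts the number of evaluation and checkpoint events per run from roughly one hundred to five, which reduces evaluation overhead and the number of checkpoints written.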