Hyperparameters:

  • learning rate: 2e-5
  • weight decay: 0.01
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 1
  • eval steps: 50000
  • max_length: 512
  • num_epochs: 1
  • hidden_dropout_prob: 0.3
  • attention_probs_dropout_prob: 0.25
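
As a rough illustration of what these settings imply, the sketch below collects them in a plain dictionary and derives the effective batch size and the number of training examples consumed between two evaluations. The dictionary layout, the helper names, and the single-device setting are assumptions made for this example; the number of devices used for training is not stated here.

```python
# Hyperparameters copied from the list above; the dict layout is illustrative only.
hparams = {
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "gradient_accumulation_steps": 1,
    "eval_steps": 50_000,
    "max_length": 512,
    "num_epochs": 1,
    "hidden_dropout_prob": 0.3,
    "attention_probs_dropout_prob": 0.25,
}

# Assumption: the device count is not given in the model card.
NUM_DEVICES = 1


def effective_batch_size(h, num_devices):
    """Examples consumed per optimizer step across all devices."""
    return (
        h["per_device_train_batch_size"]
        * h["gradient_accumulation_steps"]
        * num_devices
    )


def examples_seen(h, num_devices, steps):
    """Examples consumed after a given number of optimizer steps."""
    return effective_batch_size(h, num_devices) * steps


print(effective_batch_size(hparams, NUM_DEVICES))
print(examples_seen(hparams, NUM_DEVICES, hparams["eval_steps"]))
```

With more than one device, both figures scale linearly with `NUM_DEVICES`.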

Dataset version:

  • taskydata/tasky_or_not/v_1

Checkpoint:

  • 455000 steps.

Results on Validation set:

Step     Training Loss   Validation Loss   Accuracy   Precision   Recall   F1
50000    0.0148          0.10890           0.9798     0.9755      0.9843   0.9799
100000   0.0121          0.09090           0.9863     0.9958      0.9767   0.9862
150000   0.0080          0.11800           0.9863     0.9779      0.9950   0.9864
200000   0.0116          0.08965           0.9877     0.9905      0.9848   0.9876
250000   0.0073          3.50100           0.6507     0.5905      0.9830   0.7378
300000   0.0072          0.09807           0.9850     0.9863      0.9870   0.9849
350000   0.0053          0.09830           0.9854     0.9939      0.9870   0.9852
400000   0.0046          0.08130           0.9893     0.9957      0.9828   0.9892
450000   0.0054          0.61280           0.9095     0.5835      0.9888   0.9162
455000   0.0055          0.15790           0.9710     0.9561      0.9874   0.9715
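
The validation results above can be scanned programmatically to compare checkpoints. The sketch below is one possible way to do this, not the procedure the authors used: it picks the step with the lowest validation loss and the step with the highest F1, and flags evaluations whose validation loss spikes. The `EvalRow` type and the `SPIKE_THRESHOLD` value are hypothetical choices made for this example; the numbers are copied from the table unchanged.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    train_loss: float
    val_loss: float
    accuracy: float
    precision: float
    recall: float
    f1: float


# Values copied verbatim from the validation table.
rows = [
    EvalRow(50000, 0.0148, 0.10890, 0.9798, 0.9755, 0.9843, 0.9799),
    EvalRow(100000, 0.0121, 0.09090, 0.9863, 0.9958, 0.9767, 0.9862),
    EvalRow(150000, 0.0080, 0.11800, 0.9863, 0.9779, 0.9950, 0.9864),
    EvalRow(200000, 0.0116, 0.08965, 0.9877, 0.9905, 0.9848, 0.9876),
    EvalRow(250000, 0.0073, 3.50100, 0.6507, 0.5905, 0.9830, 0.7378),
    EvalRow(300000, 0.0072, 0.09807, 0.9850, 0.9863, 0.9870, 0.9849),
    EvalRow(350000, 0.0053, 0.09830, 0.9854, 0.9939, 0.9870, 0.9852),
    EvalRow(400000, 0.0046, 0.08130, 0.9893, 0.9957, 0.9828, 0.9892),
    EvalRow(450000, 0.0054, 0.61280, 0.9095, 0.5835, 0.9888, 0.9162),
    EvalRow(455000, 0.0055, 0.15790, 0.9710, 0.9561, 0.9874, 0.9715),
]

# Hypothetical cut-off for calling a validation loss a spike.
SPIKE_THRESHOLD = 0.5

best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_f1 = max(rows, key=lambda r: r.f1)
unstable_steps = [r.step for r in rows if r.val_loss > SPIKE_THRESHOLD]

print("lowest validation loss at step", best_by_loss.step)
print("highest F1 at step", best_by_f1.step)
print("validation loss spikes at steps", unstable_steps)
```

Both criteria select step 400000 on these numbers, and the threshold flags the evaluations at steps 250000 and 450000.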

Uploaded Checkpoint:

  • 400000

Dataset used to train taskydata/deberta-v3-base_v_1