base model: 'HooshvareLab/distilbert-fa-zwnj-base' - trained for 2 epochs with 256 as max_length - 3x faster than the Bert model despite having the same performance model performance on test: ``` {'eval_loss': 0.3385954797267914, 'eval_roc_auc': 0.9378028883850424, 'eval_f1_score': 0.8662723907586265, 'eval_recall': 0.8818815783774419, 'eval_percision': 0.8512061541034324, 'eval_runtime': 204.8524, 'eval_samples_per_second': 229.609, 'eval_steps_per_second': 7.176, 'epoch': 2.0} ```