---
license: mit
---

# ESM-2 QLoRA for Binding Sites Prediction

In this model we added more QLoRA adapter layers, so that all of the weight matrices are adapted with QLoRA.
The gaps between the train and test metrics are again smaller for this model than for the model with fewer adapter layers
(adapters on the query, key, and value matrices only). This suggests that adapting more of the weight matrices in this larger ESM-2 model
reduces overfitting and acts as a better regularizer.

For comparison, see [this model](https://huggingface.co/AmelieSchreiber/esm2_t12_35M_qlora_binding_sites_v0), which only has QLoRA adapters on
the query, key, and value matrices.

## Testing for Overfitting

```python
Train metrics:
{'eval_loss': 0.17861589789390564,
'eval_accuracy': 0.9336392007583741,
'eval_precision': 0.24007189695313816,
'eval_recall': 0.9234520216135872,
'eval_f1': 0.38107489676203077,
'eval_auc': 0.9286608447868842,
'eval_mcc': 0.4519203165484902}

Test metrics:
{'eval_loss': 0.2265990674495697,
'eval_accuracy': 0.913988661430497,
'eval_precision': 0.1725452162312655,
'eval_recall': 0.8272126203209694,
'eval_f1': 0.28553230637278637,
'eval_auc': 0.8715212375759034,
'eval_mcc': 0.3539008454498742}
```
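
To make the train/test comparison concrete, the following is a minimal sketch that computes the per-metric gap (train minus test) from the numbers reported above. The variable names and the `metric_gaps` helper are illustrative assumptions and are not part of the original training code.

```python
# Metrics copied from the train/test evaluation reported in this model card.
train_metrics = {
    "eval_loss": 0.17861589789390564,
    "eval_accuracy": 0.9336392007583741,
    "eval_precision": 0.24007189695313816,
    "eval_recall": 0.9234520216135872,
    "eval_f1": 0.38107489676203077,
    "eval_auc": 0.9286608447868842,
    "eval_mcc": 0.4519203165484902,
}
test_metrics = {
    "eval_loss": 0.2265990674495697,
    "eval_accuracy": 0.913988661430497,
    "eval_precision": 0.1725452162312655,
    "eval_recall": 0.8272126203209694,
    "eval_f1": 0.28553230637278637,
    "eval_auc": 0.8715212375759034,
    "eval_mcc": 0.3539008454498742,
}


def metric_gaps(train, test):
    """Return train minus test for every metric (hypothetical helper)."""
    return {name: train[name] - test[name] for name in train}


gaps = metric_gaps(train_metrics, test_metrics)
for name, gap in gaps.items():
    print(f"{name:>15}: train={train_metrics[name]:.4f} "
          f"test={test_metrics[name]:.4f} gap={gap:+.4f}")
```

A positive gap on the score metrics (accuracy, precision, recall, F1, AUC, MCC) and a negative gap on the loss both mean the model does better on the training split. Running the same computation on the metrics of the query/key/value-only model linked above would give the side-by-side comparison described in the introduction.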