AmelieSchreiber committed on
Commit 287b523 · 1 Parent(s): 7e9a955

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -8,7 +8,9 @@ In this model we added in more QLoRA adapter layers, modifying all of the weight
  train and test metrics, again, are smaller for this model than for the model with fewer adapter layers (only using query, key, and value
  matrices). So, we see that adapting more of the weight matrices in this larger ESM-2 model decreases overfitting and serves as a better
  regularizer. For comparison, see [this model](https://huggingface.co/AmelieSchreiber/esm2_t12_35M_qlora_binding_sites_v0) which only
- has QLoRA adapters on the query, key, and value matrices.
+ has QLoRA adapters on the query, key, and value matrices. This model was trained on [this dataset](https://huggingface.co/datasets/AmelieSchreiber/1111K_binding_sites).
+ Note, this dataset is too small for this model, so overfitting is expected, but overfitting is clearly reduced by including more adapter
+ layers in the QLoRA.
 
  ## Testing for Overfitting
 
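For readers comparing the two setups described in the diff, the following is a minimal, hypothetical sketch of how the adapter targets differ when configured with Hugging Face `peft` and 4-bit quantization. The checkpoint name, `num_labels`, and the LoRA hyperparameters (`r`, `lora_alpha`, `lora_dropout`) are placeholders rather than the values used to train this model; the module names (`query`, `key`, `value`, `dense`) follow the ESM-2 implementation in `transformers`.

```python
from transformers import AutoModelForTokenClassification, BitsAndBytesConfig
from peft import LoraConfig, TaskType, get_peft_model

# Quantize the base model to 4-bit -- the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(load_in_4bit=True)

# Placeholder checkpoint and label count, for illustration only.
base = AutoModelForTokenClassification.from_pretrained(
    "facebook/esm2_t12_35M_UR50D",
    num_labels=2,  # binding site vs. non-binding site token labels
    quantization_config=bnb_config,
)

# v0-style configuration: adapters only on the attention query/key/value projections.
qkv_only = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "key", "value"],
)

# Configuration in the spirit of this model: adapters on more of the weight
# matrices, i.e. the q/k/v projections plus the dense layers in the attention
# output and the MLP blocks.
more_targets = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "key", "value", "dense"],
)

model = get_peft_model(base, more_targets)
model.print_trainable_parameters()
```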