lsmille
/

lora_evo_ta_all_layers_17

Generated from Trainer


lsmille committed on May 31, 2024

Commit 2772d1f · verified · 1 Parent(s): e11479d

Update README.md

Files changed (1)

README.md +23 -1

README.md CHANGED

@@ -20,7 +20,29 @@ It achieves the following results on the evaluation set:
 ## Model description
-More information needed
+Trained on the single ID token 5K dataset, filtered to 10k sequences (30% held out as test data = 3,000 sequences)
+lora_alpha = 64 <--------------
+lora_dropout = 0.1
+lora_r = 128
+epochs = 3
+learning_rate = 3e-4
+warmup_steps = 500
+gradient_accumulation_steps = 1
+train_batch = 2
+eval_batch = 2
+ALL Linear layers
+Changed ' token to >     <--------------
 ## Intended uses & limitations
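
The hyperparameters added in this diff imply a few derived quantities (split sizes, optimizer steps, LoRA scaling) that the README does not spell out. The sketch below works them out with the standard library only. The `RunConfig` and `derive` names are hypothetical, `num_devices = 1` is an assumption the README does not state, and the scaling factor assumes the conventional LoRA definition `lora_alpha / lora_r` rather than a rank-stabilized variant.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container; values are copied from the README diff."""
    lora_alpha: int = 64
    lora_r: int = 128
    lora_dropout: float = 0.1
    epochs: int = 3
    learning_rate: float = 3e-4
    warmup_steps: int = 500
    gradient_accumulation_steps: int = 1
    train_batch: int = 2
    eval_batch: int = 2
    total_sequences: int = 10_000
    test_fraction: float = 0.30
    # Assumption: single-device training (not stated in the README).
    num_devices: int = 1


def derive(cfg: RunConfig) -> dict:
    """Compute quantities implied by the listed hyperparameters."""
    test_size = round(cfg.total_sequences * cfg.test_fraction)
    train_size = cfg.total_sequences - test_size
    # Sequences consumed per optimizer step.
    effective_batch = (
        cfg.train_batch * cfg.gradient_accumulation_steps * cfg.num_devices
    )
    steps_per_epoch = math.ceil(train_size / effective_batch)
    total_steps = steps_per_epoch * cfg.epochs
    return {
        "test_size": test_size,
        "train_size": train_size,
        "effective_batch": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        # Conventional LoRA scaling applied to the low-rank update.
        "lora_scaling": cfg.lora_alpha / cfg.lora_r,
        # Share of all optimizer steps spent in learning-rate warmup.
        "warmup_fraction": cfg.warmup_steps / total_steps,
    }


if __name__ == "__main__":
    for name, value in derive(RunConfig()).items():
        print(f"{name}: {value}")
```

Under these assumptions the run uses 7,000 training and 3,000 test sequences, takes 3,500 optimizer steps per epoch (10,500 in total), spends just under 5% of those steps in warmup, and applies a LoRA scaling factor of 0.5.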