francislabounty committed on
Commit
1ca95a8
1 Parent(s): 66d34a3

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -12,7 +12,7 @@ language:
  - Effective batch size: 128
  - Learning Rate: 2e-5 with linear decay
  - Epochs: 1
- - Base model trained with QLoRA (rank 64, alpha 16) and MoE adapters/routers trained in bf16
+ - [Base model](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) trained with QLoRA (rank 64, alpha 16) and MoE adapters/routers trained in bf16
  - Num Experts: 16
  - Top K: 4
 
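
For readers unfamiliar with the hyperparameters in this hunk, here is a minimal sketch of what "Num Experts: 16" and "Top K: 4" typically mean for routing, and of the scaling factor usually implied by "rank 64, alpha 16". It is not the model's actual code. The function name `top_k_route` is hypothetical, and two things are assumptions the README does not confirm: the softmax over only the selected logits, and the conventional `alpha / rank` LoRA scaling.

```python
import math

# Values taken from the README lines in the diff.
NUM_EXPERTS = 16
TOP_K = 4
LORA_RANK = 64
LORA_ALPHA = 16

# Conventional LoRA scaling of the low-rank update (assumed, not stated in the README).
LORA_SCALE = LORA_ALPHA / LORA_RANK


def top_k_route(router_logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Assumption: weights are a softmax over the selected logits only,
    so the k weights sum to 1.
    """
    if len(router_logits) < k:
        raise ValueError("need at least k router logits")
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected experts.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]
```

With one logit per expert, `top_k_route` keeps 4 of the 16 experts for a token and mixes their outputs using the returned weights. Real implementations do this per token on batched tensors.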