Cyrile committed on
Commit
a26eab8
1 Parent(s): f140f84

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -22,7 +22,7 @@ The training for the distilled model (student model) is designed to be the close
 
  The final loss function is a combination of these three losses functions. We use the following ponderation:
 
- $$Loss = 0.5 \times DistilLoss + 0.3 \times CosineLoss$$ + 0.2 \times MLMLoss
+ $$Loss = 0.5 \times DistilLoss + 0.3 \times CosineLoss + 0.2 \times MLMLoss$$
 
  Dataset
  -------
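
For reference, the weighting in the corrected formula can be sketched in a few lines of Python. This is only an illustration and is not taken from the repository: the function name `combined_loss` and its parameters are hypothetical, and the three inputs are assumed to be already-computed loss values (plain floats here, though any type supporting `*` and `+` would work the same way).

```python
def combined_loss(distil_loss, cosine_loss, mlm_loss,
                  weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three losses.

    The default weights are the ones in the README formula:
    Loss = 0.5 * DistilLoss + 0.3 * CosineLoss + 0.2 * MLMLoss
    The function name and signature are hypothetical.
    """
    w_distil, w_cosine, w_mlm = weights
    return w_distil * distil_loss + w_cosine * cosine_loss + w_mlm * mlm_loss


if __name__ == "__main__":
    # Made-up loss values, for illustration only.
    print(combined_loss(2.0, 1.0, 0.5))
```

The three weights sum to 1, so the combined loss is a convex combination of the individual terms: if all three losses take the same value, the result is that value.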