jbochi committed on
Commit 14afdcf
1 Parent(s): 1765fef

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -443,6 +443,7 @@ These are the main differences relative to the original T5 architecture:
  - Shared Input-Output Embeddings
  - No biases
  - Bidirectional attention
+ - Layer Norm with `center_scale_at_zero` and final layer with `use_scale=False`
 
  If you are looking for the language models models, here are the available versions:
  - [3B](https://huggingface.co/jbochi/madlad400-3b-mt)
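
For readers unfamiliar with the two flags named in the added bullet, here is a minimal pure-Python sketch of what they plausibly mean. It assumes the flags behave as in a T5-style RMS layer norm: `center_scale_at_zero` stores the learned scale as an offset from zero, so the effective multiplier is `1 + scale`, and `use_scale=False` drops the learned scale entirely. The function name `rms_layer_norm` and its signature are hypothetical and are not taken from the model's code.

```python
import math
from typing import List, Optional, Sequence


def rms_layer_norm(
    x: Sequence[float],
    scale: Optional[Sequence[float]] = None,
    center_scale_at_zero: bool = False,
    use_scale: bool = True,
    eps: float = 1e-6,
) -> List[float]:
    """Hypothetical T5-style layer norm: RMS normalization, no mean subtraction, no bias."""
    # Normalize by the root mean square of the features.
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    normed = [v * inv_rms for v in x]

    # Assumption: use_scale=False means no learned scale is applied at all.
    if not use_scale:
        return normed
    if scale is None:
        raise ValueError("scale is required when use_scale=True")

    # Assumption: with center_scale_at_zero the stored parameter is an
    # offset from zero, so the effective multiplier is (1 + scale).
    if center_scale_at_zero:
        scale = [s + 1.0 for s in scale]
    return [n * s for n, s in zip(normed, scale)]
```

Under these assumptions, a zero-initialized scale with `center_scale_at_zero=True` gives the same output as `use_scale=False`, which is why the two settings matter when converting checkpoints between frameworks that store the scale differently.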