furkanakkurt1618
commited on
Commit
•
b8f6049
1
Parent(s):
9863898
Fix grammar
Browse files
README.md
CHANGED
@@ -71,7 +71,7 @@ whereas the following abbreviations are used:
|
|
71 |
| sh | Signifies that attention heads are shared |
|
72 |
| skv | Signifies that key-values projection matrices are tied |
|
73 |
|
74 |
-
If a model checkpoint has no specific
|
75 |
|
76 |
## Pre-Training
|
77 |
|
|
|
71 |
| sh | Signifies that attention heads are shared |
|
72 |
| skv | Signifies that key-values projection matrices are tied |
|
73 |
|
74 |
+
If a model checkpoint has no specific *el* or *dl*, then both of the numbers of encoder and decoder layers correspond to *nl*.
|
75 |
|
76 |
## Pre-Training
|
77 |
|