sberbank-ai committed
Commit ed49dcc • Parent(s): 41e97dc

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
```diff
@@ -50,7 +50,7 @@ RUDOLPH 2.7B is a Transformer-based decoder model with the following parameters:
 * hidden\_size (2560) — Dimensionality of the hidden layers.
 * num\_attention\_heads (32) — Number of attention heads for each attention layer.
 
-# Sparse Attention Mask
+# Sparse Attention Masks
 
 The primary proposed method is to modify the sparse transformer's attention mask to better control modalities. It allows transitions between modalities to be modeled in both directions, unlike the similar DALL-E Transformer, which used only one direction, "text to image". The proposed "image to right text" direction is achieved by extending the sparse attention mask to the right, enabling autoregressive text generation conditioned on both the image and the left text.
 
```
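
To make the README paragraph in the diff above concrete, here is a minimal sketch of the mask layout it describes. This is an illustration under stated assumptions, not RUDOLPH's actual code: the function name `causal_block_mask`, the `[left text | image | right text]` layout, and the optional row-attention pattern are all hypothetical. It shows why extending the mask to the right supports "image to right text" generation: right-text tokens simply sit after the image tokens in an ordinary causal mask.

```python
import torch

def causal_block_mask(l_text: int, img_side: int, r_text: int,
                      image_row_attention: bool = False) -> torch.Tensor:
    """Build a causal attention mask over [left text | image | right text].

    Hypothetical sketch: the name, sequence layout, and row-attention
    option are assumptions, not RUDOLPH's actual implementation.
    """
    img = img_side * img_side  # image tokens for an img_side x img_side grid
    total = l_text + img + r_text
    # Base causal mask: token i may attend to tokens 0..i (1 = attend, 0 = masked).
    # This already realizes both directions described in the README:
    # image tokens see the left text ("text to image"), and right-text
    # tokens see the whole image ("image to right text").
    mask = torch.tril(torch.ones(total, total))
    if image_row_attention:
        # Optionally sparsify image-to-image attention to a window of the
        # previous image row, mimicking a DALL-E-style sparse pattern.
        for q in range(img):           # query index within the image block
            for k in range(q):         # earlier keys within the image block
                if k < q - img_side:
                    mask[l_text + q, l_text + k] = 0.0
    return mask

# 2 left-text tokens, a 3x3 image grid, 2 right-text tokens
print(causal_block_mask(l_text=2, img_side=3, r_text=2, image_row_attention=True))
```

With a plain lower-triangular mask, the bidirectional modality transitions fall out of token ordering alone; sparse per-layer image patterns (row, column, convolutional, as in DALL-E's sparse transformer) only alter the image-to-image region of the mask.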