R and S denoising examples are switched?
#1, opened by jstjohn
I think the R-denoising section (regular denoising, i.e. random span corruption) describes S-denoising (sequential denoising, the text-completion-from-a-prefix one) and vice versa.
For example, see https://huggingface.co/google/ul2#r-denoising
I think the numbers represent the length of the span that gets masked in each block. So, for example, the first two in the example on the far right just happen to be two masked spans of length 3.
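To make the span-length reading concrete, here is a minimal sketch of span masking. It is not the UL2 preprocessing code: the helper name `mask_spans`, the example sentence, and the span positions are all made up for illustration, and it assumes T5-style sentinel tokens of the form `<extra_id_N>`.

```python
def mask_spans(words, spans):
    """Replace each (start, length) span with one sentinel token.

    Hypothetical helper for illustration only. Returns (inputs, targets):
    inputs keep the unmasked words plus one sentinel per span, targets list
    each sentinel followed by the words it replaced.
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, length) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"  # assumed T5-style sentinel naming
        inputs.extend(words[cursor:start])          # words before the span
        inputs.append(sentinel)                     # span collapses to one token
        targets.append(sentinel)
        targets.extend(words[start:start + length])  # the masked words
        cursor = start + length
    inputs.extend(words[cursor:])                   # words after the last span
    return inputs, targets


words = "the quick brown fox jumps over the lazy dog today".split()
# Two masked spans, each of length 3 (the "3, 3" reading of the numbers).
inputs, targets = mask_spans(words, [(1, 3), (6, 3)])
print(" ".join(inputs))
print(" ".join(targets))
```

Under this reading, a "3" means three consecutive words are hidden behind a single sentinel, regardless of which word the span starts on.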
+1 for @jstjohn's answer!
jstjohn changed discussion status to closed
Noob question: how come "before" (the first masked word in the example) has a span length of 3?