R- and S-denoising examples are switched?

by jstjohn - opened

I think the R-denoising section (random denoising) actually describes S-denoising (the text-completion-from-a-prefix one), and vice versa.

I am confused by the image:

Why are the targets shared in X-denoising (the second part)? What does that signify?

I think the numbers represent the length of the span that gets masked in each block. So, for example, the first two numbers in the example on the far right just happen to be two masked spans of length 3 each.
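To make that reading concrete, here is a minimal sketch of span masking with explicit span lengths. It is only an illustration, not the model's actual preprocessing: the helper name `corrupt_spans`, the example sentence, the whitespace "tokenization", and the T5-style `<extra_id_N>` sentinel format are all assumptions on my part.

```python
def corrupt_spans(tokens, spans):
    """Mask the given (start, length) spans (hypothetical helper).

    Each masked span is replaced by one sentinel in the inputs and is
    spelled out after that same sentinel in the targets. The returned
    lengths are the per-span numbers discussed above.
    """
    inputs, targets, lengths = [], [], []
    cursor = 0
    for i, (start, length) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"  # assumed T5-style sentinel naming
        inputs.extend(tokens[cursor:start])  # keep the unmasked text
        inputs.append(sentinel)              # one sentinel per masked span
        targets.append(sentinel)
        targets.extend(tokens[start:start + length])
        lengths.append(length)
        cursor = start + length
    inputs.extend(tokens[cursor:])           # trailing unmasked text
    return inputs, targets, lengths


# Whitespace splitting stands in for a real tokenizer here.
tokens = "the quick brown fox jumps over the lazy dog near the river bank".split()
inputs, targets, lengths = corrupt_spans(tokens, [(1, 3), (7, 3)])

print(" ".join(inputs))   # the <extra_id_0> jumps over the <extra_id_1> the river bank
print(" ".join(targets))  # <extra_id_0> quick brown fox <extra_id_1> lazy dog near
print(lengths)            # [3, 3]
```

Under these assumptions, the two 3s are just the lengths of the two masked spans, which matches the "two length 3 spans" reading of the figure.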

+1 for @jstjohn's answer!

jstjohn changed discussion status to closed

Noob question.

How come "before" (the first masked word in the example) has a span length of 3?
