deepmind/multimodal-perceiver

nielsr HF Staff committed on Dec 11, 2021

Commit bbfaf82 · 1 Parent(s): d3bb7bf

Update README.md

Files changed (1)

README.md +2 -2

README.md CHANGED

@@ -23,11 +23,11 @@ To decode, the authors employ so-called decoder queries, which allow to flexibly
 <small> Perceiver IO architecture.</small>
-As the time and memory requirements of the self-attention mechanism don't depend on the size of the inputs, the Perceiver IO authors can train the model by padding the inputs with modality-specific embeddings and serialize all of them into a 2D input array (i.e. concatenate along the time dimension). Decoding the final hidden states of the latents is done by using queries containing Fourier-based position embeddings (for video and audio) and modality embeddings.
+As the time and memory requirements of the self-attention mechanism don't depend on the size of the inputs, the Perceiver IO authors can train the model by padding the inputs (images, audio, class label) with modality-specific embeddings and serialize all of them into a 2D input array (i.e. concatenate along the time dimension). Decoding the final hidden states of the latents is done by using queries containing Fourier-based position embeddings (for video and audio) and modality embeddings.
 ## Intended uses & limitations
-You can use the raw model for multimodal autoencoding. Note that by masking the classification label during evaluation, the auto-encoding model becomes a video classifier.
+You can use the raw model for multimodal autoencoding. Note that by masking the class label during evaluation, the auto-encoding model becomes a video classifier.
 See the [model hub](https://huggingface.co/models?search=deepmind/perceiver) to look for other versions on a task that may interest you.
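
For readers of this change, the padding-and-serialization step that the updated README paragraph describes can be illustrated with a minimal NumPy sketch. This is not the model's actual preprocessing code: the function name `pad_and_serialize`, the toy shapes, the random padding vectors (stand-ins for learned modality-specific embeddings), and the use of an all-zero array for the masked class label are all assumptions made for illustration.

```python
import numpy as np


def pad_and_serialize(modalities, pad_embeddings, num_channels):
    """Pad each modality on the channel axis, then concatenate along time.

    modalities: dict of name -> array of shape (batch, length, channels).
    pad_embeddings: dict of name -> vector of shape (num_channels - channels,).
        Hypothetical stand-ins for learned modality-specific embeddings.
    Returns an array of shape (batch, total_length, num_channels).
    """
    padded = []
    for name in sorted(modalities):  # fixed order so the layout is reproducible
        x = modalities[name]
        batch, length, channels = x.shape
        pad_width = num_channels - channels
        if pad_width < 0:
            raise ValueError(f"{name} has more than {num_channels} channels")
        # Repeat the modality's padding vector at every position.
        pad = np.broadcast_to(pad_embeddings[name], (batch, length, pad_width))
        padded.append(np.concatenate([x, pad], axis=-1))
    # Serialize: stack all modalities along the time (index) dimension.
    return np.concatenate(padded, axis=1)


rng = np.random.default_rng(0)
inputs = {
    "image": rng.normal(size=(1, 16, 8)),  # toy: 16 positions, 8 channels
    "audio": rng.normal(size=(1, 10, 4)),  # toy: 10 positions, 4 channels
    "label": np.zeros((1, 1, 5)),          # assumed masking: label set to zeros
}
num_channels = 12
pad_embeddings = {
    name: rng.normal(size=(num_channels - x.shape[-1],))
    for name, x in inputs.items()
}

serialized = pad_and_serialize(inputs, pad_embeddings, num_channels)
print(serialized.shape)  # (1, 27, 12)
```

With the label zeroed out as above, the autoencoder has to reconstruct it from the other modalities, which is the sense in which masking the class label turns the model into a classifier.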