gchhablani committed
Commit 208aaa2
1 Parent(s): 72ca60f

Update README.md

Files changed (1): README.md +10 -3
README.md CHANGED
@@ -17,7 +17,7 @@ Disclaimer: This model card has been written by [gchhablani](https://huggingface
 
 ## Model description
 
-FNet is a transformers model with attention replaced with Fourier transforms. It is pretrained on a large corpus of
+FNet is a transformers model with attention replaced with Fourier transforms. Hence, the inputs do not contain an `attention_mask`. It is pretrained on a large corpus of
 English data in a self-supervised fashion. This means it was pretrained on the raw texts only, with no humans labelling
 them in any way (which is why it can use lots of publicly available data) with an automatic process to generate inputs and
 labels from those texts. More precisely, it was pretrained with two objectives:
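For intuition about the line added above, here is a minimal sketch of the Fourier token-mixing that stands in for self-attention in FNet, which is why no `attention_mask` is needed. The function name `fourier_mixing` and the hidden size of 1024 for fnet-large are assumptions for illustration, not the `transformers` internals:

```python
# A minimal sketch, not the transformers implementation: FNet replaces
# self-attention with a 2D discrete Fourier transform over the hidden and
# sequence dimensions, keeping the real part. No attention_mask is involved,
# so every position (padding included) participates in the mixing.
import torch

def fourier_mixing(hidden_states: torch.Tensor) -> torch.Tensor:
    # hidden_states: (batch, seq_len, hidden_dim)
    return torch.fft.fft(torch.fft.fft(hidden_states, dim=-1), dim=-2).real

x = torch.randn(1, 512, 1024)  # hidden size 1024 is an assumption for fnet-large
print(fourier_mixing(x).shape)  # torch.Size([1, 512, 1024])
```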
@@ -53,6 +53,8 @@ generation you should look at model like GPT2.
 
 You can use this model directly with a pipeline for masked language modeling:
 
+**Note: The mask-filling pipeline doesn't work exactly like the original model, which performs masking after converting text to tokens; in the pipeline, an additional space is added after the [MASK].**
+
 ```python
 >>> from transformers import FNetForMaskedLM, FNetTokenizer, pipeline
 >>> tokenizer = FNetTokenizer.from_pretrained("google/fnet-large")
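The hunk above truncates the README snippet after the tokenizer line. As a hedged sketch, standard fill-mask pipeline usage would continue along these lines (the exact README continuation is outside this hunk, so treat this as an assumption):

```python
# A sketch of typical fill-mask usage with FNet; the README's exact
# continuation is not shown in this diff.
from transformers import FNetForMaskedLM, FNetTokenizer, pipeline

tokenizer = FNetTokenizer.from_pretrained("google/fnet-large")
model = FNetForMaskedLM.from_pretrained("google/fnet-large")
unmasker = pipeline("fill-mask", model=model, tokenizer=tokenizer)
# Per the note above, the pipeline inserts an extra space after [MASK],
# so results can differ slightly from the original model's masking.
print(unmasker("Hello I'm a [MASK] model."))
```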
@@ -72,12 +74,14 @@ You can use this model directly with a pipeline for masked language modeling:
 
 Here is how to use this model to get the features of a given text in PyTorch:
 
+**Note: You must set the maximum sequence length to 512 and truncate/pad to that length, because the original model has no attention mask and considers all hidden states during the forward pass.**
+
 ```python
 from transformers import FNetTokenizer, FNetModel
 tokenizer = FNetTokenizer.from_pretrained("google/fnet-large")
 model = FNetModel.from_pretrained("google/fnet-large")
 text = "Replace me by any text you'd like."
-encoded_input = tokenizer(text, return_tensors='pt')
+encoded_input = tokenizer(text, return_tensors='pt', padding='max_length', truncation=True, max_length=512)
 output = model(**encoded_input)
 ```
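Following the note in the hunk above, a quick sanity check shows the tokenizer padding to the full 512 positions the model will mix; this is a sketch, and the 1024 hidden size for fnet-large is an assumption:

```python
# Assumes the feature-extraction snippet above has just run.
print(encoded_input["input_ids"].shape)  # torch.Size([1, 512]) after padding/truncation
print(output.last_hidden_state.shape)    # torch.Size([1, 512, 1024]); all 512 positions are mixed
```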
 
@@ -176,4 +180,7 @@ Glue test results:
   biburl = {https://dblp.org/rec/journals/corr/abs-2105-03824.bib},
   bibsource = {dblp computer science bibliography, https://dblp.org}
 }
-```
+```
+
+## Contributions
+Thanks to [@gchhablani](https://huggingface.co/gchhablani) for adding this model.