Revert Grammarly fixes in README.md

The last contribution introduced changes to README.md that were not supposed to be there. This reverts the README file to its state in the previous commit.

Files changed (1)

README.md +7 -6

@@ -10,7 +10,7 @@ datasets:
 ## Table of Contents
 - [Model Details](#model-details)
 - [Uses](#uses)
-- [Risks, Limitations, and Biases](#risks-limitations-and-biases)
+- [Risks, Limitations and Biases](#risks-limitations-and-biases)
 - [Training](#training)
 - [Evaluation](#evaluation)
 - [Citation Information](#citation-information)
@@ -20,7 +20,7 @@ datasets:
 ## Model Details
 - **Model Description:**
 CamemBERT is a state-of-the-art language model for French based on the RoBERTa model.
-It is now available on Hugging Face in 6 different versions with varying numbers of parameters, amount of pretraining data, and pretraining data source domains.
+It is now available on Hugging Face in 6 different versions with varying number of parameters, amount of pretraining data and pretraining data source domains.
 - **Developed by:**  Louis Martin\*, Benjamin Muller\*, Pedro Javier Ortiz Suárez\*, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, Djamé Seddah and Benoît Sagot.
 - **Model Type:** Fill-Mask
 - **Language(s):** French
@@ -38,7 +38,7 @@ It is now available on Hugging Face in 6 different versions with varying numbers
 This model can be used for Fill-Mask tasks.
-## Risks, Limitations, and Biases
+## Risks, Limitations and Biases
 **CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.**
 Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).
@@ -72,7 +72,7 @@ OSCAR or Open Super-large Crawled Aggregated coRpus is a multilingual corpus obt
 ## Evaluation
-The model developers evaluated CamemBERT using four different downstream tasks for French: part-of-speech (POS) tagging, dependency parsing, named entity recognition (NER), and natural language inference (NLI).
+The model developers evaluated CamemBERT using four different downstream tasks for French: part-of-speech (POS) tagging, dependency parsing, named entity recognition (NER) and natural language inference (NLI).
@@ -81,7 +81,7 @@ The model developers evaluated CamemBERT using four different downstream tasks f
 ```bibtex
 @inproceedings{martin2020camembert,
   title={CamemBERT: a Tasty French Language Model},
-  author={Martin, Louis and Muller, Benjamin, and Su{\'a}rez, Pedro Javier Ortiz and Dupont, Yoann and Romary, Laurent and de la Clergerie, {\'E}ric Villemonte and Seddah, Djam{\'e} and Sagot, Beno{\^\i}t},
+  author={Martin, Louis and Muller, Benjamin and Su{\'a}rez, Pedro Javier Ortiz and Dupont, Yoann and Romary, Laurent and de la Clergerie, {\'E}ric Villemonte and Seddah, Djam{\'e} and Sagot, Beno{\^\i}t},
   booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
   year={2020}
 }
@@ -126,7 +126,7 @@ tokenized_sentence = tokenizer.tokenize("J'aime le camembert !")
 # 1-hot encode and add special starting and end tokens
 encoded_sentence = tokenizer.encode(tokenized_sentence)
 # [5, 121, 11, 660, 16, 730, 25543, 110, 83, 6]
-# NB: Can be done in one step: tokenize.encode("J'aime le camembert !")
+# NB: Can be done in one step : tokenize.encode("J'aime le camembert !")
 # Feed tokens to Camembert as a torch tensor (batch dim 1)
 encoded_sentence = torch.tensor(encoded_sentence).unsqueeze(0)
@@ -155,3 +155,4 @@ all_layer_embeddings[5]
 #         [ 0.0557, -0.0588,  0.0547,  ..., -0.0726, -0.0867,  0.0699],
 #         ...,
 ```
+