hassan4830 committed
Commit 6bbd573
Parent(s): 285fa31
Update README.md

README.md CHANGED
@@ -14,23 +14,11 @@ This [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) text classifica
 
 ## Model description
 
-
-
-
-
-
-
-- Distillation loss: the model was trained to return the same probabilities as the BERT base model.
-- Masked language modeling (MLM): this is part of the original training loss of the BERT base model. When taking a
-  sentence, the model randomly masks 15% of the words in the input then run the entire masked sentence through the
-  model and has to predict the masked words. This is different from traditional recurrent neural networks (RNNs) that
-  usually see the words one after the other, or from autoregressive models like GPT which internally mask the future
-  tokens. It allows the model to learn a bidirectional representation of the sentence.
-- Cosine embedding loss: the model was also trained to generate hidden states as close as possible as the BERT base
-  model.
-
-This way, the model learns the same inner representation of the English language than its teacher model, while being
-faster for inference or downstream tasks.
+XLM-RoBERTa is a scaled cross-lingual sentence encoder. It is trained on 2.5TB of filtered Common Crawl data covering 100 languages. XLM-R achieves state-of-the-art results on multiple cross-lingual benchmarks.
+
+The XLM-RoBERTa model was proposed in *Unsupervised Cross-lingual Representation Learning at Scale* by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov.
+
+It is based on Facebook’s RoBERTa model released in 2019. It is a large multilingual language model, trained on 2.5TB of filtered CommonCrawl data.
 
 ## Intended uses & limitations
 
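Since the updated model card describes an XLM-RoBERTa-based text classification model, a minimal usage sketch may help. This is an illustration only: it assumes the `transformers` pipeline API, and `MODEL_ID` is a placeholder (shown here with the `xlm-roberta-base` base encoder, since the actual fine-tuned repository id is not given in this diff).

```python
# Minimal usage sketch for an XLM-RoBERTa-based text classifier.
# Assumption: MODEL_ID is a placeholder; replace it with the Hub id of the
# fine-tuned checkpoint this README describes. "xlm-roberta-base" is only the
# base encoder, so loading it directly yields an untrained classification head.
from transformers import pipeline

MODEL_ID = "xlm-roberta-base"  # placeholder; substitute the fine-tuned repo id

classifier = pipeline("text-classification", model=MODEL_ID)

# XLM-R was pretrained on roughly 100 languages, so inputs need not be English.
print(classifier("Ce film était vraiment excellent."))
```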