juliensimon/xlm-v-base-language-id

juliensimon committed on Feb 10, 2023

Commit d322b88 · 1 Parent(s): 8e2a7b5

Update README.md

Files changed (1)

README.md +36 -3

@@ -3,6 +3,7 @@ license: mit
 tags:
 - generated_from_trainer
 - language-identification
 datasets:
 - fleurs
 metrics:
@@ -23,6 +24,7 @@ model-index:
     - name: Accuracy
       type: accuracy
       value: 0.9930337861372344
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -30,14 +32,45 @@ should probably proofread and complete it, then remove this comment. -->
 # xlm-v-base-language-id
-This model is a fine-tuned version of [facebook/xlm-v-base](https://huggingface.co/facebook/xlm-v-base) on the fleurs dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0241
 - Accuracy: 0.9930
 ## Intended uses & limitations
-The model can accurately detect 102 languages.
 ## Training and evaluation data
@@ -78,4 +111,4 @@ The following hyperparameters were used during training:
 - Transformers 4.26.0
 - Pytorch 1.13.1
 - Datasets 2.8.0
-- Tokenizers 0.13.2

 tags:
 - generated_from_trainer
 - language-identification
+- openvino
 datasets:
 - fleurs
 metrics:
     - name: Accuracy
       type: accuracy
       value: 0.9930337861372344
+pipeline_tag: text-classification
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # xlm-v-base-language-id
+This model is a fine-tuned version of [facebook/xlm-v-base](https://huggingface.co/facebook/xlm-v-base) on the [google/fleurs](https://huggingface.co/datasets/google/fleurs) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.0241
 - Accuracy: 0.9930
+# Usage
+The simplest way to use the model is with a text classification pipeline:
+```
+from transformers import pipeline
+model_id = "juliensimon/xlm-v-base-language-id"
+p = pipeline("text-classification", model=model_id)
+p("Hello world")
+# [{'label': 'English', 'score': 0.9802148342132568}]
+```
+The model is also compatible with [Optimum Intel](https://github.com/huggingface/optimum-intel).
+For example, you can optimize it with Intel OpenVINO and enjoy a 2x inference speedup (or more).
+```
+from optimum.intel.openvino import OVModelForSequenceClassification
+from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
+                          pipeline)
+model_id = "juliensimon/xlm-v-base-language-id"
+ov_model = OVModelForSequenceClassification.from_pretrained(
+    model_id, from_transformers=True
+)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+p = pipeline("text-classification", model=ov_model, tokenizer=tokenizer)
+p("Hello world")
+# [{'label': 'English', 'score': 0.9802149534225464}]
+```
 ## Intended uses & limitations
+The model can accurately detect 102 languages. You can find the list on the [dataset](https://huggingface.co/datasets/google/fleurs) page.
 ## Training and evaluation data
 - Transformers 4.26.0
 - Pytorch 1.13.1
 - Datasets 2.8.0
+- Tokenizers 0.13.2
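
The updated README states that the OpenVINO pipeline gives a 2x inference speedup or more. As a rough way to check that on your own hardware, the sketch below times any pipeline-like callable. The names `mean_latency` and `stub_pipeline` are hypothetical and not part of Transformers or Optimum Intel; the stub only mimics the output shape shown in the README so that the block runs without downloading the model.

```python
import time
from typing import Callable


def mean_latency(classify: Callable[[str], list], text: str,
                 runs: int = 20, warmup: int = 3) -> float:
    """Return the mean wall-clock time, in seconds, of one classify(text) call."""
    # Warm-up calls are excluded from timing (first calls often pay one-off costs).
    for _ in range(warmup):
        classify(text)
    start = time.perf_counter()
    for _ in range(runs):
        classify(text)
    return (time.perf_counter() - start) / runs


def stub_pipeline(text: str) -> list:
    # Hypothetical stand-in for the `p` objects built in the README:
    # it returns a list with one {'label', 'score'} dict, like the pipeline does.
    return [{"label": "English", "score": 0.98}]


latency = mean_latency(stub_pipeline, "Hello world")
print(f"{latency * 1e6:.1f} microseconds per call")
```

To compare the two pipelines from the README, pass each `p` object in place of `stub_pipeline` and divide the two results. The ratio will depend on hardware, input length, and library versions, so treat it as an estimate rather than a benchmark.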