Update README.md

README.md
Dataset
-------

The XNLI dataset from [FLUE](https://huggingface.co/datasets/flue) comprises 392,702 premise-hypothesis pairs for training and 5,010 pairs for testing. The task is textual entailment prediction (does sentence A imply, contradict, or neither sentence B?): given the two sentences, predict one of three labels. Sentence A is called the *premise* and sentence B the *hypothesis*; the model therefore estimates:

$$P(premise\in\{contradiction, entailment, neutral\}\vert hypothesis)$$
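Concretely, each training example pairs a premise with a hypothesis and one of the three labels. An illustrative example (made-up sentences, not drawn from the dataset):

```python
# An illustrative XNLI-style example (invented sentences, for shape only):
example = {
    "premise": "Le chat dort sur le canapé.",            # sentence A
    "hypothesis": "Un animal est en train de dormir.",   # sentence B
    "label": "entailment",  # one of the three target classes
}

# The three classes the model predicts over.
LABELS = {"contradiction", "entailment", "neutral"}
assert example["label"] in LABELS
```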

Evaluation results

Benchmark
---------

We compare the [DistilCamemBERT](https://huggingface.co/cmarkea/distilcamembert-base) model with two other models for French. The first, [BaptisteDoyen/camembert-base-xnli](https://huggingface.co/BaptisteDoyen/camembert-base-xnli), is based on the aptly named [CamemBERT](https://huggingface.co/camembert-base), the French RoBERTa model; the second, [MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli), is based on [mDeBERTa-v3](https://huggingface.co/microsoft/mdeberta-v3-base), a multilingual model. Performance is compared using the [MCC (Matthews Correlation Coefficient)](https://en.wikipedia.org/wiki/Phi_coefficient), and mean inference time was measured on an **AMD Ryzen 5 4500U @ 2.3GHz with 6 cores**:

| **NLI** | **time (ms)** | **MCC (x100)** |
| :--------------: | :-----------: | :------------: |
| [BaptisteDoyen/camembert-base-xnli](https://huggingface.co/BaptisteDoyen/camembert-base-xnli) | 105.0 | 72.67 |
| [MoritzLaurer/mDeBERTa-v3-base-mnli-xnli](https://huggingface.co/MoritzLaurer/mDeBERTa-v3-base-mnli-xnli) | 299.18 | 75.15 |
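For reference, the multiclass MCC reported in the table (times 100) is Gorodkin's R_K statistic, the same quantity `sklearn.metrics.matthews_corrcoef` computes. A pure-Python sketch of it:

```python
import math
from collections import Counter

def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews Correlation Coefficient (Gorodkin's R_K),
    the statistic sklearn.metrics.matthews_corrcoef implements."""
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))  # correct predictions
    t_counts = Counter(y_true)   # true occurrences per class
    p_counts = Counter(y_pred)   # predicted occurrences per class
    classes = set(t_counts) | set(p_counts)
    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in classes)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())
    if cov_tt == 0 or cov_pp == 0:
        return 0.0  # degenerate case: a single class everywhere
    return cov_tp / math.sqrt(cov_tt * cov_pp)

# Toy usage on invented labels (multiply by 100 for the table's scale):
score = multiclass_mcc(["ent", "neu", "con", "ent"],
                       ["ent", "neu", "con", "neu"])
```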

The main advantage of such a model is that it yields a zero-shot classifier, allowing text classification without any task-specific training. This task can be summarized by:

$$P(hypothesis=c\vert premise)=\frac{e^{P(premise=entailment\vert hypothesis\; c)}}{\sum_{i\in\mathcal{C}}e^{P(premise=entailment\vert hypothesis\; i)}}$$
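This formula is a softmax over the per-class entailment scores. A minimal pure-Python sketch, with made-up scores standing in for the NLI model's outputs:

```python
import math

# Hypothetical entailment scores P(premise = entailment | hypothesis_c),
# one per candidate class c; in practice they come from the NLI model.
entailment_scores = {"cinéma": 2.1, "technologie": 0.4, "littérature": -0.3}

def zero_shot_probabilities(scores):
    """Softmax over per-class entailment scores, as in the formula above."""
    exps = {label: math.exp(s) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

probs = zero_shot_probabilities(entailment_scores)
```

The candidate class with the highest entailment score therefore gets the highest probability, and the probabilities sum to one by construction.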

| **Allociné** | **time (ms)** | **MCC (x100)** |
    hypothesis_template="Ce texte parle de {}."
)
result
{"labels": ["cinéma",
            "technologie",
            "littérature",
            "politique"],
 "scores": [0.5172086954116821,
            0.2278652936220169,
            0.17426978051662445,
            0.08065623790025711]}
```
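The zero-shot pipeline returns labels sorted by decreasing score, so recovering the top prediction or a label-to-score mapping from the output above is straightforward (a minimal sketch using the result shown):

```python
# Output of the zero-shot call shown above; labels come back sorted by
# decreasing score, so the first label is the predicted class.
result = {
    "labels": ["cinéma", "technologie", "littérature", "politique"],
    "scores": [0.5172086954116821, 0.2278652936220169,
               0.17426978051662445, 0.08065623790025711],
}

best_label = result["labels"][0]                      # top prediction
ranked = dict(zip(result["labels"], result["scores"]))  # label -> score
```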