Cyrile committed on
Commit
c5c1d8a
1 Parent(s): 176f5da

Update README.md

  1. README.md +10 -1
README.md CHANGED
@@ -32,7 +32,7 @@ Where sigma is the sigmoid function and O represents the set of learning observa
  Benchmark
  ---------

- As the scores range from 0 to 1, a performance measure such as MAE or RMSE may be challenging to interpret. Therefore, Pearson's inter-correlation was chosen as a measure. Pearson's inter-correlation is a measure ranging from -1 to 1, where 0 represents no correlation, -1 represents perfect negative correlation, and 1 represents perfect positive correlation. The goal is to quantitatively measure the correlation between the model's scores and the scores assigned by judges for 750 comments not seen during training.

  | Model | Language | Obscene (x100) | Sexual explicit (x100) | Identity attack (x100) | Insult (x100) | Threat (x100) | Mean |
  |-------------------------------------------------------------------------------|----------|:-----------------------:|-------------------------------|-------------------------------|----------------------|----------------------|------|
@@ -43,6 +43,15 @@ As the scores range from 0 to 1, a performance measure such as MAE or RMSE may b

  With a correlation of approximately 65 for the 560m model and approximately 80 for the 3b model, the output is highly correlated with the judges' scores.

  How to Use Bloomz-3b-guardrail
  --------------------------------

 
  Benchmark
  ---------

+ As the scores range from 0 to 1, a performance measure such as RMSE may be challenging to interpret. Therefore, Pearson's inter-correlation was chosen as a measure. Pearson's inter-correlation is a measure ranging from -1 to 1, where 0 represents no correlation, -1 represents perfect negative correlation, and 1 represents perfect positive correlation. The goal is to quantitatively measure the correlation between the model's scores and the scores assigned by judges for 730 comments not seen during training.

  | Model | Language | Obscene (x100) | Sexual explicit (x100) | Identity attack (x100) | Insult (x100) | Threat (x100) | Mean |
  |-------------------------------------------------------------------------------|----------|:-----------------------:|-------------------------------|-------------------------------|----------------------|----------------------|------|
 

  With a correlation of approximately 65 for the 560m model and approximately 80 for the 3b model, the output is highly correlated with the judges' scores.

+ We now turn to the Mean Absolute Error (MAE), which measures the average absolute gap between the model's scores and the judges' scores.
+
+ | Model | Language | Obscene | Sexual explicit | Identity attack | Insult | Threat | Mean |
+ |-------------------------------------------------------------------------------|----------|:----------------:|-----------------------|----------------------|--------------|------------|------|
+ | [Bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail) | French | 0.06 | 0.03 | 0.03 | 0.13 | 0.04 | 0.06 |
+ | [Bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail) | English | 0.06 | 0.03 | 0.03 | 0.14 | 0.04 | 0.06 |
+ | [Bloomz-3b-guardrail](https://huggingface.co/cmarkea/bloomz-3b-guardrail) | French | 0.05 | 0.02 | 0.02 | 0.11 | 0.03 | 0.05 |
+ | [Bloomz-3b-guardrail](https://huggingface.co/cmarkea/bloomz-3b-guardrail) | English | 0.05 | 0.03 | 0.02 | 0.12 | 0.03 | 0.05 |
+
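
Reviewer note: the two measures reported in this README change, Pearson correlation (scaled by 100 in the first table) and MAE, can be sketched as follows. This is a minimal illustration in plain Python, not the author's evaluation code. The function names `pearson` and `mae` and the toy scores are hypothetical and are not taken from the benchmark.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def mae(xs, ys):
    """Mean absolute error between two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy scores in [0, 1] for a single label (e.g. "Insult").
# These values are made up for illustration only.
judge_scores = [0.0, 0.2, 0.5, 0.9]
model_scores = [0.1, 0.2, 0.4, 0.8]

print(f"Pearson (x100): {100 * pearson(judge_scores, model_scores):.1f}")
print(f"MAE: {mae(judge_scores, model_scores):.3f}")
```

The two measures are complementary: Pearson correlation is unchanged if the model's scores are shifted or rescaled by a positive factor, while MAE reports the size of the gap on the original 0 to 1 scale.
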
  How to Use Bloomz-3b-guardrail
  --------------------------------
