MarcosDib committed
Commit f08495b
1 Parent(s): d6b86d1

Update README.md

Files changed (1)
README.md +10 -2
README.md CHANGED
@@ -62,8 +62,8 @@ With the motivation to increase accuracy obtained with baseline implementation,
  strategy under the assumption that the small amount of data available for training was insufficient for adequate embedding training.
  In this context, we considered two approaches:

- i) pre-training word embeddings using similar datasets for text classification;
- ii) using transformers and attention mechanisms (Longformer) to create contextualized embeddings.
+ i) pre-training word embeddings using similar datasets for text classification;
+ ii) using transformers and attention mechanisms (Longformer) to create contextualized embeddings.

  XXXX was originally released in base and large variations, for cased and uncased input text. The uncased models
  also strip out accent markers. Chinese and multilingual uncased and cased versions followed shortly after.
@@ -82,6 +82,14 @@ The detailed release history can be found [here](https://huggingface.co/u
  | [`mcti-large-cased`] | 110M | Chinese |
  | [`-base-multilingual-cased`] | 110M | Multiple |

+ | Dataset            | Compatibility to base* |
+ |--------------------|------------------------|
+ | Labeled MCTI       | 100%                   |
+ | Full MCTI          | 100%                   |
+ | BBC News Articles  | 56.77%                 |
+ | New unlabeled MCTI | 75.26%                 |
+
+
  ## Intended uses

  You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to
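Approach (i) in the diff above pre-trains word embeddings on larger, similar text-classification corpora before reusing them on the small MCTI dataset. As a minimal sketch of that idea, assuming gensim's `Word2Vec` and a hypothetical `similar_corpus.txt` file with one whitespace-tokenized sentence per line (the commit itself does not show this pipeline):

```python
# Minimal sketch of approach (i): pre-train word embeddings on a larger,
# similar corpus, then reuse them for the small target dataset.
# The file name and hyperparameters are illustrative assumptions.
from gensim.models import Word2Vec

# One whitespace-tokenized sentence per line (hypothetical corpus file).
with open("similar_corpus.txt", encoding="utf-8") as f:
    sentences = [line.split() for line in f if line.strip()]

model = Word2Vec(
    sentences,
    vector_size=300,  # embedding dimension (assumed)
    window=5,
    min_count=2,
    workers=4,
    epochs=10,
)

# Save only the vectors so a downstream classifier can load them later.
model.wv.save("pretrained_embeddings.kv")
```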
 
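Approach (ii) relies on Longformer's attention mechanism to produce contextualized embeddings. A minimal sketch with the Hugging Face transformers library, assuming the public allenai/longformer-base-4096 checkpoint (the commit does not name the checkpoint actually used):

```python
# Minimal sketch of approach (ii): contextualized embeddings with Longformer.
# The checkpoint is the public AllenAI one, assumed here for illustration.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
model = AutoModel.from_pretrained("allenai/longformer-base-4096")

text = "Example text from a research-funding opportunity."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)

with torch.no_grad():
    outputs = model(**inputs)

# One contextualized vector per token; mean-pool them for a single
# document-level embedding.
token_embeddings = outputs.last_hidden_state  # shape: (1, seq_len, hidden)
doc_embedding = token_embeddings.mean(dim=1).squeeze(0)
```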
 
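The "Intended uses" lines kept as context mention masked language modeling. As a sketch of how the raw model could be queried once published, assuming a hypothetical `MarcosDib/mcti-base-uncased` model id and a BERT-style `[MASK]` token:

```python
# Minimal sketch of the masked-language-modeling use mentioned above.
# The model id is hypothetical; the README does not give the final one.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="MarcosDib/mcti-base-uncased")

# [MASK] assumes a BERT-style tokenizer; other tokenizers use other tokens.
for prediction in unmasker("Hello, I'm a [MASK] model."):
    print(prediction["token_str"], round(prediction["score"], 4))
```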