Update README.md
README.md
CHANGED
@@ -65,17 +65,18 @@ Model was trained using 8xA100 for ~22 days.
 
 Standard RoBERTa-base parameters:
 
-| Argument | Value
-
-|Activation function | gelu
-|Attention dropout | 0.1
-|Dropout | 0.1
-|Encoder attention heads | 12
-|Encoder embed dim | 768
-|Encoder ffn embed dim | 3,072
-|Encoder layers | 12
-|Max positions | 512
-|Vocab size | 50266
+| Argument | Value |
+|-------------------------|----------------|
+|Activation function | gelu |
+|Attention dropout | 0.1 |
+|Dropout | 0.1 |
+|Encoder attention heads | 12 |
+|Encoder embed dim | 768 |
+|Encoder ffn embed dim | 3,072 |
+|Encoder layers | 12 |
+|Max positions | 512 |
+|Vocab size | 50266 |
+|Tokenizer type | Byte-level BPE |
 
 ## Evaluation
 
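For reference, a minimal sketch (not part of the commit) of how the table above could be expressed as a Hugging Face `RobertaConfig`. The mapping from the fairseq-style argument names to `RobertaConfig` fields, and the `+2` padding offset on the position embeddings, are assumptions based on the usual RoBERTa conventions rather than something stated in this README.

```python
# Sketch: RoBERTa-base hyperparameters from the table, expressed as a
# Hugging Face RobertaConfig. Values are copied from the table; the
# field mapping itself is an assumption, not part of the commit.
from transformers import RobertaConfig, RobertaModel

config = RobertaConfig(
    vocab_size=50266,                  # Vocab size
    hidden_size=768,                   # Encoder embed dim
    num_hidden_layers=12,              # Encoder layers
    num_attention_heads=12,            # Encoder attention heads
    intermediate_size=3072,            # Encoder ffn embed dim
    hidden_act="gelu",                 # Activation function
    hidden_dropout_prob=0.1,           # Dropout
    attention_probs_dropout_prob=0.1,  # Attention dropout
    max_position_embeddings=512 + 2,   # Max positions (+2 for RoBERTa's padding offset)
)
model = RobertaModel(config)  # randomly initialized model with this architecture
```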