ficsort committed on
Commit
f09778d
1 Parent(s): 2519a5b

Update README.md

Files changed (1)
  1. README.md +18 -27
README.md CHANGED
@@ -1,43 +1,31 @@
  ---
+ language: hu
+ license: apache-2.0
+ datasets:
+ - wikipedia
  tags:
  - generated_from_keras_callback
+ - hubert
  model-index:
  - name: hubert-medium-wiki
  results: []
  ---

- <!-- This model card has been generated automatically according to the information Keras had access to. You should
- probably proofread and complete it, then remove this comment. -->
-
  # hubert-medium-wiki

- This model was trained from scratch on an unknown dataset.
- It achieves the following results on the evaluation set:
-
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
+ This model was trained from scratch on the Wikipedia subset of Hungarian Webcorpus 2.0 with MLM and SOP tasks.

- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - optimizer: None
- - training_precision: float32
-
- ### Training results
+ ### Pre-Training Parameters:

+ First phase:
+ - Training steps: 500.000
+ - Sequence length: 128
+ - Batch size: 1024

+ Second phase:
+ - Training steps: 100.000
+ - Sequence length: 512
+ - Batch size: 384

  ### Framework versions

@@ -45,3 +33,6 @@ The following hyperparameters were used during training:
  - TensorFlow 2.10.0
  - Datasets 2.4.0
  - Tokenizers 0.12.1
+
+ # Acknowledgement
+ [![Artificial Intelligence - National Laboratory - Hungary](https://milab.tk.hu/uploads/images/milab_logo_en.png)](https://mi.nemzetilabor.hu/)
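
The two pre-training phases listed in the updated README imply a rough data budget, which can be worked out from the step count, batch size, and sequence length. The sketch below is only an illustration under stated assumptions: it reads `500.000` and `100.000` as 500,000 and 100,000 steps (period as thousands separator), assumes every step consumes one full batch, and treats the token counts as upper bounds because shorter sequences would be padded. The names `Phase`, `PHASES`, and `summarize` are hypothetical and not part of the model repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One pre-training phase as listed in the README (hypothetical container)."""
    name: str
    steps: int       # number of optimizer steps
    seq_len: int     # maximum sequence length in tokens
    batch_size: int  # sequences per step

    @property
    def sequences(self) -> int:
        # Total sequences seen, assuming one full batch per step.
        return self.steps * self.batch_size

    @property
    def max_tokens(self) -> int:
        # Upper bound on tokens processed (padding counts as tokens here).
        return self.sequences * self.seq_len


# Values copied from the README; "500.000" is assumed to mean 500,000.
PHASES = [
    Phase("first", steps=500_000, seq_len=128, batch_size=1024),
    Phase("second", steps=100_000, seq_len=512, batch_size=384),
]


def summarize(phases):
    """Map each phase name to (sequences seen, upper bound on tokens)."""
    return {p.name: (p.sequences, p.max_tokens) for p in phases}


if __name__ == "__main__":
    for name, (seqs, toks) in summarize(PHASES).items():
        print(f"{name}: {seqs:,} sequences, at most {toks:,} tokens")
```

Under these assumptions the first phase covers 512,000,000 sequences and the second 38,400,000, so most of the sequence budget is spent at length 128, with a shorter second phase at length 512.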