Gabriel committed on
Commit b19070a
1 Parent(s): 4358cae

Update README.md

Files changed (1)
  1. README.md +16 -15
README.md CHANGED
@@ -40,21 +40,6 @@ model-index:
A historical Swedish BERT model is released by the National Swedish Archives to better generalise to historical Swedish text. Researchers are well aware that the Swedish language has changed over time, which makes models trained from a present-day point of view less than ideal candidates for the job.
However, this model can be used to interpret and analyse historical textual material, and it can be fine-tuned for different downstream tasks.

- ## Model Description
-
- The following hyperparameters were used during training:
- - learning_rate: 3e-05
- - train_batch_size: 8
- - eval_batch_size: 8
- - seed: 42
- - gradient_accumulation_steps: 0
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 6
- - fp16: False
-
- Dataset:
- - Khubist2, which has been cleaned and chunked

## Intended uses & limitations
This model should primarily be used for further fine-tuning on downstream tasks.
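As a minimal sketch of that intended use, loading the checkpoint for fine-tuning on a downstream task with the transformers library could look like the snippet below. The model ID and label count are hypothetical placeholders, not values taken from this repository.

```python
# Minimal sketch: load the historical-Swedish BERT checkpoint for
# fine-tuning on a downstream task (here, sequence classification).
# The model ID and num_labels are HYPOTHETICAL placeholders.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "national-archives/historical-swedish-bert"  # placeholder ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    num_labels=2,  # placeholder: a fresh binary classification head
)
# `model` can now be trained with the Trainer API or a custom training loop.
```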
@@ -70,8 +55,24 @@ print(summarizer(historical_text))
```

+ ## Model Description
+ The training procedure can be recreated from here: https://github.com/Borg93/kbuhist2/tree/main
+ The preprocessing procedure can be recreated from here: https://github.com/Borg93/kbuhist2/tree/main

+ ### Model
+ The following hyperparameters were used during training:
+ - learning_rate: 3e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 0
+ - optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 6
+ - fp16: False
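For orientation, the hyperparameters listed above map onto Hugging Face `TrainingArguments` roughly as follows. This is a hedged sketch assuming the Trainer API was used; the actual training script is in the kbuhist2 repository linked above, and the output directory is a placeholder.

```python
# Hedged sketch: the listed hyperparameters expressed as TrainingArguments.
# Not the authors' actual script -- see the linked kbuhist2 repository.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="historical-swedish-bert-mlm",  # placeholder path
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=1,  # listed as 0 above; Trainer requires >= 1 (no accumulation)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=6,
    fp16=False,
)
```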
+ ### Dataset (WIP)
+ - Khubist2, which has been cleaned and chunked.
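The cleaning and chunking themselves live in the kbuhist2 repository linked above; purely as an illustration, chunking long documents into model-sized pieces might look like the sketch below (the 512-token window matches BERT's usual maximum sequence length, and the tokenizer ID is a placeholder).

```python
# Illustration only: split a long document into chunks that fit BERT's
# maximum sequence length. The real preprocessing is in the kbuhist2
# repository; the tokenizer ID below is a PLACEHOLDER.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")  # placeholder

def chunk_text(text: str, max_tokens: int = 512) -> list[str]:
    """Split `text` into pieces of at most `max_tokens` tokens each."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    return [
        tokenizer.decode(ids[i : i + max_tokens])
        for i in range(0, len(ids), max_tokens)
    ]
```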
## Acknowledgements
We gratefully acknowledge EuroHPC (https://eurohpc-ju.europa.eu) for funding this research by providing computing resources on the HPC system Vega at the Institute of Information Science (https://www.izum.si).