Update README.md
--- a/README.md
+++ b/README.md
@@ -21,7 +21,7 @@ The fine-tuning script can be accessed [here](Link).
 
 - **Developed by:** [Juri Grosjean](https://huggingface.co/jgrosjean)
 - **Model type:** [XMOD](https://huggingface.co/facebook/xmod-base)
-- **Language(s) (NLP):**
+- **Language(s) (NLP):** de_CH, fr_CH, it_CH, rm_CH
 - **License:** [More Information Needed]
 - **Finetuned from model:** [SwissBERT](https://huggingface.co/ZurichNLP/swissbert)
 
@@ -70,32 +70,12 @@ tensor([[ 5.6306e-02, -2.8375e-01, -4.1495e-02, 7.4393e-02, -3.1552e-01,
 ...]])
 ```
 
-[More Information Needed]
-
-### Downstream Use [optional]
-
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
-[More Information Needed]
-
 ## Bias, Risks, and Limitations
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
+This model has been trained on news articles only. Hence, it might not perform as well on other text classes.
 This multilingual model has not been fine-tuned for cross-lingual transfer. It is intended for computing sentence embeddings that can be compared mono-lingually.
 
-### Recommendations
-
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
-## How to Get Started with the Model
-
-Use the code below to get started with the model.
-
-[More Information Needed]
-
 ## Training Details
 
 ### Training Data
@@ -115,11 +95,24 @@ Use the code below to get started with the model.
 
 #### Training Hyperparameters
 
-- **Training regime:**
-
-
-
-
+- **Training regime:** python3 train_simcse_multilingual.py \
+--seed 54699 \
+--model_name_or_path zurichNLP/swissbert \
+--train_file /srv/scratch2/grosjean/Masterarbeit/data_subsets \
+--output_dir /srv/scratch2/grosjean/Masterarbeit/model \
+--overwrite_output_dir \
+--save_strategy no \
+--do_train \
+--num_train_epochs 1 \
+--learning_rate 1e-5 \
+--per_device_train_batch_size 4 \
+--gradient_accumulation_steps 128 \
+--max_seq_length 512 \
+--overwrite_cache \
+--pooler_type avg \
+--pad_to_max_length \
+--temp 0.05 \
+--fp16 <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
 [More Information Needed]
 
@@ -155,12 +148,6 @@ Use the code below to get started with the model.
 
 
 
-## Model Examination [optional]
-
-<!-- Relevant interpretability work for the model goes here -->
-
-[More Information Needed]
-
 ## Environmental Impact
 
 <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
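
Note on the training command added under "Training Hyperparameters": the flags `--per_device_train_batch_size 4` and `--gradient_accumulation_steps 128` together determine how many sequences feed each optimizer step. The sketch below works out that number, assuming the script follows the usual gradient-accumulation semantics (gradients summed over micro-batches before one optimizer step). The helper name `effective_batch_size` and the `num_devices=1` default are hypothetical; the diff does not state how many devices were used.

```python
# Values copied from the training command shown in the diff.
PER_DEVICE_TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 128


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Return the number of sequences that contribute to one optimizer step."""
    if min(per_device, accumulation_steps, num_devices) < 1:
        raise ValueError("all arguments must be positive integers")
    return per_device * accumulation_steps * num_devices


# num_devices=1 is an assumption, not something the diff records.
single_device = effective_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS)
print(single_device)  # 512
```

Under the same assumption, accumulation affects only the gradient that is applied. Any loss computed within a micro-batch would still see just the 4 sequences of that micro-batch.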