jgrosjean committed
Commit dbc65e7
Parent(s): 3e2d610

Update README.md

Files changed (1): README.md +42 −23

README.md CHANGED
@@ -6,7 +6,7 @@

 <!-- Provide a quick summary of what the model is/does. -->

- The [SwissBERT](https://huggingface.co/ZurichNLP/swissbert) model finetuned via [SimCSE](http://dx.doi.org/10.18653/v1/2021.emnlp-main.552) (Gao et al., EMNLP 2021) for sentence embeddings, using ~1 million Swiss news articles published in 2022 from [Swissdox@LiRI](https://t.uzh.ch/1hI). Following [Sentence Transformers](https://huggingface.co/sentence-transformers)) example (Reimers and Gurevych,
 2019), the average of the last hidden states (pooler_type=avg) is used as sentence representation.

 The fine-tuning script can be accessed [here](Link).
@@ -19,43 +19,62 @@ The fine-tuning script can be accessed [here](Link).

 <!-- Provide a longer summary of what this model is. -->

- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]

- ### Model Sources [optional]

- <!-- Provide the basic links for the model. -->

- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]

- ## Uses

- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

- ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

- [More Information Needed]

- ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

 [More Information Needed]

- ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

 [More Information Needed]

@@ -63,7 +82,7 @@ The fine-tuning script can be accessed [here](Link).

 <!-- This section is meant to convey both technical and sociotechnical limitations. -->

- [More Information Needed]

 ### Recommendations
 
 <!-- Provide a quick summary of what the model is/does. -->

+ The [SwissBERT](https://huggingface.co/ZurichNLP/swissbert) model finetuned via [SimCSE](http://dx.doi.org/10.18653/v1/2021.emnlp-main.552) (Gao et al., EMNLP 2021) for sentence embeddings, using ~1 million Swiss news articles published in 2022 from [Swissdox@LiRI](https://t.uzh.ch/1hI). Following the [Sentence Transformers](https://huggingface.co/sentence-transformers) approach (Reimers and Gurevych,
 2019), the average of the last hidden states (pooler_type=avg) is used as sentence representation.

 The fine-tuning script can be accessed [here](Link).
 
 <!-- Provide a longer summary of what this model is. -->

+ - **Developed by:** [Juri Grosjean](https://huggingface.co/jgrosjean)
+ - **Model type:** [XMOD](https://huggingface.co/facebook/xmod-base)
+ - **Language(s) (NLP):** de_CH, fr_CH, it_CH, rm_CH
+ - **License:** [More Information Needed]
+ - **Finetuned from model:** [SwissBERT](https://huggingface.co/ZurichNLP/swissbert)

+ ## Use

+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
+ ```python
+ import torch
+ from transformers import AutoModel, AutoTokenizer
+ ```

+ ### German example
+ ```python
+ def generate_sentence_embedding(sentence, model_name="jgrosjean-mathesis/swissbert-for-sentence-embeddings"):
+     # Load the swissBERT model and tokenizer, and activate the German adapter
+     model = AutoModel.from_pretrained(model_name)
+     tokenizer = AutoTokenizer.from_pretrained(model_name)
+     model.set_default_language("de_CH")
+
+     # Tokenize input sentence
+     inputs = tokenizer(sentence, padding=True, truncation=True, return_tensors="pt", max_length=512)
+
+     # Set the model to evaluation mode
+     model.eval()
+
+     # Pass the tokenized input through the model
+     with torch.no_grad():
+         outputs = model(**inputs)
+
+     # Average the last hidden states to obtain the sentence embedding
+     embedding = outputs.last_hidden_state.mean(dim=1)
+
+     return embedding
+
+ sentence_embedding = generate_sentence_embedding("Wir feiern am 1. August den Schweizer Nationalfeiertag.")
+ print(sentence_embedding)
+ ```
+ Output:
+ ```
+ tensor([[ 5.6306e-02, -2.8375e-01, -4.1495e-02,  7.4393e-02, -3.1552e-01,
+           1.5213e-01, -1.0258e-01,  2.2790e-01, -3.5968e-02,  3.1769e-01,
+           1.9354e-01,  1.9748e-02, -1.5236e-01, -2.2657e-01,  1.3345e-02,
+          ...]])
+ ```
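A note on the pooling step above: `.mean(dim=1)` averages over every token position, padding included. That is harmless for a single sentence, but it skews embeddings when sentences are batched with padding. A mask-aware variant (a sketch under that assumption, not part of the original fine-tuning script) could look like:

```python
import torch

def mean_pool(last_hidden_state, attention_mask):
    # Zero out padded positions, then average over the real tokens only
    mask = attention_mask.unsqueeze(-1).float()      # (batch, seq_len, 1)
    summed = (last_hidden_state * mask).sum(dim=1)   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)         # (batch, 1)
    return summed / counts

# The padded position (last token) must not affect the pooled embedding
hidden = torch.randn(1, 4, 8)
mask = torch.tensor([[1, 1, 1, 0]])
pooled = mean_pool(hidden, mask)
print(pooled.shape)  # torch.Size([1, 8])
```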
 [More Information Needed]

+ ### Downstream Use [optional]

+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

 [More Information Needed]
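One plausible downstream use is semantic search over precomputed sentence embeddings. The sketch below uses dummy 2-d vectors purely for illustration (the helper and data are not from this card); in practice the vectors would come from `generate_sentence_embedding` above.

```python
import torch
import torch.nn.functional as F

def most_similar(query_emb, corpus_embs):
    # Rank candidate embeddings by cosine similarity to the query
    sims = F.cosine_similarity(query_emb, corpus_embs, dim=1)
    return int(sims.argmax())

# Dummy embeddings standing in for model output
query = torch.tensor([[1.0, 0.0]])
corpus = torch.tensor([[0.9, 0.1],
                       [0.0, 1.0],
                       [-1.0, 0.0]])
print(most_similar(query, corpus))  # 0
```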
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->

+ This multilingual model has not been fine-tuned for cross-lingual transfer. It is intended for computing sentence embeddings that can be compared monolingually.
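Monolingual comparison in practice means scoring pairs of embeddings, typically with cosine similarity. A minimal sketch with illustrative vectors (the helper name and values are assumptions, not from the card):

```python
import torch
import torch.nn.functional as F

def cosine_score(emb_a, emb_b):
    # Cosine similarity between two (1, hidden) sentence embeddings
    return F.cosine_similarity(emb_a, emb_b, dim=1).item()

emb_a = torch.tensor([[1.0, 0.0, 1.0]])
emb_b = torch.tensor([[1.0, 0.0, 1.0]])
emb_c = torch.tensor([[0.0, 1.0, 0.0]])

print(cosine_score(emb_a, emb_b))  # 1.0 (same direction)
print(cosine_score(emb_a, emb_c))  # 0.0 (orthogonal)
```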

 ### Recommendations