Sigurdur committed on
Commit a8d1e15
1 Parent(s): c682154

Update README.md

Files changed (1)
  1. README.md +12 -1
README.md CHANGED
@@ -8,11 +8,22 @@ tags:
  ---

- # {MODEL_NAME}
+ # Icelandic SBERT for Sentence Embedding

  This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.

  <!--- Describe your model here -->
+ ## Data
+
+ From CLARIN-IS: [unannotated news2 from IGC (RMH)](https://repository.clarin.is/repository/xmlui/handle/20.500.12537/238)
+
+ I figured the most modern and common sentences would appear in the news, so I chose this dataset. For more sophisticated language, the books dataset would be a better choice.
+
+ To download the data, run the following command:
+
+ ```bash
+ curl --remote-name-all https://repository.clarin.is/repository/xmlui/bitstream/handle/20.500.12537/238{/IGC-News2-22.10.TEI.zip}
+ ```

  ## Usage (Sentence-Transformers)