BayanDuygu committed
Commit afad42c
1 Parent(s): 767b933

Update README.md

Files changed (1)
  1. README.md +40 -0
README.md CHANGED
@@ -1,3 +1,43 @@
  ---
+ tags:
+ - spacy
+ - floret
+ - fasttext
+ - feature-extraction
+ - token-classification
+ language:
+ - tr
  license: cc-by-sa-4.0
+ model-index:
+ - name: tr_vectors_web_lg
+   results:
+   - task:
+       name: NMT
+       type: token-classification
+     metrics:
+     - name: Accuracy
+       type: accuracy
+       value: 0.1112
+
  ---
+ Large Turkish Floret word vectors for spaCy.
+
+ The vectors were trained on the MC4 corpus using Floret with the following hyperparameters:
+
+ ```
+ floret cbow -dim 300 -mode floret -bucket 200000 -minn 4 -maxn 5 -minCount 100 \
+   -neg 10 -hashCount 2 -thread 12 -epoch 5
+ ```
+
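To make the subword hyperparameters above more concrete, here is a minimal Python sketch of a floret-style lookup under `-minn 4 -maxn 5 -hashCount 2 -bucket 200000`: the word is wrapped in `<` and `>` boundary markers, split into character n-grams of length 4 to 5 (plus the whole marked word), and each piece is hashed twice into one of 200000 rows. The helper names `char_ngrams` and `row_indices` are hypothetical, and `hashlib.blake2b` is only a stand-in hash chosen for illustration; floret uses its own hash function, so the row indices produced here will not match the published table.

```python
import hashlib

MINN, MAXN = 4, 5      # -minn 4 -maxn 5
HASH_COUNT = 2         # -hashCount 2
BUCKET = 200_000       # -bucket 200000


def char_ngrams(word, minn=MINN, maxn=MAXN):
    """Return the boundary-marked word followed by its character n-grams."""
    marked = f"<{word}>"
    grams = [marked]
    for n in range(minn, maxn + 1):
        for start in range(len(marked) - n + 1):
            grams.append(marked[start:start + n])
    return grams


def row_indices(gram, hash_count=HASH_COUNT, bucket=BUCKET):
    """Map one n-gram to `hash_count` table rows.

    Assumption: blake2b with a per-seed salt stands in for floret's real hash.
    """
    rows = []
    for seed in range(hash_count):
        digest = hashlib.blake2b(
            gram.encode("utf-8"), digest_size=8, salt=bytes([seed])
        ).digest()
        rows.append(int.from_bytes(digest, "little") % bucket)
    return rows


grams = char_ngrams("kedi")
print(grams)
# ['<kedi>', '<ked', 'kedi', 'edi>', '<kedi', 'kedi>']
for gram in grams:
    print(gram, row_indices(gram))
```

In this scheme every word, including one never seen during training, maps to some set of rows, which is why a fixed-size table can cover an open vocabulary. The final word vector would then be built by combining the rows selected for all of its n-grams.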
+ The vectors are published in Floret format.
+
+ | Feature | Description |
+ | --- | --- |
+ | **Name** | `tr_vectors_web_lg` |
+ | **Version** | `1.0` |
+ | **Vectors** | 200000 keys (300 dimensions) |
+ | **Sources** | [MC4](https://arxiv.org/abs/1910.10683) |
+ | **License** | `cc-by-sa-4.0` |
+ | **Author** | [Duygu Altinok](https://www.onlyduygu.com/) |
+
+
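As a rough size check on the "200000 keys (300 dimensions)" entry in the table, the sketch below estimates the in-memory footprint of the vector table. It assumes each value is stored as a 32-bit float, which is an assumption made here and not something the README states; the function name `table_bytes` is hypothetical.

```python
ROWS = 200_000          # number of rows, from -bucket 200000
DIM = 300               # vector width, from -dim 300
BYTES_PER_FLOAT32 = 4   # assumption: values are stored as float32


def table_bytes(rows=ROWS, dim=DIM, itemsize=BYTES_PER_FLOAT32):
    """Return the raw size in bytes of a dense rows x dim table."""
    return rows * dim * itemsize


size = table_bytes()
print(f"{size / 1e6:.0f} MB ({size / 2**20:.1f} MiB)")
# 240 MB (228.9 MiB)
```

Because the table size is fixed by the bucket count, this estimate does not grow with the number of distinct words in the training corpus.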