---
tags:
- spacy
- floret
- fasttext
- feature-extraction
- token-classification
language:
- hu
license: cc-by-sa-4.0
model-index:
- name: hu_vectors_web_md
  results:
  - task:
      name: Analogical questions
      type: token-classification
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.1010
    - name: MRR
      type: mrr
      value: 0.1772
---

Hungarian word vectors for HuSpaCy. The model is trained on the Hungarian Webcorpus 2.0 using floret with the following hyperparameters:

`floret cbow -dim 100 -mode floret -bucket 200000 -minn 4 -maxn 6 -minCount 100 -neg 10 -hashCount 2 -lr 0.1 -thread 30 -epoch 5`

Vectors are published in fasttext and floret format.

| Feature | Description |
| --- | --- |
| **Name** | `hu_vectors_web_lg` |
| **Version** | `1.0` |
| **Vectors** | 200000 keys (300 dimensions) |
| **Sources** | [Hungarian Webcorpus 2.0](https://hlt.bme.hu/en/resources/webcorpus2) (Dávid Márk Nemeskey (SZTAKI-HLT)) |
| **License** | `cc-by-sa-4.0` |
| **Author** | [SzegedAI, MILAB](https://github.com/huspacy/huspacy) |

### Accuracy

| Type | Score |
| --- | --- |
| `ACC` | 10.10 |
| `MRR` | 0.1772 |
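
The card reports `ACC` and `MRR` on analogical questions but does not describe the evaluation procedure. As a rough illustration only, the sketch below assumes the usual definitions: accuracy is the fraction of questions whose top-ranked candidate is the expected word, and MRR is the mean of the reciprocal rank of the expected word (0 when it is missing from the candidate list). The function names and the toy data are hypothetical and are not taken from the HuSpaCy evaluation code.

```python
from typing import List, Sequence, Tuple


def reciprocal_rank(ranked: Sequence[str], expected: str) -> float:
    """Return 1/rank (1-based) of `expected` in `ranked`, or 0.0 if absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == expected:
            return 1.0 / position
    return 0.0


def evaluate(predictions: List[Sequence[str]], gold: List[str]) -> Tuple[float, float]:
    """Compute (accuracy, MRR) over ranked candidate lists.

    predictions[i] is the ranked candidate list for question i,
    gold[i] is the expected answer for question i.
    """
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and of equal length")
    # Accuracy counts only questions where the first candidate is correct.
    hits = sum(
        1 for ranked, expected in zip(predictions, gold)
        if ranked and ranked[0] == expected
    )
    # MRR also rewards correct answers that appear lower in the ranking.
    rr_sum = sum(reciprocal_rank(r, e) for r, e in zip(predictions, gold))
    return hits / len(gold), rr_sum / len(gold)


# Toy, made-up analogy results (not real model output).
gold = ["királynő", "Párizs", "nők"]
predictions = [
    ["királynő", "herceg"],  # correct at rank 1
    ["London", "Párizs"],    # correct at rank 2
    ["férfi", "lány"],       # expected word not returned
]

acc, mrr = evaluate(predictions, gold)
print(f"ACC={acc:.4f} MRR={mrr:.4f}")
```

Under these assumed definitions MRR can never be lower than accuracy, since every top-1 hit contributes a reciprocal rank of 1.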