---
language:
- fa
library_name: hezar
tags:
- feature-extraction
- hezar
pipeline_tag: feature-extraction
---
This is a Persian word2vec embedding model trained with the CBOW algorithm on the CoNLL17 dataset.
To use this model with Hezar, install the library and load the embedding as follows:
```bash
pip install hezar
```
```python
from hezar.embeddings import Embedding
w2v = Embedding.load("hezarai/word2vec-cbow-fa-conll17")
# Get embedding vector
vector = w2v("هزار")
# Find the word that doesn't match with the rest
doesnt_match = w2v.doesnt_match(["خانه", "اتاق", "ماشین"])
# Find the top-n most similar words to the given word
most_similar = w2v.most_similar("هزار", top_n=5)
# Find the cosine similarity value between two words
similarity = w2v.similarity("مهندس", "دکتر")
```
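
For readers unfamiliar with what `similarity` and `doesnt_match` compute conceptually, here is a minimal sketch in plain Python. It is not Hezar's implementation: the helper names `cosine_similarity` and `odd_one_out` are hypothetical, the 3-dimensional vectors are invented for illustration, and picking the word least similar to the mean vector is just one common way to find an outlier.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def odd_one_out(vectors):
    """Return the word whose vector is least similar to the mean vector."""
    dim = len(next(iter(vectors.values())))
    mean = [
        sum(vec[i] for vec in vectors.values()) / len(vectors)
        for i in range(dim)
    ]
    return min(vectors, key=lambda word: cosine_similarity(vectors[word], mean))


# Toy vectors, made up for this example (real embeddings have far more dimensions)
toy = {
    "house": [0.9, 0.1, 0.0],
    "room": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
}

print(cosine_similarity(toy["house"], toy["room"]))
print(odd_one_out(toy))
```

With real embeddings, the same idea applies to the vectors returned by `w2v(...)`, although the library may normalize or aggregate them differently.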