arxyzan committed
Commit 8a8a7a3 · 1 Parent(s): 196a8b3

Create README.md

Files changed (1)
  1. README.md +24 -0
README.md ADDED
@@ -0,0 +1,24 @@
+ ---
+ language:
+ - fa
+ pipeline_tag: feature-extraction
+ ---
+ This is a Persian word2vec embedding model trained with the CBOW (continuous bag-of-words) algorithm on Wikipedia data.
+
+ To use this model in Hezar, install the package and load the embedding as follows:
+ ```bash
+ pip install hezar
+ ```
+ ```python
+ from hezar import Embedding
+
+ w2v = Embedding.load("hezarai/word2vec-cbow-fa-wikipedia")
+ # Get the embedding vector for a word ("هزار" means "thousand")
+ vector = w2v("هزار")
+ # Find the word that doesn't match the rest (inputs: "house", "room", "car")
+ doesnt_match = w2v.doesnt_match(["خانه", "اتاق", "ماشین"])
+ # Find the top-n words most similar to the given word
+ most_similar = w2v.most_similar("هزار", top_n=5)
+ # Compute the cosine similarity between two words ("engineer", "doctor")
+ similarity = w2v.similarity("مهندس", "دکتر")
+ ```
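
For readers unfamiliar with what the three query methods in the README compute, here is a minimal, dependency-free sketch of the usual word2vec-style operations. It is not Hezar's implementation: the helper names (`cosine_similarity`, `most_similar`, `doesnt_match`) and the `TOY_VECTORS` table are hypothetical and invented for illustration, and the odd-one-out rule (lowest similarity to the mean of the normalized vectors) is a common convention that is only assumed to match what the library does.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only. The real model's
# vectors are learned from Wikipedia text and have many more dimensions.
TOY_VECTORS = {
    "house": [0.9, 0.1, 0.0],
    "room": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def most_similar(word, vectors, top_n=5):
    """Rank every other word by cosine similarity to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def doesnt_match(words, vectors):
    """Return the word least similar to the mean of the normalized vectors."""
    unit = [normalize(vectors[w]) for w in words]
    mean = [sum(column) / len(unit) for column in zip(*unit)]
    scores = {w: cosine_similarity(u, mean) for w, u in zip(words, unit)}
    return min(words, key=lambda w: scores[w])


if __name__ == "__main__":
    print(cosine_similarity(TOY_VECTORS["house"], TOY_VECTORS["room"]))
    print(most_similar("house", TOY_VECTORS, top_n=2))
    print(doesnt_match(["house", "room", "car"], TOY_VECTORS))
```

With these toy vectors, "house" and "room" point in nearly the same direction, so "car" is the odd one out. The loaded model answers the same kind of query over its learned Persian vocabulary.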