hezarai/word2vec-cbow-fa-wikipedia

	---
	language:
	- fa
	library_name: hezar
	tags:
	- feature-extraction
	- hezar
	pipeline_tag: feature-extraction
	---
	This is a Persian word2vec embedding model trained with the CBOW algorithm on Wikipedia data.

	To use this model in Hezar, install the package and run the following code:
	```bash
	pip install hezar
	```
	```python
	from hezar.embeddings import Embedding

	w2v = Embedding.load("hezarai/word2vec-cbow-fa-wikipedia")
	# Get embedding vector
	vector = w2v("هزار")
	# Find the word that doesn't match with the rest
	doesnt_match = w2v.doesnt_match(["خانه", "اتاق", "ماشین"])
	# Find the top-n most similar words to the given word
	most_similar = w2v.most_similar("هزار", top_n=5)
	# Find the cosine similarity value between two words
	similarity = w2v.similarity("مهندس", "دکتر")
	```
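
For readers unfamiliar with the cosine similarity mentioned above, here is a minimal sketch of how such a score and a most-similar ranking can be computed. It is not Hezar's implementation: `cosine_similarity`, `rank_by_similarity`, and the 3-dimensional `toy_vectors` are hypothetical and exist only for illustration. Real word2vec vectors have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(word, vectors, top_n=1):
    """Return the top_n (word, score) pairs most similar to `word`."""
    scores = [
        (other, cosine_similarity(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical toy vectors, NOT taken from the actual model
toy_vectors = {
    "خانه": [0.9, 0.1, 0.0],   # house
    "اتاق": [0.8, 0.2, 0.1],   # room
    "ماشین": [0.1, 0.9, 0.3],  # car
}

top_word, top_score = rank_by_similarity("خانه", toy_vectors)[0]
```

With these toy values, the vector for "اتاق" points in nearly the same direction as the one for "خانه", so it ranks first. The score depends only on the angle between vectors, not on their lengths.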