---
license: mit
---
# Bengali Word2Vec Model
This is a pre-trained word2vec model for the Bengali language.
It was built for the [bnlp](https://github.com/sagorbrur/bnlp) package.
## Datasets
- [Wikipedia dump datasets](https://dumps.wikimedia.org/bnwiki/latest/)
## Training details
- Word2Vec settings: embedding dimension = 100, `min_count` = 5, `window` = 5, `epochs` = 10
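To make the `min_count` and `window` settings concrete, here is a minimal plain-Python sketch of what they control during preprocessing. It is not the training code used for this model: the helper names `build_vocab` and `context_pairs` and the toy corpus are hypothetical, and it uses a fixed window, whereas real word2vec implementations may vary the effective window size.

```py
from collections import Counter

def build_vocab(sentences, min_count=5):
    """Keep only words seen at least `min_count` times across the corpus."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, n in counts.items() if n >= min_count}

def context_pairs(sentence, vocab, window=5):
    """Yield (center, context) pairs at most `window` positions apart."""
    # Words outside the vocabulary are dropped before windows are taken.
    kept = [w for w in sentence if w in vocab]
    for i, center in enumerate(kept):
        lo = max(0, i - window)
        hi = min(len(kept), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, kept[j]

# Toy corpus (hypothetical): one sentence repeated five times, plus a rare word.
sentences = [["গ্রাম", "নদী", "মাঠ"]] * 5 + [["শহর", "গ্রাম"]]

vocab = build_vocab(sentences, min_count=5)
pairs = list(context_pairs(sentences[0], vocab, window=1))

print(sorted(vocab))  # "শহর" occurs once, so it is filtered out
print(pairs)
```

With `min_count=5`, a word must appear at least five times to get a vector, which is why rare words may be missing from the pretrained model.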
## Usage
- Install the package: `pip install -U bnlp_toolkit`
- Generate a word vector using the pretrained model
```py
from bnlp import BengaliWord2Vec

bwv = BengaliWord2Vec()
# Path to the pretrained model file
model_path = "bengali_word2vec.model"
word = 'গ্রাম'
# Look up the embedding vector for the word
vector = bwv.generate_word_vector(model_path, word)
print(vector.shape)
print(vector)
```
- Find the most similar words using the pretrained model
```py
from bnlp import BengaliWord2Vec

bwv = BengaliWord2Vec()
# Path to the pretrained model file
model_path = "bengali_word2vec.model"
word = 'গ্রাম'
# Retrieve the 10 words closest to the query word
similar = bwv.most_similar(model_path, word, topn=10)
print(similar)
```
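For intuition about what a "most similar" query computes, the sketch below ranks words by cosine similarity between their vectors, which is the usual measure for word2vec embeddings. It is an illustration under that assumption, not the bnlp implementation: the `cosine` and `most_similar` helpers and the 3-dimensional toy vectors are hypothetical.

```py
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(vectors, word, topn=10):
    """Return up to `topn` (word, score) pairs, highest similarity first."""
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word  # the query word itself is excluded
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:topn]

# Toy 3-dimensional vectors (hypothetical); the real model uses 100 dimensions.
toy_vectors = {
    "গ্রাম": [1.0, 0.0, 0.0],
    "শহর": [0.8, 0.6, 0.0],
    "নদী": [0.0, 1.0, 0.0],
    "আকাশ": [0.0, 0.0, 1.0],
}

result = most_similar(toy_vectors, "গ্রাম", topn=2)
print(result)
```

Scores close to 1 mean the two vectors point in nearly the same direction, and scores near 0 mean they are unrelated in the embedding space.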