Bangla FastText Model

This is a FastText pre-trained model for the Bengali language.

This model is built for the bnlp package.

Datasets

Wikipedia dump datasets

Training Details

FastText was trained with total words = 20M, vocab size = 1,171,011, epochs = 50, and embedding dimension = 300.
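The two hyperparameters stated above that map onto training options (epochs and embedding dimension) can be passed directly to the `fasttext` Python package. The following is a minimal sketch, not the author's training script: the helper name `train_from_card` and its path arguments are hypothetical, and every setting the card does not state (model type, learning rate, minimum word count) is left at the library default.

```python
# Hyperparameters stated in this model card; everything else stays at
# the fasttext defaults because the card does not specify it.
CARD_PARAMS = {"dim": 300, "epoch": 50}


def train_from_card(corpus_path, output_path):
    """Train unsupervised fastText embeddings with the card's settings.

    corpus_path and output_path are hypothetical file names supplied
    by the caller (a plain-text corpus and a .bin output file).
    """
    # Imported lazily so this module loads even without fasttext installed.
    import fasttext

    model = fasttext.train_unsupervised(input=corpus_path, **CARD_PARAMS)
    model.save_model(output_path)
    return model
```

A call such as `train_from_card("bn_wiki.txt", "bn_fasttext_300d.bin")` would then write a binary model; both file names here are placeholders.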

Evaluation Details

training loss = 0.318668

Usage

pip install -U bnlp_toolkit
pip install fasttext==0.9.2

Generate Vector Using Pretrained Model

from bnlp.embedding.fasttext import BengaliFasttext

bft = BengaliFasttext()
word = "গ্রাম"  # Bengali for "village"
model_path = "bengali_fasttext_wiki.bin"  # path to the downloaded pretrained model
word_vector = bft.generate_word_vector(model_path, word)
print(word_vector.shape)  # vector dimensionality
print(word_vector)
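
A common use of the returned vectors is comparing two words. The helper below is a small sketch of cosine similarity in plain Python. It assumes only that each vector is a sequence of numbers, such as the array printed above; `cosine_similarity` is an illustrative name, not part of bnlp.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(float(a) * float(b) for a, b in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(float(a) ** 2 for a in vec_a))
    norm_b = math.sqrt(sum(float(b) ** 2 for b in vec_b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector has similarity 0.
        return 0.0
    return dot / (norm_a * norm_b)
```

Two vectors obtained from `bft.generate_word_vector(model_path, ...)` can be passed straight to this function; values closer to 1 indicate more similar embeddings.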

Train Bengali FastText Model

from bnlp.embedding.fasttext import BengaliFasttext

bft = BengaliFasttext()
data = "raw_text.txt"  # plain-text training corpus
model_name = "saved_model.bin"  # output path for the trained model
epoch = 50  # number of training epochs
bft.train(data, model_name, epoch)

Generate Vector File from Fasttext Binary Model

from bnlp.embedding.fasttext import BengaliFasttext

bft = BengaliFasttext()

model_path = "mymodel.bin"  # existing FastText binary model
out_vector_name = "myvector.txt"  # text vector file to write
bft.bin2vec(model_path, out_vector_name)
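
The exported text file can be read without fasttext installed. This sketch assumes the file follows the standard fastText text-vector layout: a header line with vocabulary size and dimension, then one line per word with the word followed by its values. That the bnlp export uses this layout is an assumption, and `load_vectors` is an illustrative helper, not a bnlp function.

```python
import io


def load_vectors(lines):
    """Parse fastText-style text vectors from an iterable of lines.

    Returns (dimension, {word: [float, ...]}).
    """
    iterator = iter(lines)
    header = next(iterator).split()
    dim = int(header[1])  # header is "<vocab_size> <dimension>"
    vectors = {}
    for line in iterator:
        parts = line.rstrip().split(" ")
        if len(parts) < dim + 1:
            continue  # skip blank or malformed lines
        # Take the last `dim` fields as values so a token that itself
        # contains spaces is still kept intact as the word.
        word = " ".join(parts[:-dim])
        vectors[word] = [float(x) for x in parts[-dim:]]
    return dim, vectors


# Tiny in-memory sample standing in for a real exported file.
sample = io.StringIO("2 3\nগ্রাম 0.1 0.2 0.3\nশহর 0.4 0.5 0.6\n")
dim, vectors = load_vectors(sample)
print(dim, len(vectors))
```

For a real file, open it with `open("myvector.txt", encoding="utf-8")` and pass the file object to `load_vectors`.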
