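---
datasets:
- armvectores/hy_wikipedia_2023
pipeline_tag: feature-extraction
language:
- hy
library_name: fasttext
---

fastText word embeddings for Armenian (hy).

Training corpus: 414M tokens

1) 73M tokens from Armenian Wikipedia
2) 341M tokens from the ARLIS database

Training setup

- Vocabulary: 74951 unique words
- Character n-grams: 3 to 5
- Context window length: 5
- Embedding dimension: 300
- Model: skipgram
- Minimum word count: 150
- Epochs: 100
- Starting learning rate: 0.05
- Training time: 26 hours on 20 Xeon Gold cores

How to use

1) Install fastText

```
pip install fasttext-wheel
```

2) Load the model in Python and query it

```
import fasttext

# Path to the downloaded binary model file
model = fasttext.load_model('output.bin')

# Nearest neighbors of 'զենքեր' (Armenian for "weapons")
model.get_nearest_neighbors('զենքեր')
```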
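For readers who want to reproduce a similar model, the training settings listed above can be written as keyword arguments for `fasttext.train_unsupervised`. This is a minimal sketch, not the authors' actual training script: the mapping from each setting to an argument name is an assumption (in particular, "minimum number of words 150" is read as `minCount=150` and "20 cores" as `thread=20`), and the corpus path passed to `train` is hypothetical.

```python
# Settings from this model card expressed as fastText keyword arguments.
# The mapping of each setting to an argument name is an assumption.
TRAIN_PARAMS = {
    "model": "skipgram",
    "dim": 300,       # embedding dimension
    "ws": 5,          # context window length
    "minn": 3,        # shortest character n-gram
    "maxn": 5,        # longest character n-gram
    "minCount": 150,  # assumed meaning of "minimum number of words 150"
    "epoch": 100,
    "lr": 0.05,       # starting learning rate
    "thread": 20,     # assumed: one thread per CPU core
}


def train(corpus_path, output_path="output.bin"):
    """Train on a plain-text corpus file and save the binary model."""
    # Imported here so the settings can be inspected without fastText installed.
    import fasttext

    model = fasttext.train_unsupervised(input=corpus_path, **TRAIN_PARAMS)
    model.save_model(output_path)
    return model
```

Calling `train("corpus.txt")` with a whitespace-tokenized text file (a hypothetical path) would write `output.bin`, which can then be loaded with `fasttext.load_model` as in the usage steps.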