---
tags:
- glove
- gensim
- fse
---
# Word2Vec
Pre-trained word and phrase vectors trained on part of the Google News dataset (about 100 billion words). The model contains 300-dimensional vectors for 3 million words and phrases. The phrases were obtained using a simple data-driven approach described in "Distributed Representations of Words and Phrases and their Compositionality".
Read more:
* https://code.google.com/archive/p/word2vec/
* https://arxiv.org/abs/1301.3781
* https://arxiv.org/abs/1310.4546
* https://www.microsoft.com/en-us/research/publication/linguistic-regularities-in-continuous-space-word-representations/?from=http%3A%2F%2Fresearch.microsoft.com%2Fpubs%2F189726%2Frvecs.pdf
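## Usage

A minimal sketch of loading the vectors with gensim, assuming the `word2vec-google-news-300` identifier from gensim-data (alternatively, the raw `GoogleNews-vectors-negative300.bin` file can be loaded with `KeyedVectors.load_word2vec_format`):

```python
# Sketch: load the pre-trained Google News vectors via the gensim downloader.
# Assumes the gensim-data identifier "word2vec-google-news-300"; the download
# is several GB, so this may take a while on first use.
import gensim.downloader as api

wv = api.load("word2vec-google-news-300")  # returns KeyedVectors (300-dim, ~3M entries)

# Vector lookup and a simple similarity query.
print(wv["computer"].shape)                 # (300,)
print(wv.most_similar("computer", topn=3))  # nearest neighbours by cosine similarity
```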