---
language: en
tags:
- word2vec
- glove
- embeddings
- pretrained
- NLP
library_name: gensim
---
GloVe Word2Vec Embeddings (glove.6B pretrained)
This repository contains the pretrained GloVe embeddings from the glove.6B release (trained on 6B tokens) in 50, 100, 200, and 300 dimensions, together with the 100-dimensional vectors converted to Word2Vec format using gensim. The converted vectors can be loaded directly with gensim and used for downstream NLP tasks such as classification, clustering, and semantic similarity.
Included Files
| Filename | Description |
|---|---|
| glove.6B.50d.txt | 50-dimensional GloVe vectors |
| glove.6B.100d.txt | 100-dimensional GloVe vectors |
| glove.6B.100d.word2vec.txt | 100-dimensional vectors converted to Word2Vec text format |
| glove.6B.200d.txt | 200-dimensional GloVe vectors |
| glove.6B.300d.txt | 300-dimensional GloVe vectors |
| glove_word2vec_model.pkl | Gensim KeyedVectors object for the 100d model (pickled) |
Usage
Load .txt files with Gensim
from gensim.models import KeyedVectors
# Load Word2Vec format (text)
model = KeyedVectors.load_word2vec_format("glove.6B.100d.word2vec.txt", binary=False)
Load .pkl file (fastest)
from gensim.models import KeyedVectors
model = KeyedVectors.load("glove_word2vec_model.pkl")
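For a quick sanity check of semantic similarity without gensim, the raw glove.6B.*.txt files can also be read directly, since each line holds a token followed by its space-separated vector components. The following is a minimal sketch under that assumption; `read_glove` and `cosine_similarity` are hypothetical helper names, not part of gensim or of this repository.

```python
import math


def read_glove(path, limit=None):
    """Read a raw GloVe text file into a dict mapping token -> list of floats.

    `limit` optionally caps the number of rows read, which helps with large files.
    """
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for index, line in enumerate(handle):
            if limit is not None and index >= limit:
                break
            parts = line.rstrip().split(" ")
            # First field is the token, the rest are the vector components.
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

With the gensim model loaded as above, `model.similarity("king", "queen")` and `model.most_similar("king", topn=5)` give the same kind of comparison and are much faster on the full vocabulary.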
Source
Original GloVe embeddings: https://nlp.stanford.edu/projects/glove/
Converted using gensim.scripts.glove2word2vec
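The two text formats differ only in that the Word2Vec text format starts with a header line giving the vocabulary size and vector dimensionality, which the raw GloVe files lack. The sketch below illustrates that conversion using only the standard library; it is not the gensim script itself, and `glove_to_word2vec` is a hypothetical function name chosen for this example.

```python
from pathlib import Path


def glove_to_word2vec(glove_path, output_path):
    """Copy a raw GloVe text file, prepending a "<vocab_size> <dim>" header line.

    Returns the (vocab_size, dim) pair that was written to the header.
    """
    glove_path, output_path = Path(glove_path), Path(output_path)

    # First pass: count rows and infer the dimensionality from the first row.
    vocab_size = 0
    dim = None
    with glove_path.open(encoding="utf-8") as src:
        for line in src:
            if dim is None:
                dim = len(line.rstrip().split(" ")) - 1
            vocab_size += 1
    if dim is None:
        raise ValueError(f"{glove_path} is empty")

    # Second pass: write the header, then copy the vectors unchanged.
    with glove_path.open(encoding="utf-8") as src, \
            output_path.open("w", encoding="utf-8") as dst:
        dst.write(f"{vocab_size} {dim}\n")
        for line in src:
            dst.write(line)
    return vocab_size, dim
```

In gensim 4.0 and later, `KeyedVectors.load_word2vec_format(path, binary=False, no_header=True)` reads raw GloVe files directly, so the other dimensionalities in this repository can be loaded without a separate conversion step.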
License
MIT License. The original GloVe vectors are distributed by Stanford NLP under the Public Domain Dedication and License (PDDL) v1.0.