How to load the model

#1
by baby666 - opened

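Downloading the model while the code is running tends to fail, so I downloaded the model to my local machine first.
However, I cannot load it from the local copy.
Here is the code I tried: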
model = KeyedVectors.load_word2vec_format("zhwiki_20180420_100d.txt")
model = KeyedVectors.load_word2vec_format("zhwiki_20180420_100d.txt", binary=True)
EOFError: unexpected end of input; is count incorrect or file otherwise damaged?

model = Word2Vec.load("zhwiki_20180420_100d.txt")

I also read through the source code of KeyedVectors.load_word2vec_format carefully and found nothing that explains how to load a local model.

Addendum: hf_hub_download also just returns the path of the cached file as a string. So why does loading fail when I pass in the path of my local copy?

Word2vec org

Hi,
It's expected that the line model = KeyedVectors.load_word2vec_format("zhwiki_20180420_100d.txt", binary=True) doesn't work. The binary=True argument can only be used for .bin files, whereas here it is a .txt file.
Gensim is not currently supported by HF, which is why I'm downloading the file from the Hub rather than loading it from a local path.
I should have a discussion with the HF teams about this later in the week (cc @osanseviero for visibility).
So I hope to have some news to share with you on this point in the coming days.
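To illustrate the point about binary=True, here is a minimal sketch that picks the flag from the file extension before calling Gensim. The helper names is_binary_word2vec and load_local_vectors are hypothetical, and the sketch assumes that a .bin file is in the word2vec binary format and that anything else is in the word2vec text format.

```python
from pathlib import Path


def is_binary_word2vec(path):
    """Guess the format from the extension (an assumption, not a real check):
    .bin is treated as binary, everything else as text."""
    return Path(path).suffix.lower() == ".bin"


def load_local_vectors(path):
    """Load word vectors from a local file with a matching binary flag."""
    # Imported lazily so the helper above works without gensim installed.
    from gensim.models import KeyedVectors

    return KeyedVectors.load_word2vec_format(
        str(path), binary=is_binary_word2vec(path)
    )
```

With this, load_local_vectors("zhwiki_20180420_100d.txt") would call load_word2vec_format with binary=False, which avoids the EOFError caused by reading a text file as binary.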

Thank you very much for the answer. I hope you and your team can make local model loading possible soon.

Some news.

The code provided in the model card works fine.
hf_hub_download(repo_id="Word2vec/wikipedia2vec_zhwiki_20180420_100d", filename="zhwiki_20180420_100d.txt") downloads the model from the Hub and then caches it (see .cache\huggingface\hub).
When the code is run again, the file is no longer downloaded from the Hub, as hf_hub_download fetches the previously downloaded file from the local cache.
Note, however, that this step takes a long time to complete (3 min on my side). I observe this long loading time for .txt files (.bin files load almost instantaneously).
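The flow described above can be sketched as follows. The function name load_from_hub is hypothetical; the repo_id and filename are the ones quoted in this thread. The first call needs network access, and later calls should reuse the cached file.

```python
def load_from_hub(
    repo_id="Word2vec/wikipedia2vec_zhwiki_20180420_100d",
    filename="zhwiki_20180420_100d.txt",
):
    """Download (or reuse from cache) the vectors file, then load it with Gensim."""
    # Imported lazily; both packages must be installed to call this function.
    from gensim.models import KeyedVectors
    from huggingface_hub import hf_hub_download

    # hf_hub_download returns the local path of the cached file as a string.
    local_path = hf_hub_download(repo_id=repo_id, filename=filename)

    # The file is a .txt file, so binary is left at its default (False).
    return KeyedVectors.load_word2vec_format(local_path)
```

Because hf_hub_download only returns a path, passing the path of a manually downloaded copy of the same .txt file to load_word2vec_format (without binary=True) should behave the same way.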

In your case, it may be better to work with the .bin files (available at https://wikipedia2vec.github.io/wikipedia2vec/pretrained/) and load them with Gensim's load_word2vec_format() function (not the Word2Vec.load that you used in your first message), pointing it at the right path. In the worst case, you can use the https://github.com/wikipedia2vec/wikipedia2vec library.
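As an alternative that is not proposed in the thread itself, the slow .txt load could be paid only once by re-saving the vectors in the word2vec binary format with Gensim's save_word2vec_format. This is a sketch under assumptions: the helper names and the rule of writing the .bin file next to the .txt file are hypothetical, and the target directory must be writable.

```python
from pathlib import Path


def binary_cache_path(txt_path):
    """Hypothetical naming rule: same location, .bin extension."""
    return Path(txt_path).with_suffix(".bin")


def load_with_binary_cache(txt_path):
    """Load from a .bin copy if it exists; otherwise load the .txt and create it."""
    # Imported lazily so binary_cache_path works without gensim installed.
    from gensim.models import KeyedVectors

    bin_path = binary_cache_path(txt_path)
    if bin_path.exists():
        # Fast path: the binary copy was written on an earlier run.
        return KeyedVectors.load_word2vec_format(str(bin_path), binary=True)

    # Slow path: parse the text file once, then write a binary copy.
    vectors = KeyedVectors.load_word2vec_format(str(txt_path), binary=False)
    vectors.save_word2vec_format(str(bin_path), binary=True)
    return vectors
```

Word2Vec.load is not used here because it expects a full model written by Gensim's own save() method, not a file in the word2vec text or binary format.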
