Update usage with infinity #36
by michaelfeil · opened
nomic-embed-text-v1.5 is currently one of the most deployed models on infinity. I would therefore like to add a note to the README on how to use it!
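Once an infinity server is running with this model (launched via infinity's CLI or its Docker image, pointing at `nomic-ai/nomic-embed-text-v1.5`), querying it could look roughly like the sketch below. This is a minimal client-side sketch, not the exact README snippet: it assumes infinity's default port 7997 and its OpenAI-compatible `/embeddings` route, and it uses the `search_query:` / `search_document:` task prefixes that nomic-embed-text-v1.5 expects. Adjust host, port, and inputs to your deployment.

```python
# Minimal client sketch (assumes an infinity server on localhost:7997
# serving nomic-ai/nomic-embed-text-v1.5 via the OpenAI-compatible API).
import requests

resp = requests.post(
    "http://localhost:7997/embeddings",  # assumed default infinity port and route
    json={
        "model": "nomic-ai/nomic-embed-text-v1.5",
        # nomic-embed-text-v1.5 expects a task prefix on every input
        "input": [
            "search_document: Infinity is a high-throughput embedding server.",
            "search_query: how do I serve nomic-embed-text-v1.5?",
        ],
    },
    timeout=30,
)
resp.raise_for_status()
embeddings = [item["embedding"] for item in resp.json()["data"]]
print(len(embeddings), "embeddings of dim", len(embeddings[0]))
```

For reference, here is the startup log from serving this model with infinity on a CUDA device: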
```
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO 2024-11-13 00:41:05,291 infinity_emb INFO: Creating 1 engines: engines=['nomic-ai/nomic-embed-text-v1.5']  infinity_server.py:89
INFO 2024-11-13 00:41:05,295 infinity_emb INFO: Anonymized telemetry can be disabled via environment variable `DO_NOT_TRACK=1`.  telemetry.py:30
INFO 2024-11-13 00:41:05,303 infinity_emb INFO: model=`nomic-ai/nomic-embed-text-v1.5` selected, using engine=`torch` and device=`cuda`  select_model.py:64
INFO 2024-11-13 00:41:05,489 sentence_transformers.SentenceTransformer INFO: Load pretrained SentenceTransformer: nomic-ai/nomic-embed-text-v1.5  SentenceTransformer.py:216
A new version of the following files was downloaded from https://huggingface.co/nomic-ai/nomic-bert-2048:
- configuration_hf_nomic_bert.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
A new version of the following files was downloaded from https://huggingface.co/nomic-ai/nomic-bert-2048:
- modeling_hf_nomic_bert.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
```
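On the "pin a revision" hint in the notice above: since this model relies on remote code from nomic-ai/nomic-bert-2048, both the model repo and the code repo can be pinned to fixed commits when loading with transformers directly. A rough sketch follows; the `<commit-sha>` values are placeholders for commits you have reviewed, and `code_revision` availability depends on your transformers version:

```python
# Sketch: pin both the model weights and the trust_remote_code files so that
# future pushes to either repo cannot silently change what gets loaded.
from transformers import AutoModel, AutoTokenizer

model_id = "nomic-ai/nomic-embed-text-v1.5"
tokenizer = AutoTokenizer.from_pretrained(model_id, revision="<commit-sha>")
model = AutoModel.from_pretrained(
    model_id,
    trust_remote_code=True,        # needed for the custom nomic-bert modeling code
    revision="<commit-sha>",       # pins nomic-ai/nomic-embed-text-v1.5 itself
    code_revision="<commit-sha>",  # pins the remote code pulled from nomic-bert-2048
)
```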
```
WARNING 2024-11-13 00:41:28,547 transformers_modules.nomic-ai.nomic-bert-2048.c1b1fd7a715b8eb2e232d34593154ac782c98ac9.modeling_hf_nomic_bert WARNING: <All keys matched successfully>  modeling_hf_nomic_bert.py:443
INFO 2024-11-13 00:41:31,438 infinity_emb INFO: Getting timings for batch_size=8 and avg tokens per sentence=1  select_model.py:97
        0.93 ms tokenization
        7.62 ms inference
        0.09 ms post-processing
        8.64 ms total
        embeddings/sec: 926.20
INFO 2024-11-13 00:41:31,504 infinity_emb INFO: Getting timings for batch_size=8 and avg tokens per sentence=512  select_model.py:103
        6.18 ms tokenization
        14.68 ms inference
        0.11 ms post-processing
        20.97 ms total
        embeddings/sec: 381.43
```
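The `embeddings/sec` figures appear to be simply `batch_size / total time per batch`; a quick check against the two benchmark entries above:

```python
# Sanity-check the logged throughput: embeddings/sec = batch_size / total seconds.
batch_size = 8
for total_ms, logged in [(8.64, 926.20), (20.97, 381.43)]:
    print(f"{batch_size / (total_ms / 1000):.2f} embeddings/sec (logged: {logged})")
# -> 925.93 and 381.50, matching the logged values up to rounding of the totals.
```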
thanks for this!
zpn changed pull request status to merged