Theodore Speak

theospeak

AI & ML interests

None yet

Recent Activity

liked a Space 15 days ago
webml-community/llama-3.2-webgpu
liked a Space 20 days ago
VIDraft/mouse1
liked a model 20 days ago
rhymes-ai/Allegro-TI2V

Organizations

None yet

theospeak's activity

reacted to merve's post with 🔥 about 1 month ago
Microsoft released LLM2CLIP: a CLIP model with a longer context window for complex text inputs 🤯
All models are available under the Apache 2.0 license here: microsoft/llm2clip-672323a266173cfa40b32d4c

TL;DR: they replaced CLIP's text encoder with various LLMs fine-tuned on captioning, which gives better top-k accuracy on retrieval.
This will enable better image-text retrieval, better zero-shot image classification, and better vision language models 🔥
Read the paper to learn more: LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation (2411.04997)
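
For readers unfamiliar with the "top-k accuracy on retrieval" metric mentioned above, here is a minimal sketch of how it is commonly computed for text-to-image retrieval. It is not taken from the LLM2CLIP paper or code: the function names and the toy 2-dimensional embeddings are hypothetical, and it assumes that the text at index i is paired with the image at index i and that ranking uses cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_retrieval_accuracy(text_embeddings, image_embeddings, k):
    """Fraction of text queries whose paired image ranks among the k most similar.

    Assumption: text i is paired with image i (hypothetical toy setup).
    """
    hits = 0
    for i, text_vec in enumerate(text_embeddings):
        scores = [cosine_similarity(text_vec, img) for img in image_embeddings]
        # Image indices sorted from most to least similar to this text query.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(text_embeddings)


# Toy embeddings, made up for illustration only (not real model outputs).
text_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_embeddings = [[0.9, 0.1], [0.8, 0.3], [0.1, 1.0]]

for k in (1, 2, 3):
    print(k, top_k_retrieval_accuracy(text_embeddings, image_embeddings, k))
```

In a real evaluation the embeddings would come from the model's text and image encoders, and a stronger text encoder should place the paired image higher in the ranking, which raises accuracy at small k.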