Vlad Iliescu

vladi

AI & ML interests

None yet

Recent Activity

liked a model 27 days ago
thenlper/gte-small
liked a model 27 days ago
mixedbread-ai/mxbai-embed-large-v1
liked a model 27 days ago
thenlper/gte-large

Organizations

None yet

vladi's activity

reacted to MoritzLaurer's post with 👍 about 1 year ago
Prompts are hyperparameters. Every time you test a different prompt on your data, you become less sure whether the LLM actually generalizes to unseen data.

Issues of overfitting to a test set can seem like relics from the boring times when people still fine-tuned models, but they are just as important for "zero-shot prompting". Using a separate validation split to tune the main hyperparameter of LLMs (the prompt) is just as important as train-val-test splitting for fine-tuning. The only difference is that there is no training dataset anymore, and it somehow feels different because there is no training and no parameter updates.

It's easy to trick yourself into believing that an LLM performs well on your task when you've actually overfit the prompt to your data. Every good "zero-shot" paper should clarify that it used a validation split to find the prompt before final testing.
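
A minimal sketch of that workflow in Python, assuming a toy labeled dataset and a placeholder `call_llm` function (both hypothetical, not from the post): candidate prompts compete on the validation split only, and the winning prompt is scored exactly once on the held-out test split.

```python
import random

# Toy labeled data (hypothetical); in practice this is your real eval set.
data = [("great movie", "positive"), ("terrible plot", "negative")] * 50
random.seed(42)
random.shuffle(data)

# Split once. Prompts are tuned on val_data; test_data is scored exactly once.
split = len(data) // 2
val_data, test_data = data[:split], data[split:]

candidate_prompts = [
    "Classify this review as positive or negative: {text}",
    "Is the following review positive or negative?\nReview: {text}\nAnswer:",
]

def call_llm(prompt: str) -> str:
    # Placeholder for a real LLM call (OpenAI, HF Inference, etc.);
    # returns a dummy label so the sketch runs end to end.
    return "positive"

def accuracy(prompt_template: str, dataset) -> float:
    # Format each example into the prompt and compare the LLM's answer
    # to the gold label.
    correct = sum(
        call_llm(prompt_template.format(text=text)).strip().lower() == label
        for text, label in dataset
    )
    return correct / len(dataset)

# Tune the "hyperparameter" (the prompt) on the validation split only.
best_prompt = max(candidate_prompts, key=lambda p: accuracy(p, val_data))

# Report the final number from a single pass over the held-out test split.
print(f"test accuracy: {accuracy(best_prompt, test_data):.2%}")
```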