Ic Pro

icpro

AI & ML interests

None yet

Recent Activity

liked a model 19 days ago
Qwen/Qwen2.5-VL-7B-Instruct
liked a model 20 days ago
ds4sd/SmolDocling-256M-preview
liked a model 7 months ago
black-forest-labs/FLUX.1-dev

Organizations

Waidev, Edumalin

icpro's activity

New activity in mistralai/Mistral-7B-Instruct-v0.2 11 months ago

Mistral-7b pre-trained on French

#123 opened 11 months ago by
icpro
reacted to victor's post with 🔥 11 months ago
The hype is real: a mysterious gpt2-chatbot model has appeared on the LLM Arena Leaderboard 👀.
It seems to be at least on par with the top performing models (closed and open).

To try it out, go to https://chat.lmsys.org/, then click the Direct Chat tab and select gpt2-chatbot.

Take your bet, what do you think it is?
reacted to tomaarsen's post with 🔥 12 months ago
🚀 Sentence Transformers v2.7.0 is out! Featuring a new loss function, easier Matryoshka model inference & evaluation, CrossEncoder improvements & Intel Gaudi2 Accelerator support. Details:

1️⃣ A new loss function: CachedGISTEmbedLoss
This loss function is a combination of CachedMultipleNegativesRankingLoss and the GISTEmbedLoss, both of which are already excellent. The caching mechanism allows for much higher batch sizes with constant memory usage, which boosts training performance. The GIST part introduces a guide model to guide the in-batch negative sample selection. This prevents false negatives, resulting in a stronger training signal.
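The guided negative selection can be illustrated with a small pure-Python sketch. This is not the actual CachedGISTEmbedLoss implementation (which lives in sentence_transformers.losses); the function name and the threshold rule shown here are an illustrative assumption of the GIST idea: an in-batch candidate is discarded as a likely false negative when the guide model rates it at least as similar to the anchor as the true positive.

```python
# Illustrative sketch (not the real CachedGISTEmbedLoss) of GIST-style
# guided in-batch negative filtering.
def select_negatives(guide_sims, positive_sims):
    """guide_sims[i][j]: guide-model similarity between anchor i and candidate j.
    positive_sims[i]: guide-model similarity between anchor i and its positive.
    Returns mask[i][j], True when candidate j is kept as a negative for anchor i."""
    mask = []
    for i, row in enumerate(guide_sims):
        # Drop candidates the guide model rates as similar as the true positive:
        # they are probably false negatives, not useful training signal.
        mask.append([sim < positive_sims[i] for sim in row])
    return mask

# Toy example: candidate 1 looks more similar to anchor 0 than its own
# positive does, so it is masked out as a likely false negative.
guide_sims = [
    [0.2, 0.9, 0.1],
    [0.3, 0.2, 0.4],
]
positive_sims = [0.8, 0.7]
print(select_negatives(guide_sims, positive_sims))
# -> [[True, False, True], [True, True, True]]
```

In the real loss this filtering happens on the fly inside the batch, combined with the gradient caching that allows the large batch sizes mentioned above.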

2️⃣ Automatic Matryoshka model truncation
Matryoshka models produce embeddings that are still useful after truncation. However, this truncation always had to be done manually, until now! We've added a truncate_dim option to the Sentence Transformer constructor. This also allows truncation when using HuggingFaceEmbeddings from LlamaIndex or LangChain.
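Conceptually, truncating a Matryoshka embedding just keeps the first k dimensions and L2-renormalizes the result; a minimal pure-Python sketch of that operation (the helper name is hypothetical, and in the library itself you would simply pass truncate_dim to the SentenceTransformer constructor):

```python
import math

def truncate_embedding(embedding, truncate_dim):
    """Keep the first truncate_dim values and L2-renormalize,
    mimicking what truncate_dim does to a Matryoshka embedding."""
    head = embedding[:truncate_dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

emb = [3.0, 4.0, 1.0, -2.0]  # toy 4-dimensional embedding
print(truncate_embedding(emb, 2))
# -> [0.6, 0.8]
```

The renormalization step keeps cosine similarities on truncated vectors comparable, which is why the truncated embeddings remain directly usable for retrieval and clustering.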

3️⃣ Additionally, you can now specify truncate_dim in evaluators to get the performance after truncation. (Hint: it's surprisingly good, even for models not trained with MatryoshkaLoss, and it can speed up e.g. clustering, retrieval, etc.)

4️⃣ CrossEncoder improvements
The CrossEncoder now supports push_to_hub to upload trained reranker models to the Hugging Face Hub. Additionally, CrossEncoders now support trust_remote_code to load models with custom modelling code.

5️⃣ Inference on Intel Gaudi2
If you have an Intel Gaudi2 Accelerator, Sentence Transformers now uses it automatically for even faster inference. No changes to your code are necessary; the device is detected automatically!

Check out the release notes for all of the details: https://github.com/UKPLab/sentence-transformers/releases/tag/v2.7.0

I'm very excited for the upcoming releases: I'm making great progress with a notable v3 refactor that should heavily improve the training process for embedding models!