Ic Pro

icpro

AI & ML interests

None yet

Recent Activity

liked a model 19 days ago
Qwen/Qwen2.5-VL-7B-Instruct
liked a model 20 days ago
ds4sd/SmolDocling-256M-preview
liked a model 7 months ago
black-forest-labs/FLUX.1-dev

Organizations

Waidev, Edumalin

icpro's activity

New activity in mistralai/Mistral-7B-Instruct-v0.2 11 months ago

Mistral-7b pre-trained on French

#123 opened 11 months ago by
icpro
reacted to victor's post with 🔥 11 months ago
The hype is real: a mysterious gpt2-chatbot model has appeared on the LLM Arena Leaderboard 👀.
It seems to be at least on par with the top performing models (closed and open).

To try it out, go to https://chat.lmsys.org/, then click the Direct Chat tab and select gpt2-chatbot.

Take your bet, what do you think it is?
reacted to tomaarsen's post with 🔥 12 months ago
🚀 Sentence Transformers v2.7.0 is out! Featuring a new loss function, easier Matryoshka model inference & evaluation, CrossEncoder improvements & Intel Gaudi2 Accelerator support. Details:

1️⃣ A new loss function: CachedGISTEmbedLoss
This loss function is a combination of CachedMultipleNegativesRankingLoss and the GISTEmbedLoss, both of which are already excellent. The caching mechanism allows for much higher batch sizes with constant memory usage, which boosts training performance. The GIST part introduces a guide model to guide the in-batch negative sample selection. This prevents false negatives, resulting in a stronger training signal.
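The guided negative selection can be illustrated with a small pure-Python sketch. This is not the actual CachedGISTEmbedLoss implementation (which lives in sentence_transformers.losses); the function name and the threshold rule shown here are an illustrative assumption of the GIST idea: an in-batch candidate is discarded as a likely false negative when the guide model rates it at least as similar to the anchor as the true positive.

```python
# Illustrative sketch (not the real CachedGISTEmbedLoss) of GIST-style
# guided in-batch negative filtering.
def select_negatives(guide_sims, positive_sims):
    """guide_sims[i][j]: guide-model similarity between anchor i and candidate j.
    positive_sims[i]: guide-model similarity between anchor i and its positive.
    Returns mask[i][j], True when candidate j is kept as a negative for anchor i."""
    mask = []
    for i, row in enumerate(guide_sims):
        # Drop candidates the guide model rates as similar as the true positive:
        # they are probably false negatives, not useful training signal.
        mask.append([sim < positive_sims[i] for sim in row])
    return mask

# Toy example: candidate 1 looks more similar to anchor 0 than its own
# positive does, so it is masked out as a likely false negative.
guide_sims = [
    [0.2, 0.9, 0.1],
    [0.3, 0.2, 0.4],
]
positive_sims = [0.8, 0.7]
print(select_negatives(guide_sims, positive_sims))
# -> [[True, False, True], [True, True, True]]
```

In the real loss this filtering happens on the fly inside the batch, combined with the gradient caching that allows the large batch sizes mentioned above.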

2️⃣ Automatic Matryoshka model truncation
Matryoshka models produce embeddings that are still useful after truncation. However, this truncation always had to be done manually, until now! We've added a truncate_dim option to the Sentence Transformer constructor. This also allows truncation when using HuggingFaceEmbeddings from LlamaIndex or LangChain.
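Conceptually, truncating a Matryoshka embedding just keeps the first k dimensions and L2-renormalizes the result; a minimal pure-Python sketch of that operation (the helper name is hypothetical, and in the library itself you would simply pass truncate_dim to the SentenceTransformer constructor):

```python
import math

def truncate_embedding(embedding, truncate_dim):
    """Keep the first truncate_dim values and L2-renormalize,
    mimicking what truncate_dim does to a Matryoshka embedding."""
    head = embedding[:truncate_dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

emb = [3.0, 4.0, 1.0, -2.0]  # toy 4-dimensional embedding
print(truncate_embedding(emb, 2))
# -> [0.6, 0.8]
```

The renormalization step keeps cosine similarities on truncated vectors comparable, which is why the truncated embeddings remain directly usable for retrieval and clustering.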

3️⃣ Additionally, you can now specify truncate_dim in evaluators to get the performance after truncation. (Hint: it's surprisingly good, even for models not trained with MatryoshkaLoss, and it can speed up e.g. clustering, retrieval, etc.)

4️⃣ CrossEncoder improvements
The CrossEncoder now supports push_to_hub to upload trained reranker models to the Hugging Face Hub. Additionally, CrossEncoders now support trust_remote_code to load models with custom modelling code.

5️⃣ Inference on Intel Gaudi2
If you have an Intel Gaudi2 Accelerator, Sentence Transformers now uses it automatically for even faster inference. No changes to your code are necessary; the device is detected automatically!

Check out the release notes for all of the details: https://github.com/UKPLab/sentence-transformers/releases/tag/v2.7.0

I'm very excited for the upcoming releases: I'm making great progress with a notable v3 refactor that should heavily improve the training process for embedding models!