
Vatolin Alexey

vatolinalex

AI & ML interests

None yet

Recent Activity

upvoted a paper 16 days ago

Training Sparse Mixture Of Experts Text Embedding Models

liked a model 16 days ago

EuroBERT/EuroBERT-210m

reacted to tomaarsen's post with ❤️ 16 days ago

An assembly of 18 European companies, labs, and universities have banded together to launch 🇪🇺 EuroBERT! It's a state-of-the-art multilingual encoder for 15 European languages, designed to be finetuned for retrieval, classification, etc.

🇪🇺 15 Languages: English, French, German, Spanish, Chinese, Italian, Russian, Polish, Portuguese, Japanese, Vietnamese, Dutch, Arabic, Turkish, Hindi

3️⃣ 3 model sizes: 210M, 610M, and 2.1B parameters - very very useful sizes in my opinion

➡️ Sequence length of 8192 tokens! Nice to see these higher sequence lengths for encoders becoming more common.

⚙️ Architecture based on Llama, but with bi-directional (non-causal) attention to turn it into an encoder. Flash Attention 2 is supported.

🔥 A new Pareto frontier (stronger *and* smaller) for multilingual encoder models

📊 Evaluated against mDeBERTa, mGTE, XLM-RoBERTa for Retrieval, Classification, and Regression (after finetuning for each task separately): EuroBERT punches way above its weight.

📝 Detailed paper with all details, incl. data: FineWeb for English and CulturaX for multilingual data, The Stack v2 and Proof-Pile-2 for code.

Check out the release blogpost here: https://huggingface.co/blog/EuroBERT/release

* https://huggingface.co/EuroBERT/EuroBERT-210m
* https://huggingface.co/EuroBERT/EuroBERT-610m
* https://huggingface.co/EuroBERT/EuroBERT-2.1B

The next step is for researchers to build upon the 3 EuroBERT base models and publish strong retrieval, zero-shot classification, etc. models for all to use. I'm very much looking forward to it!


Organizations

vatolinalex's activity

New activity in Alibaba-NLP/gte-Qwen2-1.5B-instruct 3 months ago

Fixed some minor bugs in eval_mteb.py

#27 opened 3 months ago by vatolinalex

New activity in Alibaba-NLP/gte-Qwen1.5-7B-instruct 3 months ago

Fixed some minor bugs in eval_mteb.py

#21 opened 3 months ago by vatolinalex

New activity in Alibaba-NLP/gte-Qwen2-1.5B-instruct 3 months ago

Fixed some minor bugs in eval_mteb.py

#26 opened 3 months ago by vatolinalex

New activity in nvidia/NV-Embed-v2 3 months ago

Discrepancy in Model Outputs Between Transformers and Sentence Transformers

#29 opened 3 months ago by vatolinalex

New activity in Vikhrmodels/habr_qa_sbs 3 months ago

Fix dataset reading error

#1 opened 3 months ago by vatolinalex
