Dmitry Balobin

d0rj

AI & ML interests

NLP and 🥴 tensors. MIPT 💙, 2GIS 💚

Recent Activity

liked a model 1 day ago

deepvk/USER2-small

liked a model 1 day ago

deepvk/USER2-base

liked a model 4 days ago

nyuuzyou/SmolLM2-135M-Eagle

Organizations

None yet

d0rj's activity

upvoted a collection 4 days ago

blt

4 items • Updated 5 days ago • 16

upvoted a collection 10 days ago

SANA-Sprint

🏃SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation • 6 items • Updated 5 days ago • 35

upvoted a collection 22 days ago

Sana

⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated 5 days ago • 90

upvoted a collection 25 days ago

DRAMA

A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages. • 3 items • Updated Feb 26 • 6

upvoted a collection 29 days ago

SuperBPE

SuperBPE tokenizers and models trained with them • 8 items • Updated 12 days ago • 14

upvoted 3 collections about 1 month ago

Reasoning Dataset

7 items • Updated Mar 7 • 3

Datasets [RU]

SFT / RL high-quality datasets • 9 items • Updated 4 days ago • 2

Gemma 3 Release

24 items • Updated 4 days ago • 341

upvoted a paper about 1 month ago

Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders

Paper • 2503.03601 • Published Mar 5 • 230

upvoted a paper 2 months ago

SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators

Paper • 2502.06394 • Published Feb 10 • 90

upvoted a collection 3 months ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 21 items • Updated 7 days ago • 128

upvoted a paper 3 months ago

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11, 2024 • 94

upvoted a collection 3 months ago

Ru Dialogue Benchmarks

A collection of benchmarks for evaluating the quality of dialogue models in Russian. • 3 items • Updated Jan 15 • 2

upvoted 2 papers 4 months ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 101

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 39

upvoted a collection 4 months ago

T-pro-1.0

5 items • Updated Jan 15 • 6

upvoted a collection 6 months ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated Feb 20 • 253

upvoted a paper 7 months ago

PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation

Paper • 2409.06820 • Published Sep 10, 2024 • 69

upvoted 2 collections 7 months ago

SAGE v1.1.0 release

4 items • Updated 27 days ago • 5

WebInstruct 🌐 Embeddings 🧱 Models

A collection of SoTA embedding models fine-tuned on the WebInstruct dataset to learn to pair instructions with their responses • 3 items • Updated Sep 4, 2024 • 11