Yury Kuratov

yurakuratov

Recent Activity

updated a dataset 2 days ago

RMT-team/babilong_evals

published a dataset 6 days ago

RMT-team/babilong_evals

yurakuratov's activity

upvoted a paper about 16 hours ago

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Paper • 2504.06261 • Published 1 day ago • 76

upvoted a paper 16 days ago

A Comprehensive Survey on Long Context Language Modeling

Paper • 2503.17407 • Published 21 days ago • 49

upvoted a collection 18 days ago

Gemma 3 Release

17 items • Updated 7 days ago • 320

upvoted a paper 28 days ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published 29 days ago • 68

upvoted 2 papers about 1 month ago

MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections

Paper • 2502.12170 • Published Feb 13 • 12

LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers

Paper • 2502.15007 • Published Feb 20 • 172

upvoted 2 papers about 2 months ago

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Paper • 2502.14502 • Published Feb 20 • 89

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity

Paper • 2502.13063 • Published Feb 18 • 69

upvoted a paper 3 months ago

SRMT: Shared Memory for Multi-agent Lifelong Pathfinding

Paper • 2501.13200 • Published Jan 22 • 68

upvoted a collection 3 months ago

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 588

upvoted a paper 4 months ago

Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation

Paper • 2412.06531 • Published Dec 9, 2024 • 72

upvoted a collection 4 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Feb 26 • 585

upvoted 2 papers 7 months ago

MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale

Paper • 2409.00134 • Published Aug 29, 2024 • 2

Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing

Paper • 2409.01322 • Published Sep 2, 2024 • 97

upvoted a collection 8 months ago

DNA language models

9 items • Updated Apr 17, 2024 • 7

upvoted 4 papers 9 months ago

POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation

Paper • 2407.14931 • Published Jul 20, 2024 • 22

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Paper • 2407.04620 • Published Jul 5, 2024 • 32

Associative Recurrent Memory Transformer

Paper • 2407.04841 • Published Jul 5, 2024 • 37

AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents

Paper • 2407.04363 • Published Jul 5, 2024 • 32

upvoted a paper 10 months ago

Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task

Paper • 2406.14213 • Published Jun 20, 2024 • 21