The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models Paper • 2311.05928 • Published Nov 10, 2023 • 1
Memory-Efficient Backpropagation through Large Linear Layers Paper • 2201.13195 • Published Jan 31, 2022 • 1
Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction Paper • 2202.00441 • Published Feb 1, 2022 • 1
General Covariance Data Augmentation for Neural PDE Solvers Paper • 2301.12730 • Published Jan 30, 2023
CLEAR: Character Unlearning in Textual and Visual Modalities Paper • 2410.18057 • Published Oct 23, 2024 • 209
ConDiff: A Challenging Dataset for Neural Solvers of Partial Differential Equations Paper • 2406.04709 • Published Jun 7, 2024
MaxInfo: A Training-Free Key-Frame Selection Method Using Maximum Volume for Enhanced Video Understanding Paper • 2502.03183 • Published Feb 5 • 1
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published 13 days ago • 112
Combining Flow Matching and Transformers for Efficient Solution of Bayesian Inverse Problems Paper • 2503.01375 • Published Mar 3 • 5
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 170
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity Paper • 2502.13063 • Published Feb 18 • 69
Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation Paper • 2412.06531 • Published Dec 9, 2024 • 72
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation Paper • 2410.05363 • Published Oct 7, 2024 • 45
Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing Paper • 2409.01322 • Published Sep 2, 2024 • 96