pyvene: A Library for Understanding and Improving PyTorch Models via Interventions Paper • 2403.07809 • Published Mar 12 • 1
🔍 Daily Picks in Interpretability & Analysis of LMs Collection Outstanding research in interpretability and evaluation of language models, summarized • 39 items • Updated 11 days ago • 53
LLM-AD: Large Language Model based Audio Description System Paper • 2405.00983 • Published 11 days ago • 13
Customizing Text-to-Image Models with a Single Image Pair Paper • 2405.01536 • Published 11 days ago • 15
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Paper • 2405.01481 • Published 11 days ago • 20
FLAME: Factuality-Aware Alignment for Large Language Models Paper • 2405.01525 • Published 11 days ago • 19
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published 11 days ago • 85
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published 14 days ago • 99
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Paper • 2405.01434 • Published 11 days ago • 41
STT: Stateful Tracking with Transformers for Autonomous Driving Paper • 2405.00236 • Published 12 days ago • 6
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published 12 days ago • 18
Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge Paper • 2405.00263 • Published 12 days ago • 11
Paint by Inpaint: Learning to Add Image Objects by Removing Them First Paper • 2404.18212 • Published 15 days ago • 19
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound Paper • 2405.00233 • Published 12 days ago • 12
Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3 Paper • 2405.00664 • Published 12 days ago • 15
Spectrally Pruned Gaussian Fields with Neural Compensation Paper • 2405.00676 • Published 12 days ago • 8
A Careful Examination of Large Language Model Performance on Grade School Arithmetic Paper • 2405.00332 • Published 12 days ago • 23
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10 • 3
MicroDreamer: Zero-shot 3D Generation in ~20 Seconds by Score-based Iterative Reconstruction Paper • 2404.19525 • Published 13 days ago • 8
MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model Paper • 2404.19759 • Published 13 days ago • 21
Lightplane: Highly-Scalable Components for Neural 3D Fields Paper • 2404.19760 • Published 13 days ago • 4
Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting Paper • 2404.19758 • Published 13 days ago • 9
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation Paper • 2404.19752 • Published 13 days ago • 17
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published 13 days ago • 59
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting Paper • 2404.19702 • Published 13 days ago • 15
DOCCI: Descriptions of Connected and Contrasting Images Paper • 2404.19753 • Published 13 days ago • 8
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published 13 days ago • 62
Stylus: Automatic Adapter Selection for Diffusion Models Paper • 2404.18928 • Published 14 days ago • 13
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models Paper • 2404.17672 • Published 17 days ago • 17
DressCode: Autoregressively Sewing and Generating Garments from Text Guidance Paper • 2401.16465 • Published Jan 29 • 7
Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting Paper • 2404.18911 • Published 14 days ago • 25
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published 14 days ago • 62
Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations Paper • 2404.17521 • Published 17 days ago • 12
🦢SWIM-IR Dataset Collection 29 million Synthetic Wikipedia-based Multilingual Retrieval Training Pairs. • 4 items • Updated 15 days ago • 6
HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections Paper • 2404.16845 • Published Feb 14 • 5
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs Paper • 2404.16873 • Published 21 days ago • 25
MaPa: Text-driven Photorealistic Material Painting for 3D Shapes Paper • 2404.17569 • Published 17 days ago • 10
PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning Paper • 2404.16994 • Published 18 days ago • 30
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs Paper • 2404.16375 • Published 18 days ago • 14
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings Paper • 2404.16820 • Published 18 days ago • 15
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published 18 days ago • 54
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Paper • 2404.16771 • Published 18 days ago • 16
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Paper • 2404.16821 • Published 18 days ago • 48
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension Paper • 2404.16790 • Published 18 days ago • 7