matlok's Collections
Papers - Attention - Cross
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers
Paper
•
2403.12943
•
Published
•
14
Masked Audio Generation using a Single Non-Autoregressive Transformer
Paper
•
2401.04577
•
Published
•
42
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models
Paper
•
2404.02747
•
Published
•
11
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
Paper
•
2404.02733
•
Published
•
20
Prompt-to-Prompt Image Editing with Cross Attention Control
Paper
•
2208.01626
•
Published
•
2
Paper
•
2404.07821
•
Published
•
11
HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising
Paper
•
2404.09697
•
Published
•
1
TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
Paper
•
2404.09204
•
Published
•
10
Long-form music generation with latent diffusion
Paper
•
2404.10301
•
Published
•
24
GLIGEN: Open-Set Grounded Text-to-Image Generation
Paper
•
2301.07093
•
Published
•
3
MultiBooth: Towards Generating All Your Concepts in an Image from Text
Paper
•
2404.14239
•
Published
•
9
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
Paper
•
2404.15420
•
Published
•
7
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation
Paper
•
2404.19427
•
Published
•
71
Unveiling Encoder-Free Vision-Language Models
Paper
•
2406.11832
•
Published
•
50
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Paper
•
2410.23168
•
Published
•
24
HAT: Hybrid Attention Transformer for Image Restoration
Paper
•
2309.05239
•
Published
•
1
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper
•
2412.09871
•
Published
•
85