xmxx's Collections
Daily papers worth reading in detail later
Paper • 2402.13144 • Published • 95
Genie: Generative Interactive Environments
Paper • 2402.15391 • Published • 70
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper • 2402.17177 • Published • 88
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
Paper • 2403.00522 • Published • 44
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Paper • 2403.03206 • Published • 60
Stealing Part of a Production Language Model
Paper • 2403.06634 • Published • 90
Gemma: Open Models Based on Gemini Research and Technology
Paper • 2403.08295 • Published • 47
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation
Paper • 2403.12015 • Published • 64
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper • 2404.02258 • Published • 104
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Paper • 2404.07143 • Published • 104
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper • 2404.14219 • Published • 253
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
Paper • 2404.13208 • Published • 38
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Paper • 2404.16710 • Published • 75
What matters when building vision-language models?
Paper • 2405.02246 • Published • 101
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published • 66
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 126
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Paper • 2405.12981 • Published • 28
To Believe or Not to Believe Your LLM
Paper • 2406.02543 • Published • 32
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Paper • 2406.04325 • Published • 72
Long Context Transfer from Language to Vision
Paper • 2406.16852 • Published • 32
LongIns: A Challenging Long-context Instruction-based Exam for LLMs
Paper • 2406.17588 • Published • 22
PaliGemma: A versatile 3B VLM for transfer
Paper • 2407.07726 • Published • 68