- MM-VID: Advancing Video Understanding with GPT-4V(ision)
  Paper • 2310.19773 • Published • 18
- Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models
  Paper • 2310.05863 • Published • 1
- Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
  Paper • 2311.06242 • Published • 24
- I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
  Paper • 2311.10126 • Published • 7

Collections including paper arxiv:2310.19773

- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 39
- De-Diffusion Makes Text a Strong Cross-Modal Interface
  Paper • 2311.00618 • Published • 21
- MM-VID: Advancing Video Understanding with GPT-4V(ision)
  Paper • 2310.19773 • Published • 18
- SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
  Paper • 2310.15308 • Published • 22

- SILC: Improving Vision Language Pretraining with Self-Distillation
  Paper • 2310.13355 • Published • 5
- Woodpecker: Hallucination Correction for Multimodal Large Language Models
  Paper • 2310.16045 • Published • 13
- BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
  Paper • 2201.12086 • Published • 2
- ImageNetVC: Zero-Shot Visual Commonsense Evaluation on 1000 ImageNet Categories
  Paper • 2305.15028 • Published • 1

- Sparse Autoencoders Find Highly Interpretable Features in Language Models
  Paper • 2309.08600 • Published • 11
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 79
- Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
  Paper • 2309.08968 • Published • 22
- VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning
  Paper • 2309.15091 • Published • 31

- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
  Paper • 2309.04662 • Published • 21
- Neurons in Large Language Models: Dead, N-gram, Positional
  Paper • 2309.04827 • Published • 16
- Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
  Paper • 2309.05516 • Published • 8
- DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
  Paper • 2309.03907 • Published • 6