Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 26 days ago • 372
LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation Paper • 2411.04997 • Published Nov 7, 2024 • 39
ReNoise: Real Image Inversion Through Iterative Noising Paper • 2403.14602 • Published Mar 21, 2024 • 21
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference Paper • 2403.14520 • Published Mar 21, 2024 • 35
FlashTex: Fast Relightable Mesh Texturing with LightControlNet Paper • 2402.13251 • Published Feb 20, 2024 • 15
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models Paper • 2402.13064 • Published Feb 20, 2024 • 48