Granite Code Models: A Family of Open Foundation Models for Code Intelligence Paper • 2405.04324 • Published 4 days ago • 9
WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space Paper • 2311.13570 • Published Nov 22, 2023 • 2
Data-Efficient Multimodal Fusion on a Single GPU Paper • 2312.10144 • Published Dec 15, 2023 • 6
EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision Paper • 2311.02077 • Published Nov 3, 2023 • 14
Spectrally Pruned Gaussian Fields with Neural Compensation Paper • 2405.00676 • Published 9 days ago • 8
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published 12 days ago • 90
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published 9 days ago • 17
Paint by Inpaint: Learning to Add Image Objects by Removing Them First Paper • 2404.18212 • Published 13 days ago • 19
Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting Paper • 2404.19758 • Published 10 days ago • 9
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting Paper • 2404.19702 • Published 10 days ago • 15
MicroDreamer: Zero-shot 3D Generation in ~20 Seconds by Score-based Iterative Reconstruction Paper • 2404.19525 • Published 11 days ago • 8
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published 11 days ago • 62
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published 10 days ago • 58
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models Paper • 2404.17672 • Published 14 days ago • 17
MaPa: Text-driven Photorealistic Material Painting for 3D Shapes Paper • 2404.17569 • Published 14 days ago • 10
PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning Paper • 2404.16994 • Published 15 days ago • 30
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Paper • 2404.16771 • Published 15 days ago • 16
Interactive3D: Create What You Want by Interactive 3D Generation Paper • 2404.16510 • Published 16 days ago • 17
MaGGIe: Masked Guided Gradual Human Instance Matting Paper • 2404.16035 • Published 16 days ago • 8
Editable Image Elements for Controllable Synthesis Paper • 2404.16029 • Published 16 days ago • 9
PuLID: Pure and Lightning ID Customization via Contrastive Alignment Paper • 2404.16022 • Published 16 days ago • 16
The Big Benchmarks Collection Collection Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 12 items • Updated Feb 7 • 74
ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance Paper • 2403.12409 • Published Mar 19 • 9
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models Paper • 2404.14507 • Published 18 days ago • 21
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 19 days ago • 228
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis Paper • 2404.13686 • Published 20 days ago • 25
MultiBooth: Towards Generating All Your Concepts in an Image from Text Paper • 2404.14239 • Published 19 days ago • 7
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation Paper • 2404.13026 • Published 21 days ago • 21
Does Gaussian Splatting need SFM Initialization? Paper • 2404.12547 • Published 22 days ago • 8
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors Paper • 2310.12190 • Published Oct 18, 2023 • 7
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published 23 days ago • 49
MeshLRM: Large Reconstruction Model for High-Quality Mesh Paper • 2404.12385 • Published 22 days ago • 23
MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation Paper • 2404.11565 • Published 23 days ago • 12
Long-form music generation with latent diffusion Paper • 2404.10301 • Published 25 days ago • 22
Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video Paper • 2404.09833 • Published 26 days ago • 27
Adapting LLaMA Decoder to Vision Transformer Paper • 2404.06773 • Published about 1 month ago • 13
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation Paper • 2404.08540 • Published 29 days ago • 10
Probing the 3D Awareness of Visual Foundation Models Paper • 2404.08636 • Published 28 days ago • 11
Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies Paper • 2404.08197 • Published 29 days ago • 26
TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts Paper • 2401.14828 • Published Jan 26 • 6
Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models Paper • 2404.07724 • Published 30 days ago • 10
Best Practices and Lessons Learned on Synthetic Data for Language Models Paper • 2404.07503 • Published 30 days ago • 25
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback Paper • 2404.07987 • Published 29 days ago • 45
Transferable and Principled Efficiency for Open-Vocabulary Segmentation Paper • 2404.07448 • Published about 1 month ago • 8
Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior Paper • 2404.06780 • Published about 1 month ago • 9
RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion Paper • 2404.07199 • Published about 1 month ago • 21
DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive Article • By bpan • Apr 9 • 26
Magic-Boost: Boost 3D Generation with Mutli-View Conditioned Diffusion Paper • 2404.06429 • Published Apr 9 • 6