Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2412.04445

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

Paper • 2411.06558 • Published Nov 10 • 34
SlimLM: An Efficient Small Language Model for On-Device Document Assistance

Paper • 2411.09944 • Published 27 days ago • 12
Look Every Frame All at Once: Video-Ma^2mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing

Paper • 2411.19460 • Published 13 days ago • 10
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published 6 days ago • 39

about 1 hour ago

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21 • 56
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17 • 51
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 41
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20 • 52

Perception and abstraction. Each modality is tokenized and embedded into vectors for model to comprehend.

about 10 hours ago

VILA^2: VILA Augmented VILA

Paper • 2407.17453 • Published Jul 24 • 39
Octopus v4: Graph of language models

Paper • 2404.19296 • Published Apr 30 • 116
Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26 • 47
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published Aug 28 • 42

RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Paper • 2406.02523 • Published Jun 4 • 10
UniT: Unified Tactile Representation for Robot Learning

Paper • 2408.06481 • Published Aug 12 • 9
Latent Action Pretraining from Videos

Paper • 2410.11758 • Published Oct 15 • 2
Neural Fields in Robotics: A Survey

Paper • 2410.20220 • Published Oct 26 • 4

Learning from feedback dir

Suppressing Pink Elephants with Direct Principle Feedback

Paper • 2402.07896 • Published Feb 12 • 9
Policy Improvement using Language Feedback Models

Paper • 2402.07876 • Published Feb 12 • 5
Direct Language Model Alignment from Online AI Feedback

Paper • 2402.04792 • Published Feb 7 • 29
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2 • 64

GRUtopia: Dream General Robots in a City at Scale

Paper • 2407.10943 • Published Jul 15 • 23
Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion

Paper • 2407.10973 • Published Jul 15 • 9
Cross Anything: General Quadruped Robot Navigation through Complex Terrains

Paper • 2407.16412 • Published Jul 23 • 5
RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands

Paper • 2408.11048 • Published Aug 20 • 4

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs