Article Indexify: Bringing HuggingFace Models to Real-Time Pipelines for Production Applications By rishiraj • 1 day ago • 4
Article ⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2 By burtenshaw • 3 days ago • 20
Article How to Fine-Tune Custom Embedding Models Using AutoTrain By abhishek • 2 days ago • 9
Article Orchestration of Experts: The First-Principle Multi-Model System By alirezamsh • 2 days ago • 13
Yuan 2.0-M32: Mixture of Experts with Attention Router Paper • 2405.17976 • Published 4 days ago • 15
NeRF-Casting: Improved View-Dependent Appearance with Consistent Reflections Paper • 2405.14871 • Published 9 days ago • 6
LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models Paper • 2405.14477 • Published 9 days ago • 14
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published 9 days ago • 27
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach Paper • 2405.15613 • Published 8 days ago • 11
LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters Paper • 2405.16287 • Published 7 days ago • 9
Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer Paper • 2405.17405 • Published 5 days ago • 12
Part123: Part-aware 3D Reconstruction from a Single-view Image Paper • 2405.16888 • Published 6 days ago • 10
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning Paper • 2405.17258 • Published 5 days ago • 11
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models Paper • 2405.17428 • Published 5 days ago • 12
Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels Paper • 2405.16822 • Published 6 days ago • 10
Looking Backward: Streaming Video-to-Video Translation with Feature Banks Paper • 2405.15757 • Published 8 days ago • 11
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models Paper • 2405.16537 • Published 6 days ago • 14
Transformers Can Do Arithmetic with the Right Embeddings Paper • 2405.17399 • Published 5 days ago • 44
Article Training and Finetuning Embedding Models with Sentence Transformers v3 5 days ago • 59
Article Introducing Transformers Agent 2.0: A Leap Forward in Intelligent Automation By Andyrasika • 5 days ago • 6
DenseConnector Collection Official collection of "Dense Connector for MLLMs" • 4 items • Updated 4 days ago • 1
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Paper • 2405.14906 • Published 10 days ago • 18
Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining Paper • 2405.14908 • Published 9 days ago • 10
CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner Paper • 2405.14979 • Published 9 days ago • 13
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training Paper • 2405.15319 • Published 8 days ago • 19
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization Paper • 2405.15071 • Published 9 days ago • 30
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models Paper • 2405.15574 • Published 8 days ago • 45
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models Paper • 2405.15738 • Published 8 days ago • 41
Aya 23: Open Weight Releases to Further Multilingual Progress Paper • 2405.15032 • Published 9 days ago • 21
Article GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing By NicoNico • 8 days ago • 9
Article Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages 9 days ago • 12
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts Paper • 2405.11273 • Published 14 days ago • 15
C4AI Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated 9 days ago • 34
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models Paper • 2404.16019 • Published Apr 24 • 1
Article Introducing Spaces Dev Mode for a seamless developer experience 12 days ago • 10
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 2 days ago • 299
Diffusion for World Modeling: Visual Details Matter in Atari Paper • 2405.12399 • Published 12 days ago • 25
Personalized Residuals for Concept-Driven Text-to-Image Generation Paper • 2405.12978 • Published 11 days ago • 8
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Paper • 2405.12970 • Published 11 days ago • 20
OmniGlue: Generalizable Feature Matching with Foundation Model Guidance Paper • 2405.12979 • Published 11 days ago • 7
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published 11 days ago • 23
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization Paper • 2405.11582 • Published 13 days ago • 10