C4AI Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated about 2 hours ago • 11
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models Paper • 2404.16019 • Published 29 days ago • 1
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 20 items • Updated 1 day ago • 266
Diffusion for World Modeling: Visual Details Matter in Atari Paper • 2405.12399 • Published 3 days ago • 19
Personalized Residuals for Concept-Driven Text-to-Image Generation Paper • 2405.12978 • Published 2 days ago • 6
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Paper • 2405.12970 • Published 2 days ago • 15
OmniGlue: Generalizable Feature Matching with Foundation Model Guidance Paper • 2405.12979 • Published 2 days ago • 6
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published 2 days ago • 13
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization Paper • 2405.11582 • Published 4 days ago • 9
Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching Paper • 2405.11252 • Published 5 days ago • 10
Towards Modular LLMs by Building and Reusing a Library of LoRAs Paper • 2405.11157 • Published 6 days ago • 17
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework Paper • 2405.11143 • Published 4 days ago • 27
Imp-v1.5 Collection A series of Imp models with different LLM backbones. • 5 items • Updated 2 days ago • 3
Imp: Highly Capable Large Multimodal Models for Mobile Devices Paper • 2405.12107 • Published 3 days ago • 19
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published 3 days ago • 35
FIFO-Diffusion: Generating Infinite Videos from Text without Training Paper • 2405.11473 • Published 4 days ago • 42
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • 29 days ago • 45
INDUS: Effective and Efficient Language Models for Scientific Applications Paper • 2405.10725 • Published 6 days ago • 15
Article Enjoy the Power of Phi-3 with ONNX Runtime on your device By Emma-N • 1 day ago • 16
Observational Scaling Laws and the Predictability of Language Model Performance Paper • 2405.10938 • Published 6 days ago • 9
Article Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI. By KingNish • 2 days ago • 14
HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach Paper • 2404.01094 • Published Apr 1 • 4
Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion Paper • 2405.09874 • Published 7 days ago • 14
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection Paper • 2405.10300 • Published 7 days ago • 21
CAT3D: Create Anything in 3D with Multi-View Diffusion Models Paper • 2405.10314 • Published 7 days ago • 33
Many-Shot In-Context Learning in Multimodal Foundation Models Paper • 2405.09798 • Published 8 days ago • 24
Article Train custom AI models with the trainer API and adapt them to 🤗 By not-lain • about 5 hours ago • 15
Article Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task By danaaubakirova • 7 days ago • 15
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation Paper • 2405.09546 • Published 8 days ago • 8
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model Paper • 2405.09215 • Published 8 days ago • 13
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models Paper • 2405.09220 • Published 8 days ago • 22
SpeechVerse: A Large-scale Generalizable Audio Language Model Paper • 2405.08295 • Published 10 days ago • 10
SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models Paper • 2405.08317 • Published 9 days ago • 8
Understanding the performance gap between online and offline alignment algorithms Paper • 2405.08448 • Published 9 days ago • 11
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding Paper • 2405.08344 • Published 9 days ago • 10
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory Paper • 2405.08707 • Published 9 days ago • 25
Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning Paper • 2405.08054 • Published 10 days ago • 18
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding Paper • 2405.08748 • Published 9 days ago • 17
VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models Paper • 2403.06098 • Published Mar 10 • 15
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 50 items • Updated 2 days ago • 12
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published 21 days ago • 96
MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels Paper • 2405.07526 • Published 10 days ago • 14
Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training Paper • 2405.06932 • Published 12 days ago • 15
LogoMotion: Visually Grounded Code Generation for Content-Aware Animation Paper • 2405.07065 • Published 12 days ago • 15
Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots Paper • 2405.07990 • Published 10 days ago • 15
SUTRA: Scalable Multilingual Language Model Architecture Paper • 2405.06694 • Published 16 days ago • 34
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 11 items • Updated 6 days ago • 97