Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation 2 days ago • 51
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 8 days ago • 222
Quantized-FT-Orca-Math Collection Models trained during quantization aware fine-tuning experiments using PyTorch's FSDP. • 8 items • Updated 14 days ago • 5
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 12 days ago • 443
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 8 items • Updated 13 days ago • 57
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization Paper • 2402.09320 • Published Feb 14 • 6
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published 22 days ago • 61
Antidote Project Collection Data and models generated within the Antidote Project (https://univ-cotedazur.eu/antidote) • 17 items • Updated 5 days ago • 5
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published 20 days ago • 90
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published 19 days ago • 36
Article Making thousands of open LLMs bloom in the Vertex AI Model Garden 21 days ago • 14
Zephyr ORPO Collection Models and datasets to align LLMs with Odds Ratio Preference Optimisation (ORPO). Recipes here: https://github.com/huggingface/alignment-handbook • 3 items • Updated 19 days ago • 13
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild Paper • 2403.16973 • Published Mar 25 • 2
Article DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive By bpan • 22 days ago • 25
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models Paper • 2404.05567 • Published 22 days ago • 10
Aurora-M models Collection Aurora-M models (base, biden-harris redteams and instruct) • 5 items • Updated 4 days ago • 15
Article Text2SQL using Hugging Face Dataset Viewer API and Motherduck DuckDB-NSQL-7B 27 days ago • 18
Mistral Instruct Merges Collection Merge of Mistral Instruct 1 and 2 using different mergekit techniques • 6 items • Updated Jan 17 • 1
Article RAG Empowerment: Cohere C4AI Command-R and Transformers Unveiled By Andyrasika • 23 days ago • 9
PDF Document / OCR Datasets Collection Document datasets with .pdf files that are usable with pixparse libraries and tools. • 2 items • Updated Mar 30 • 36
boulderspot Collection Find places to climb outside from aerial imagery • 4 items • Updated 30 days ago • 3
HyperGraph Datasets Collection Collection of HyperGraph Datasets • 17 items • Updated 27 days ago • 7
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 85
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Paper • 2403.18814 • Published Mar 27 • 37
MGM Collection Official model collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 11 items • Updated 9 days ago • 42
Llama2-7B HQQ+ Collection Extreme low-bit quantization with HQQ+ (HQQ + LoRA adapter) • 3 items • Updated 11 days ago • 14
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models Paper • 2403.12881 • Published Mar 19 • 14
ORPO Collection This is the official collection of "ORPO: Monolithic Preference Optimization without Reference Model". • 5 items • Updated 19 days ago • 10
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 52