Article Training and Finetuning Embedding Models with Sentence Transformers v3 6 days ago • 62
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Paper • 2405.14906 • Published 11 days ago • 18
Sparse Foundational Llama 2 Models Collection Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras • 27 items • Updated 16 days ago • 7
C4AI Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated 10 days ago • 34
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated 3 days ago • 301
IndicGenBench Collection Datasets released in "IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs" (https://arxiv.org/abs/2404.16816) • 4 items • Updated 19 days ago • 2
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 44
WildChat: 1M ChatGPT Interaction Logs in the Wild Paper • 2405.01470 • Published about 1 month ago • 53
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation Apr 29 • 69
LLaVA++ (LLaMA-3 and Phi-3-Mini) Collection Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated 38 minutes ago • 22
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 21 days ago • 44
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 134
Quantized-FT-Orca-Math Collection Models trained during quantization aware fine-tuning experiments using PyTorch's FSDP. • 8 items • Updated Apr 16 • 7
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published Apr 8 • 57
Eurus Collection Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated Apr 15 • 22
UDOP Collection UDOP is a general multimodal model for document AI • 4 items • Updated 11 days ago • 20
Aya Indic Suite Collection An Indic language filtered dataset from the Aya dataset collection. • 9 items • Updated Mar 31 • 1
StarChat2 15B Collection Model, datasets, and demo for StarChat2 15B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 10 items • Updated Apr 12 • 12
Unifying Vision, Text, and Layout for Universal Document Processing Paper • 2212.02623 • Published Dec 5, 2022 • 10
Design2Code: How Far Are We From Automating Front-End Engineering? Paper • 2403.03163 • Published Mar 5 • 92
OpenMath Collection A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated Feb 19 • 28
⛔️🔦 Provenance, Watermarking & Deepfake Detection Collection Technical tools for more control over non-consensual synthetic content • 14 items • Updated Apr 1 • 36
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution Paper • 2401.03065 • Published Jan 5 • 10
Understanding LLMs: A Comprehensive Overview from Training to Inference Paper • 2401.02038 • Published Jan 4 • 59
ControlLLM: Augment Language Models with Tools by Searching on Graphs Paper • 2310.17796 • Published Oct 26, 2023 • 15
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 77
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models Paper • 2308.01825 • Published Aug 3, 2023 • 19