Article Docmatix - a huge dataset for Document Visual Question Answering 4 days ago • 43
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17 • 45
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published 27 days ago • 75
Article Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality 28 days ago • 28
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models 28 days ago • 142
Article BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18 • 32
C4AI Aya 23 Collection Aya 23 is an open weights research release of an instruction fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated May 23 • 40
GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks Paper • 2406.12925 • Published Jun 14 • 20
CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models Paper • 2405.13974 • Published May 22 • 8
Tulu V2.5 Suite Collection A suite of models trained using DPO and PPO across a wide variety (up to 14) of preference datasets. See https://arxiv.org/abs/2406.09279 for more! • 41 items • Updated Jun 14 • 9
Article Reports on the Hub: A First Look at Self-governance in Open Source AI Development By frimelle • Jun 12 • 7
Qwen2 Collection Qwen2 language models: pretrained and instruction-tuned models in 5 sizes (0.5B, 1.5B, 7B, 57B-A14B, and 72B). • 29 items • Updated Jun 6 • 259
MagicPose4D: Crafting Articulated Models with Appearance and Motion Control Paper • 2405.14017 • Published May 22 • 2
Flash Diffusion Collection Collection of models distilled using the method proposed in the Flash Diffusion paper • 7 items • Updated Jun 18 • 13
IrokoBench Collection A human-translated benchmark dataset for 16 African languages covering three tasks: NLI, MMLU, and MGSM • 6 items • Updated May 31 • 16
Article Introducing NPC-Playground, a 3D playground to interact with LLM-powered NPCs Jun 5 • 14
LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models Paper • 2308.11462 • Published Aug 20, 2023 • 2
In-Context Prompt Editing For Conditional Audio Generation Paper • 2311.00895 • Published Nov 1, 2023 • 9
Article CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models May 24 • 21
OmniGlue: Generalizable Feature Matching with Foundation Model Guidance Paper • 2405.12979 • Published May 21 • 9
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published May 21 • 26
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control Paper • 2405.12970 • Published May 21 • 22
Diffusion for World Modeling: Visual Details Matter in Atari Paper • 2405.12399 • Published May 20 • 25
🚀GGUF Collection Llama.cpp-compatible models that can be used on CPUs and GPUs! • 679 items • Updated about 19 hours ago • 28
INDUS: Effective and Efficient Language Models for Scientific Applications Paper • 2405.10725 • Published May 17 • 30
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 25 days ago • 124
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 24 items • Updated 3 days ago • 146
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published Apr 29 • 116
LLM-AD: Large Language Model based Audio Description System Paper • 2405.00983 • Published May 2 • 15
FLAME: Factuality-Aware Alignment for Large Language Models Paper • 2405.01525 • Published May 2 • 23
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Paper • 2405.01481 • Published May 2 • 22