ViDoRe Benchmark Collection Benchmark for document retrieval using visual features, introduced in the ColPali paper. The datasets use the QA format. • 10 items • Updated 16 days ago • 11
Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth By mlabonne • Jul 29 • 249
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21 • 64
Article Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models Mar 20 • 67
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 685
Saul-7B: A pioneering Large Language Model for Law Collection We introduce SaulLM-7B, an LLM tailored for the legal domain and trained on 30 billion tokens of legal data. Released under the MIT License. • 4 items • Updated Mar 7 • 18
read papers Collection This is a collection of some papers I've read in the past few months. • 10 items • Updated Nov 21, 2023 • 47