Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • 1 day ago • 29
StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments Paper • 2401.04290 • Published Jan 9 • 3
Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding Paper • 2401.04575 • Published Jan 9 • 14
AeroPath: An airway segmentation benchmark dataset with challenging pathology Paper • 2311.01138 • Published Nov 2, 2023 • 5
RadioGalaxyNET: Dataset and Novel Computer Vision Algorithms for the Detection of Extended Radio Galaxies and Infrared Hosts Paper • 2312.00306 • Published Dec 1, 2023 • 2
SynFundus: Generating a synthetic fundus images dataset with millions of samples and multi-disease annotations Paper • 2312.00377 • Published Dec 1, 2023 • 2
Enhancing Visually-Rich Document Understanding via Layout Structure Modeling Paper • 2308.07777 • Published Aug 15, 2023 • 2
smol models Collection Models where the size of the model file (model.safetensors or pytorch_model.bin) < 50mb • 58 items • Updated Oct 6, 2023 • 6
Domain specific data and model documentation Collection There is a growing number of datasheets or model card frameworks being proposed for particular domains. This collection tries to capture some of these • 5 items • Updated Oct 5, 2023 • 2