Article Perspectives for first principles prompt engineering By KnutJaegersberg • 6 days ago • 16
Article dstack: Your LLM Launchpad - From Fine-Tuning to Serving, Simplified By chansung • 3 days ago • 12
Article Self-Hosting LLaMA 3.1 70B (or any ~70B LLM) Affordably By abhinand • 4 days ago • 1
Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • 5 days ago • 56
Article ∞🧙🏼♂️AnyClassifier - Generating Synthetic Data For Text Classification By kenhktsui • 5 days ago • 6
Article Outperforming Claude 3.5 Sonnet with Phi-3-mini-4k for graph entity relationship extraction tasks By rcaulk • 5 days ago • 5
Article I Trained a 2D Game Animation Generation Model to Create Complex, Cool Game Actions (Fully Open-Source) By lyogavin • 6 days ago • 4
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 160
Article The case for specialized pre-training: ultra-fast foundation models for dedicated tasks By Pclanglais • 20 days ago • 24
Article Agentic Task Delegation - Making Agents whole again By adarshxs • 19 days ago • 3
Article ArabicWeb24: Creating a High Quality Arabic Web-only Pre-training Dataset By MayFarhat • 16 days ago • 8
Article Batch size 30 AdamW vs Batch Size 1 Adafactor SDXL Training Comparison By MonsterMMORPG • 16 days ago • 2
Article Unlocking Creativity with Text-to-Image Generation: Exploring LoRA Models and Styles By prithivMLmods • 16 days ago • 7
MambaVision Collection MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes tiny, tiny2, small, base, large and large2 variants. • 8 items • Updated Jul 24 • 11
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published 24 days ago • 72
Article Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model Aug 22, 2023 • 18
Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth By mlabonne • 26 days ago • 164
Article 🔥 Argilla 2.0: the data-centric tool for AI makers 🤗 By dvilasuero • 25 days ago • 31
Article LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning? about 1 month ago • 17
Article ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models By yuchenlin • 28 days ago • 21
Article Are We Ready for Multi-Image Reasoning? Launching VHs: The Visual Haystacks Benchmark! By davidchan • Jul 23 • 2
Article Taxonomy Completion with Embedding Quantization and an LLM-based Pipeline: A Case Study in Computational Linguistics By dcarpintero • Jul 22 • 3
Article Experimenting with Automatic PII Detection on the Hub using Presidio Jul 10 • 23
Article Enhancing Search Capabilities for Non-English Datasets in the Dataset Viewer By asoria • Jul 10 • 4
Article MInference 1.0: 10x Faster Million Context Inference with a Single GPU By liyucheng • Jul 11 • 10
Article RegMix: Data Mixture as Regression for Language Model Pre-training By SivilTaram • Jul 11 • 8