Article Enjoy the Power of Phi-3 with ONNX Runtime on your device By Emma-N • 1 day ago • 16
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 20 items • Updated about 24 hours ago • 263
Article Exploration of Job Application Automation with Data Scraping By herooooooooo • 13 days ago • 3
Article Glaze and the Effectiveness of Anti-AI Methods for Diffusion Models By parsee-mizuhashi • 8 days ago • 3
Article Synthetic dataset generation techniques: Self-Instruct By davanstrien • 8 days ago • 5
LlamaForTokenClassification Collection Fine Tuned llama variants for Token Classification • 6 items • Updated 10 days ago • 2
Terminus XL Collection v-prediction SDXL clone with zero-terminal SNR noise schedule • 8 items • Updated 29 days ago • 5
Article Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task By danaaubakirova • 7 days ago • 15
Article Evalverse: Revolutionizing Large Language Model Evaluation with a Unified, User-Friendly Framework By Yescia • 16 days ago • 1
Article Advancing Open-source Large Language Models in the Medical & Healthcare Domain By aaditya • 13 days ago • 4
Article Adapt custom AI models to the trainer API and to 🤗 By not-lain • 9 days ago • 15
Article Knowledge Distillation for Fine-Tuning a GPT-3.5 Judge: Enhancing Accuracy and Performance By Andyrasika • 10 days ago • 4
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch By AviSoori1x • 17 days ago • 24
Article Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia? By davanstrien • 16 days ago • 6
Article Train Custom Models on Hugging Face Spaces with AutoTrain SpaceRunner By abhishek • 14 days ago • 6
Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch By AviSoori1x • 16 days ago • 23
Article Expanding Model Context and Creating Chat Models with a Single Click By maywell • 25 days ago • 33
Article ⚗️ 🧑🏼🌾 Let's grow some Domain Specific Datasets together By burtenshaw • 24 days ago • 26
Article A Guide to Designing New Functional Proteins and Improving Protein Function, Stability, and Diversity with Generative AI By AmelieSchreiber • 9 days ago • 16
Article Fish Speech V1 - New Multilingual Open Source TTS Model By lengyue233 • 20 days ago • 4
Article Token Merging for fast LLM inference : Background and first trials with Mistral By samchain • 23 days ago • 1
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation 24 days ago • 69
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • 29 days ago • 45
Article Fine Tuning a LLM Using Kubernetes with Intel® Xeon® Scalable Processors By dmsuehir • 29 days ago • 2
Article seemore: Implement a Vision Language Model from Scratch By AviSoori1x • 11 days ago • 41
Article Post-OCR-Correction: 1 billion words dataset of automated OCR correction by LLM By Pclanglais • 27 days ago • 10
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • 27 days ago • 55
Article Estimating Memory Consumption of LLMs for Inference and Fine-Tuning for Cohere Command-R+ By Andyrasika • 27 days ago • 6
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18 • 534
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Apr 22 • 71
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published about 1 month ago • 235
Article Releasing Youtube-Commons: a massive open corpus for conversational and multimodal data By Pclanglais • Apr 18 • 20
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 57
Article Ryght’s Journey to Empower Healthcare and Life Sciences with Expert Support from Hugging Face Apr 16 • 6
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models Paper • 2404.02258 • Published Apr 2 • 101
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Paper • 2404.03648 • Published Apr 4 • 22
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 129
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 74
ClassPruning: Speed Up Image Restoration Networks by Dynamic N:M Pruning Paper • 2211.05488 • Published Nov 10, 2022 • 1
Multi-Curve Translator for High-Resolution Photorealistic Image Translation Paper • 2203.07756 • Published Mar 15, 2022 • 1
Modular Degradation Simulation and Restoration for Under-Display Camera Paper • 2209.11455 • Published Sep 23, 2022 • 1
StarEnhancer: Learning Real-Time and Style-Aware Image Enhancement Paper • 2107.12898 • Published Jul 27, 2021 • 2
Rethinking Performance Gains in Image Dehazing Networks Paper • 2209.11448 • Published Sep 23, 2022 • 1