Philipp Schmid's picture

Philipp Schmid

philschmid

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago
Phi-4 Technical Report
updated a collection 4 days ago
LLM Reasoning Papers
updated a dataset 6 days ago
amazon-sagemaker/repository-metadata
View all activity

Articles

Organizations

Hugging Face's profile picture AWS Inferentia and Trainium's profile picture Text Generation Inference's profile picture Amazon SageMaker Community's profile picture Hugging Face Infinity's profile picture GermanT5's profile picture trl internal testing's profile picture Hugging Face Optimum's profile picture Hugging Test Lab's profile picture Libre Euro Lingua-Alliance's profile picture Language Tools's profile picture Hugging Face H4's profile picture Inference Endpoints's profile picture HuggingFace Doc Builds's profile picture Hugging Face Extreme-Scale's profile picture Hugging Face H4 Community's profile picture Amazon SageMaker's profile picture Code Llama's profile picture Phind's profile picture DeepLearning AI courses's profile picture H4 Alignment Handbook's profile picture GNR8's profile picture gg-hf's profile picture ORPO Explorers's profile picture Social Post Explorers's profile picture Zeitgeist's profile picture hsramall's profile picture gg-tt's profile picture Hugging Face Machine Learning Optimization's profile picture LLHF's profile picture Hugging Quants's profile picture blhf's profile picture Meta Llama's profile picture Google Cloud 🤝🏻 Hugging Face's profile picture Huggingface HUGS's profile picture

Posts 2

view post
Post
6812
New state-of-the-art open LLM! 🚀 Databricks just released DBRX, a 132B MoE trained on 12T tokens. Claiming to surpass OpenAI GPT-3.5 and is competitive with Google Gemini 1.0 Pro. 🤯

TL;DR
🧮 132B MoE with 16 experts with 4 active in generation
🪟 32 000 context window
📈 Outperforms open LLMs on common benchmarks, including MMLU
🚀 Up to 2x faster inference than Llama 2 70B
💻 Trained on 12T tokens
🔡 Uses the GPT-4 tokenizer
📜 Custom License, commercially useable

Collection: databricks/dbrx-6601c0852a0cdd3c59f71962
Demo: databricks/dbrx-instruct

Kudos to the Team at Databricks and MosaicML for this strong release in the open community! 🤗
view post
Post
What's the best way to fine-tune open LLMs in 2024? Look no further! 👀 I am excited to share “How to Fine-Tune LLMs in 2024 with Hugging Face” using the latest research techniques, including Flash Attention, Q-LoRA, OpenAI dataset formats (messages), ChatML, Packing, all built with Hugging Face TRL. 🚀

It is created for consumer-size GPUs (24GB) covering the full end-to-end lifecycle with:
💡Define and understand use cases for fine-tuning
🧑🏻‍💻 Setup of the development environment
🧮 Create and prepare dataset (OpenAI format)
🏋️‍♀️ Fine-tune LLM using TRL and the SFTTrainer
🥇 Test and evaluate the LLM
🚀 Deploy for production with TGI

👉  https://www.philschmid.de/fine-tune-llms-in-2024-with-trl

Coming soon: Advanced Guides for multi-GPU/multi-Node full fine-tuning and alignment using DPO & KTO. 🔜