DeepLearning.AI courses

Recent Activity

multimodalart 
posted an update 7 months ago
The first open Stable Diffusion 3-like architecture model is JUST out 💣 - but it is not SD3! 🤔

It is Tencent-Hunyuan/HunyuanDiT by Tencent, a 1.5B-parameter DiT (diffusion transformer) text-to-image model 🖼️✨, trained with multilingual CLIP + multilingual T5 text encoders for English 🤝 Chinese understanding

Try it out by yourself here ▶️ https://huggingface.co/spaces/multimodalart/HunyuanDiT
(a bit too slow as the model is chunky and the research code isn't super optimized for inference speed yet)

In the paper, they claim SOTA among open-source models, based on human preference evaluation!
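
If you'd rather run it locally, here is a minimal sketch, assuming a diffusers release that ships HunyuanDiTPipeline and the diffusers-format weights repo (the Space above runs the original research code, which may behave differently):

```python
# A minimal sketch: text-to-image with HunyuanDiT via diffusers.
# Assumes a diffusers version with HunyuanDiTPipeline and the
# diffusers-format weights at Tencent-Hunyuan/HunyuanDiT-Diffusers.
import torch
from diffusers import HunyuanDiTPipeline

pipe = HunyuanDiTPipeline.from_pretrained(
    "Tencent-Hunyuan/HunyuanDiT-Diffusers", torch_dtype=torch.float16
)
pipe.to("cuda")

# The dual multilingual text encoders accept English and Chinese prompts.
image = pipe("一个骑马的宇航员").images[0]  # "an astronaut riding a horse"
image.save("astronaut.png")
```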
philschmid 
posted an update 9 months ago
New state-of-the-art open LLM! 🚀 Databricks just released DBRX, a 132B MoE trained on 12T tokens, claiming to surpass OpenAI GPT-3.5 and to be competitive with Google Gemini 1.0 Pro. 🤯

TL;DR
🧮 132B MoE with 16 experts, 4 of which are active per token
🪟 32,000-token context window
📈 Outperforms open LLMs on common benchmarks, including MMLU
🚀 Up to 2x faster inference than Llama 2 70B
💻 Trained on 12T tokens
🔡 Uses the GPT-4 tokenizer
📜 Custom license, commercially usable

Collection: databricks/dbrx-6601c0852a0cdd3c59f71962
Demo: databricks/dbrx-instruct

Kudos to the Team at Databricks and MosaicML for this strong release in the open community! 🤗
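
A minimal usage sketch, assuming a transformers version with DBRX support (trust_remote_code covers older ones), license acceptance on the Hub, and a multi-GPU node for the bf16 weights:

```python
# A minimal sketch: chatting with DBRX Instruct via transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code is needed on transformers versions without native
# DBRX support; the 132B weights in bf16 require several hundred GB of GPU memory.
tokenizer = AutoTokenizer.from_pretrained("databricks/dbrx-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# The chat template renders OpenAI-style messages into DBRX's expected prompt format.
messages = [{"role": "user", "content": "Why are MoE models fast at inference?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```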
multimodalart 
posted an update 10 months ago
The Stable Diffusion 3 research paper broken down, including some overlooked details! 📝

Model
📏 2 base model variants mentioned: 2B and 8B sizes

📐 New architecture in all abstraction levels:
- 🔽 UNet out; ⬆️ Multimodal Diffusion Transformer in, goodbye cross-attention 👋
- 🆕 Rectified flows for the diffusion process (see the sketch after this post)
- 🧩 Still a Latent Diffusion Model

📄 3 text encoders: two CLIPs and one T5-XXL; plug-and-play: removing the largest one maintains competitiveness

🗃️ The dataset was deduplicated with SSCD, which helped with memorization (no further details about the dataset, though)

Variants
🔁 A DPO fine-tuned model showed great improvement in prompt understanding and aesthetics
✏️ An Instruct Edit 2B model was trained and learned how to do text replacement

Results
✅ State of the art in automated evals for composition and prompt understanding
✅ Best win rate in human preference evaluation for prompt understanding, aesthetics, and typography (details on the number of participants and the experiment design are missing)

Paper: https://stabilityai-public-packages.s3.us-west-2.amazonaws.com/Stable+Diffusion+3+Paper.pdf
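
Side note: the rectified-flow objective mentioned above is simple enough to sketch. This is an illustration of the general technique, not the paper's code (SD3 additionally reweights the timestep distribution):

```python
# A minimal rectified-flow training step (illustrative, not the paper's code).
# Data and noise are connected by a straight line; the network learns the
# constant velocity along that line.
import torch
import torch.nn.functional as F

def rectified_flow_loss(model, x0, cond):
    """x0: clean latents (B, C, H, W); cond: text conditioning."""
    noise = torch.randn_like(x0)
    t = torch.rand(x0.shape[0], device=x0.device)  # uniform t in [0, 1]
    t_ = t.view(-1, 1, 1, 1)

    # Straight-line interpolation: x_t = (1 - t) * data + t * noise.
    x_t = (1.0 - t_) * x0 + t_ * noise

    # The target velocity is the line's direction: d(x_t)/dt = noise - x0.
    v_target = noise - x0
    v_pred = model(x_t, t, cond)
    return F.mse_loss(v_pred, v_target)
```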
philschmid 
posted an update 11 months ago
What's the best way to fine-tune open LLMs in 2024? Look no further! 👀 I am excited to share “How to Fine-Tune LLMs in 2024 with Hugging Face”, using the latest research techniques, including Flash Attention, Q-LoRA, the OpenAI dataset format (messages), ChatML, and packing, all built with Hugging Face TRL. 🚀

It is created for consumer-size GPUs (24GB) and covers the full end-to-end lifecycle:
💡Define and understand use cases for fine-tuning
🧑🏻‍💻 Setup of the development environment
🧮 Create and prepare dataset (OpenAI format)
🏋️‍♀️ Fine-tune the LLM using TRL and the SFTTrainer (see the sketch after this post)
🥇 Test and evaluate the LLM
🚀 Deploy for production with TGI

👉  https://www.philschmid.de/fine-tune-llms-in-2024-with-trl

Coming soon: Advanced Guides for multi-GPU/multi-Node full fine-tuning and alignment using DPO & KTO. 🔜
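
To give a flavor of the core fine-tuning step, here is a compact Q-LoRA + SFTTrainer sketch. The model name, data path, and hyperparameters are illustrative placeholders, not the guide's exact config, assuming trl/peft/bitsandbytes versions from around the guide's release:

```python
# A minimal Q-LoRA fine-tuning sketch with TRL's SFTTrainer (illustrative;
# see the guide above for the full, tested configuration).
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          BitsAndBytesConfig, TrainingArguments)
from trl import SFTTrainer, setup_chat_format

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder base model

# 4-bit NF4 quantization: the "Q" in Q-LoRA.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Give the base model a ChatML template and the matching special tokens.
model, tokenizer = setup_chat_format(model, tokenizer)

# Render OpenAI-style "messages" into training strings via the chat template.
dataset = load_dataset("json", data_files="train.json", split="train")  # placeholder path
dataset = dataset.map(
    lambda s: {"text": tokenizer.apply_chat_template(s["messages"], tokenize=False)}
)

# Low-rank adapters trained on top of the frozen 4-bit base model.
peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=2048,
    packing=True,  # pack short samples into full-length sequences
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=2, bf16=True),
)
trainer.train()
```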