Transformers.js demos Collection A collection of my favorite WebML demos, built with Transformers.js! • 22 items • Updated 8 days ago • 30
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27 • 182
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis Paper • 2402.14797 • Published Feb 22 • 18
InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions Paper • 2402.03040 • Published Feb 5 • 16
StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis Paper • 2401.17093 • Published Jan 30 • 18
SliceGPT: Compress Large Language Models by Deleting Rows and Columns Paper • 2401.15024 • Published Jan 26 • 62
SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing Paper • 2312.11392 • Published Dec 18, 2023 • 18
Weight subcloning: direct initialization of transformers using larger pretrained ones Paper • 2312.09299 • Published Dec 14, 2023 • 16
Cache Me if You Can: Accelerating Diffusion Models through Block Caching Paper • 2312.03209 • Published Dec 6, 2023 • 16
Distil-Whisper Models Collection The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 33
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling Paper • 2311.00430 • Published Nov 1, 2023 • 53
Gradio Custom Components ✨ Collection Delightful gradio components created by the community that you can add to you demo! • 4 items • Updated Nov 8, 2023 • 4
Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation Paper • 2310.18628 • Published Oct 28, 2023 • 6
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation Paper • 2310.16656 • Published Oct 25, 2023 • 36
SALMONN: Towards Generic Hearing Abilities for Large Language Models Paper • 2310.13289 • Published Oct 20, 2023 • 16
Historical - Spaces of the Week Collection All Spaces of the Week...from all weeks • 636 items • Updated Jan 17 • 18
What the DAAM: Interpreting Stable Diffusion Using Cross Attention Paper • 2210.04885 • Published Oct 10, 2022 • 1
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models Paper • 2310.08659 • Published Oct 12, 2023 • 19
NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions Paper • 2309.15426 • Published Sep 27, 2023 • 14
MagiCapture: High-Resolution Multi-Concept Portrait Customization Paper • 2309.06895 • Published Sep 13, 2023 • 27
Efficient Memory Management for Large Language Model Serving with PagedAttention Paper • 2309.06180 • Published Sep 12, 2023 • 25
Rethinking Vision Transformers for MobileNet Size and Speed Paper • 2212.08059 • Published Dec 15, 2022 • 4
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization Paper • 2303.14189 • Published Mar 24, 2023 • 2
TokenFlow: Consistent Diffusion Features for Consistent Video Editing Paper • 2307.10373 • Published Jul 19, 2023 • 54
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023 • 165
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications Paper • 2306.14289 • Published Jun 25, 2023 • 15