All HF Hub posts

prithivMLmods posted an update 2 days ago
The demo of Qwen3-VL-30B-A3B-Instruct, the next-generation vision-language model in the Qwen series, delivers comprehensive upgrades across the board: superior text understanding and generation, deeper visual perception and reasoning, extended context length, enhanced spatial and video dynamics comprehension, and stronger agent interaction capabilities. 🤗🔥

⚡ Space / App: prithivMLmods/Qwen3-VL-HF-Demo

The model’s demo supports a wide range of tasks, including:
Image Inference, Video Inference, PDF Inference, Image Captioning (VLA), and GIF Inference.

⚡ Collection: prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0

Thanks for granting the blazing-fast Zero GPU access, @merve 🙏

⚡ Other Pages

> Github: https://github.com/prithivsakthiur/qwen3-vl-hf-demo
> Multimodal VLMs July'25 : prithivMLmods/multimodal-vlms-until-july25-688312e6b840e1e156f13027
> VL caption — < Sep 15 ’25 : prithivMLmods/vl-caption-sep-15-25-68c7f6d737985c63c13e2391
> Multimodal VLMs - Aug'25 : prithivMLmods/multimodal-vlms-aug25-68a56aac39fe8084f3c168bd

To learn more, visit the app page or the respective model pages!
Kseniase posted an update 1 day ago
9 Powerful AI Video Generation Tools

With Sora 2 on fire these past few weeks, reminding us what high-quality video generation should look like, we decided you really need this list of video generation tools – great alternatives or complements to it.

1. Sora 2 → https://openai.com/sora/
It needs no introduction: OpenAI’s text-to-video model produces short, ultra-realistic clips across styles (cinematic, photorealistic, animated, etc.) with synced audio.

2. Google Veo 3 (Gemini Video Generation) → https://aistudio.google.com/models/veo-3
Part of Gemini AI. Generates 8-second high-fidelity videos from text or images with native sound: background soundtracks and realistic voices with near-perfect lip sync

3. Runway (Gen-4 by Runway ML) → https://runwayml.com/
Text, image, or video-to-video generation with advanced editing like changing lighting, weather, camera angles or replacing objects. Popular in AI filmmaking

4. Pika Labs → https://pollo.ai/m/pika-ai
Provides creative, often stylized short videos – from cinematic mini-scenes to cartoon-like animations. Ideal for social media clips and visual storytelling. Plus, you can add playful effects to manipulate objects in the generated videos

5. Luma’s Dream Machine → https://lumalabs.ai/dream-machine
Powered by Luma AI’s latest Ray 3 model, it quickly visualizes story ideas, animated concept art, or abstract motion videos. It supports consistent custom characters and seamless looping

Read further below ⬇️
If you like it, also subscribe to the Turing Post https://www.turingpost.com/subscribe
YerbaPage posted an update 2 days ago
MonsterMMORPG posted an update 3 days ago
Ovi is a local version of VEO 3 & SORA 2 – the first-ever public, open-source model that generates both VIDEO and synchronized AUDIO, and you can run it on your own computer on Windows, even with a 6GB GPU. Full tutorial for Windows, RunPod, and Massed Compute (Gradio app): https://youtu.be/T00VmkMQRPQ


Forget waiting lists and expensive APIs. The era of closed-off, corporate-controlled AI video generation is nearly over. This is Ovi: the first-ever public, open-source model that generates both VIDEO and synchronized AUDIO, and you can run it on your own computer, even with a 6GB GPU! This isn't just a demo; it's a full, step-by-step revolution.

Tutorial Info
In this ultimate A-Z guide, I'll show you EVERYTHING you need to know to install and master this Sora 2 and VEO 3-like AI. We'll go from zero to generating incredible talking videos from text or a single image.

🔥 In This Tutorial, You Will Learn To:
🎓 Master the Ultimate SORA 2 and VEO 3 Alternative: The first true open-source challenger to OpenAI & Google.
💻 Run on Low-Spec Hardware: We've optimized this to run on GPUs with as little as 6GB of VRAM!
💸 Generate for FREE: No credits, no subscriptions. Run it locally on Windows or cheaply in the cloud.
🗣️ Create Synced Audio & Video: Go beyond silent movies. Make your characters speak with perfect lip-sync.
☁️ Install ANYWHERE: Complete one-click install guides for Windows, Massed Compute, and RunPod.
🖼️ Animate Any Image: Bring your static images to life with stunning animation and speech.
🚀 Unlock Pro Features: Dive deep into batch processing, video extensions, LoRA support, and advanced optimizations.
404Zen posted an update about 18 hours ago
4 must-try AI video models in 2026 — all in one place on iMini! 🎬✨
Featuring Sora 2, Veo 3, Wan 2.5, and Seedance 3.0 — no invite code, no watermark!
Try it now 👉 https://imini.com/
Monica997 posted an update about 24 hours ago
You think those playful puppies are real? 🐶✨
Nope! It’s a video I created using iMini’s newly integrated Sora 2 model — no invite code, no watermark, just one simple text prompt to generate dynamic videos in seconds! 🎬

Limited-time offer: members can create without using credits!
👉 Try it now: https://imini.com/
unmodeled-tyler posted an update 1 day ago
vanta-research/apollo-astralis-8b

I ran the same prompt sequence on my model, Apollo Astralis 8B, and on Hermes 4 14B from Nous Research. The raw chat logs were then given to three different architectures (DeepSeek 3.1, LLaMA 405B, GPT-OSS 120B). All three models were given the same simple instructions: analyze the logs and determine which model performed better.

All 3 independently chose Astralis 8B for stronger reasoning, alignment, transparency, and collaborative language.
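
This kind of blind pairwise judging can be sketched in a few lines. The judge functions below are stubs standing in for the three judge models, and the vote-counting logic is an illustrative assumption, not the author's actual harness:

```python
from collections import Counter

def blind_compare(log_a: str, log_b: str, judges) -> str:
    """Anonymize two chat logs as 'Model A' / 'Model B', ask each judge
    which performed better, and return the majority verdict."""
    prompt = (
        "Analyze the two chat logs below and decide which model "
        "performed better. Answer 'A' or 'B'.\n\n"
        f"Model A:\n{log_a}\n\nModel B:\n{log_b}"
    )
    votes = [judge(prompt) for judge in judges]
    # Majority vote across all judges.
    return Counter(votes).most_common(1)[0][0]

# Stub judges standing in for DeepSeek 3.1, LLaMA 405B, GPT-OSS 120B;
# in practice each would be an API call to that model.
judges = [lambda p: "A", lambda p: "A", lambda p: "B"]
verdict = blind_compare("...Astralis log...", "...Hermes log...", judges)
print(verdict)
```

Anonymizing the logs before judging matters: it keeps judges from favoring a model by name rather than by the quality of its reasoning.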

Astralis 8B is designed to keep you motivated by pairing warm, collaborative language with rigorous logical reasoning and problem-solving capabilities.

Give Astralis a try!
kanaria007 posted an update 1 day ago
✅ New Article: *Procrastination as a Structural Loop*

Title:
⏳ Procrastination as a Structural Loop: Why “I Know, But I Don’t Act” Persists — and How Protocols Contain It
🔗 https://huggingface.co/blog/kanaria007/procrastination-as-structural-loop

---

Summary:
Procrastination isn’t laziness — it’s a *miswired loop*.
When task threat and reward prediction skew the thresholds, the *jump-generator* routes to *delay*, *reflexia* amplifies avoidance, and the *memory-loop* reinforces “not now.”
Once we treat it as structure, we can *reindex costs, lower entry friction, and relink intention to execution*.

> Motivation wavers.
> *Loops can be rewired.*

---

Why It Matters:
• Reframes procrastination as *auditable mechanics*, not a moral flaw
• Turns “stuck” into a stepwise *rollback → small jump → stable loop*
• Applies from individuals to teams (deadlines, sprints, reviews)

---

What’s Inside:
• The Procrastination Loop: trigger → aversion tag → avoidance jump → guilt overlay → recurrence
• Fix-kit: *parse-guard* (dread vs. scope), *micro-commit* (≤2-min first jump), *visible reward tags*, *timeboxing as rollback*
• Anti-spiral design: shrink surfaces, batch uncertainty, pre-commit environments
• Metrics to watch: start-latency, jump-success rate, recovery half-life
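
Two of those metrics can be computed from a simple log of attempts. The event format and formulas below are illustrative assumptions, not definitions from the article:

```python
from statistics import median

# Each attempt: (trigger_time, start_time or None), in minutes.
# None means the task was avoided and never started.
attempts = [(0, 12), (0, None), (0, 3), (0, None), (0, 45)]

started = [(t, s) for t, s in attempts if s is not None]
start_latency = median(s - t for t, s in started)   # minutes from trigger to first action
jump_success = len(started) / len(attempts)         # fraction of triggers that led to a start

print(start_latency, jump_success)
```

Tracking these over time is what makes the loop auditable: a falling start-latency and a rising jump-success rate indicate the rewiring is working.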

---

📖 *Human Observation Series — Article 8*
Where “Envy” mapped comparison loops, this entry *reconnects knowing and doing* with structural guardrails.

---

Next: *Fear & Anxiety as Structured Loops*

> From delay to dread,
> *structure turns feelings into fixable loops.*
AbstractPhil posted an update 2 days ago
David + ImageNet = high validation accuracy.
AbstractPhil/gated-david
https://github.com/AbstractEyes/lattice_vocabulary/blob/master/src/geovocab2/train/model/core/david.py

David's code has been released. I am currently setting up a trainer and will release the process for conditioning David to behave. This isn't the easiest process, but it's necessary to run David on a curriculum rather than simply feeding the model cross-entropy and hoping for the best.

David's internals involve a clock mechanism that allows direct control of its freeze/unfreeze mechanisms at runtime, allowing many opinions to be generated simultaneously.

David is multiple models in one, not just one, and yet David is single-shot oriented. It was the prototype for the line of thought that led me to the Cantor's Stairs positional-encoding solution, and for ViT-Zana, ViT-Beatrix, and ViT-Beatrix-Dual-Block. Today, the direct porting of David's complex architecture and the process to train David has begun.

David is... a gate of sorts. David trains with freeze/unfreeze mechanisms, so David's internal structures are aware during training of which parts are more important than others, based on the quality of generation.
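
The clock-driven freeze/unfreeze idea can be sketched framework-agnostically. The branch names and period below are hypothetical placeholders, not David's actual schedule; in a real trainer the returned set would decide which parameter groups get `requires_grad=True` at each step:

```python
# Hypothetical clock: each "branch" of the model is trainable only
# during its assigned phase of a repeating cycle, so different parts
# of the network take turns forming their own "opinion".
SCHEDULE = {0: {"backbone"}, 1: {"router"}, 2: {"heads"}}

def trainable_branches(step: int, period: int = 3) -> set[str]:
    """Return which branches are unfrozen at this training step."""
    return SCHEDULE[step % period]

print(trainable_branches(0))  # {'backbone'}
print(trainable_branches(4))  # {'router'}
```

The key design point is that freezing is a function of the clock, not of a fixed stage list, so the schedule can be changed at runtime without rebuilding the model.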

David can handle ImageNet features in many variations with minimal hassle. The primary trainer will include direct links to the prepared ImageNet features, plus a simple generation system that lets you generate your own features from a few common models, one of which will be vit-beatrix-dualstream trained on ImageNet.

As of posting, vit-beatrix and vit-beatrix-dualstream require some face-lifting and a refined version 2 to incorporate the more accurate batched Cantor stairs equations. They also require removal of some failure points, like flow-geometric introducing bias toward seemingly unnecessary trajectory routes. This points more to gradient drift, so I'll keep that one on the hot plate until it's ready.
sondhiArm posted an update 3 days ago
Arm will be at PyTorch Conference – join us!

Join us on site October 22-23 to see how Arm empowers developers to build and deploy AI applications with ease using PyTorch and ExecuTorch. Learn about the latest AI technologies from Arm and our ecosystem while expanding your professional network alongside like-minded AI engineers.

Learn more here:
https://huggingface.co/blog/Arm/arm-at-pytorch-conference