All HF Hub posts

404Zen posted an update 2 days ago
Sora 2 invitation codes: share your Sora 2 invitation code here!

Also, you can use the imini AI platform directly: https://imini.com/

MonsterMMORPG posted an update 2 days ago
Ovi - Generate Videos With Audio Like VEO 3 or SORA 2 - Run Locally - Open Source for Free

Download and install : https://www.patreon.com/posts/140393220

Quick demo tutorial : https://youtu.be/uE0QabiHmRw

Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation

Project page : https://aaxwaz.github.io/Ovi/

SECourses Ovi Pro Premium App Features

A full-scale, highly advanced app for Ovi, an open-source project that can generate videos with real audio from text prompts or image + text prompts.

I have developed an advanced Gradio app and an improved pipeline with full support for block swapping

Now we can generate full-quality videos with as little as 8.2 GB of VRAM

I hope to work on dynamic on-load FP8_Scaled tomorrow to reduce VRAM usage even further, so more VRAM optimizations should arrive soon

Our block-swapping implementation is, in my opinion, the best available; I adapted the approach from the well-known Kohya Musubi tuner
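For readers unfamiliar with the idea, here is a minimal, hypothetical PyTorch sketch of block swapping (not the actual Ovi or Musubi implementation, which is considerably more optimized): only the block currently executing is kept in VRAM.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

def forward_with_block_swap(blocks: nn.ModuleList, x: torch.Tensor) -> torch.Tensor:
    """Run a stack of blocks while keeping only the active one in GPU memory."""
    for block in blocks:
        block.to(device)   # load this block's weights into VRAM
        x = block(x)       # compute
        block.to("cpu")    # evict the block so the next one fits
    return x

# Toy usage: 8 blocks that would normally all sit in VRAM at once
blocks = nn.ModuleList([nn.Linear(1024, 1024) for _ in range(8)])
x = torch.randn(4, 1024, device=device)
out = forward_with_block_swap(blocks, x)
```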

The 1-click installer sets up a Python 3.10.11 venv and auto-downloads the models, so it really is one click

The installer sets up Torch 2.8, CUDA 12.9, and Flash Attention 2.8.3, and it supports all recent GPU generations: RTX 3000, 4000, and 5000 series, H100, B200, etc.

All generations are saved to the outputs folder, and the app supports many features such as batch folder processing, a configurable number of generations, and full preset save/load

This is a rush release (put together in less than a day), so there may be errors; please let me know and I will keep improving the app

Look at the examples to understand how to prompt the model; this is extremely important


An RTX 5090 can run it without any block swapping, using only CPU offloading, and it is really fast
SelmaNajih001 posted an update 2 days ago
Introducing a Hugging Face Tutorial on Regression

While Hugging Face offers extensive tutorials on classification and NLP tasks, there is very little guidance on performing regression tasks with Transformers.
In my latest article, I provide a step-by-step guide to running regression using Hugging Face, applying it to financial news data to predict stock returns.
In this tutorial, you will learn how to:
- Prepare and preprocess textual and numerical data for regression
- Configure a Transformer model for regression tasks
- Apply the model to real-world financial datasets with fully reproducible code

Read the full article here: https://huggingface.co/blog/SelmaNajih001/how-to-run-a-regression-using-hugging-face
The dataset used: SelmaNajih001/FinancialClassification
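As a quick illustration of the configuration step covered in the article (a minimal sketch, not the article's exact code; the checkpoint name is just a placeholder): in Transformers, a regression head is simply a sequence-classification model with one output and problem_type set to "regression", so the Trainer applies an MSE loss on float labels.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "distilbert-base-uncased"  # placeholder encoder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=1,                # one continuous target, e.g. a stock return
    problem_type="regression",   # MSE loss on float labels during training
)

inputs = tokenizer("Shares rallied after the earnings beat.", return_tensors="pt")
predicted_return = model(**inputs).logits.item()  # untrained here, so this is noise
```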
Kseniase posted an update about 19 hours ago
8 Emerging trends in Reinforcement Learning

Reinforcement learning is having a moment - and not just this week. Some of its directions are already showing huge promise, while others are still early but exciting. Here’s a look at what’s happening right now in RL:

1. Reinforcement Pre-Training (RPT) → Reinforcement Pre-Training (2506.08007)
Reframes next-token pretraining as RL with verifiable rewards, yielding scalable reasoning gains

2. Reinforcement Learning from Human Feedback (RLHF) → Deep reinforcement learning from human preferences (1706.03741)
The top approach. It trains a model using human preference feedback, first building a reward model and then optimizing the policy to generate outputs people prefer (a toy sketch of the reward-modeling step appears after this list)

3. Reinforcement Learning with Verifiable Rewards (RLVR) → Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs (2506.14245)
Moves from subjective (human-labeled) rewards to objective ones that can be verified automatically, as in math, code, or rubric-based rewards; see for example → Reinforcement Learning with Rubric Anchors (2508.12790), Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains (2507.17746)

4. Multi-objective RL → Pareto Multi-Objective Alignment for Language Models (2508.07768)
Trains LMs to balance multiple goals at once, like being helpful but also concise or creative, ensuring that improving one goal doesn’t ruin another

5. Parallel thinking RL → Parallel-R1: Towards Parallel Thinking via Reinforcement Learning (2509.07980)
Trains parallel chains of thought, boosting math accuracy and raising final performance ceilings. It first teaches the model the “parallel thinking” skill on easier problems, then uses RL to refine it on harder ones
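As promised in point 2, here is a toy sketch of the reward-modeling step behind RLHF (illustrative only; the subsequent policy-optimization stage with PPO or similar is not shown). The Bradley-Terry pairwise loss pushes the reward of the human-preferred response above that of the rejected one.

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry pairwise loss: minimized when r(chosen) >> r(rejected)
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Fake scalar rewards from a reward model scoring a batch of 4 preference pairs
reward_chosen = torch.tensor([1.2, 0.3, 2.0, -0.1])
reward_rejected = torch.tensor([0.4, 0.5, 1.1, -0.9])
loss = preference_loss(reward_chosen, reward_rejected)
```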

Read further below ⬇️
And if you like this, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Also, check out our recent guide about the past, present and future of RL: https://www.turingpost.com/p/rlguide
Parveshiiii posted an update 1 day ago
Ever wanted an open‑source deep research agent? Meet Deepresearch‑Agent 🔍🤖

1. Multi‑step reasoning: Reflects between steps, fills gaps, iterates until evidence is solid.

2. Research‑augmented: Generates queries, searches, synthesizes, and cites sources.

3. Fullstack + LLM‑friendly: React/Tailwind frontend, LangGraph/FastAPI backend; works with OpenAI/Gemini.


🔗 GitHub: https://github.com/Parveshiiii/Deepresearch-Agent
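To make the loop in points 1 and 2 concrete, here is a rough, hypothetical sketch of a reflect → search → synthesize cycle; every helper below is a placeholder, not the actual Deepresearch-Agent API.

```python
from dataclasses import dataclass, field

@dataclass
class ResearchState:
    question: str
    notes: list[str] = field(default_factory=list)

def generate_queries(state: ResearchState) -> list[str]:
    # Placeholder: an LLM would propose queries targeting remaining gaps
    return [state.question]

def web_search(query: str) -> str:
    # Placeholder: a search tool returning a cited snippet
    return f"(snippet with citation for: {query})"

def evidence_is_solid(state: ResearchState) -> bool:
    # Placeholder: a reflection step deciding whether to stop or iterate
    return len(state.notes) >= 3

def run_agent(question: str, max_steps: int = 5) -> str:
    state = ResearchState(question)
    for _ in range(max_steps):
        for query in generate_queries(state):
            state.notes.append(web_search(query))
        if evidence_is_solid(state):
            break
    # Placeholder: a final LLM call would synthesize an answer with citations
    return "\n".join(state.notes)

print(run_agent("What is retrieval-augmented generation?"))
```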
prithivMLmods posted an update 3 days ago
Try the Hugging Face Space demo for Logics-MLLM/Logics-Parsing, the latest multimodal VLM from the Logics Team at Alibaba Group. It enables end-to-end document parsing with precise content extraction in markdown format, and it also generates a clean HTML representation of the document while preserving its logical structure. 🤗🔥

Additionally, I’ve integrated one of my recent works — prithivMLmods/Gliese-OCR-7B-Post1.0 — which also excels at document comprehension.

⭐ Space / App : prithivMLmods/VLM-Parsing
📄 Technical Report by the Logics Team, Alibaba Group : Logics-Parsing Technical Report (2509.19760)
🖖 MM: VLM-Parsing: prithivMLmods/mm-vlm-parsing-68e33e52bfb9ae60b50602dc
⚡ Collections : prithivMLmods/multimodal-implementations-67c9982ea04b39f0608badb0
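If you would rather call the Space programmatically than through the web UI, here is a minimal sketch using gradio_client; the exact endpoint names and inputs are not documented in this post, so inspect them first with view_api().

```python
from gradio_client import Client

# Connect to the public demo Space
client = Client("prithivMLmods/VLM-Parsing")

# List the available endpoints and their parameters before calling anything
client.view_api()
```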

Other Pages:

➔ Multimodal VLMs - July'25 : prithivMLmods/multimodal-vlms-until-july25-688312e6b840e1e156f13027
➔ Multimodal VLMs - Aug'25 : prithivMLmods/multimodal-vlms-aug25-68a56aac39fe8084f3c168bd
➔ VL caption — < Sep 15 ’25 : prithivMLmods/vl-caption-sep-15-25-68c7f6d737985c63c13e2391

To learn more, visit the app page or the respective model pages!
andywu-kby posted an update 6 days ago
Hello everyone,
I hope you’re doing well.

We’re currently developing a chatbot that can analyze and forecast sales directly from Excel files. Do you think this would be useful?

Miragic-AI/Miragic-Sales-Pilot

Please share your feedback by reacting to this post with 👍 or 👎.

Best regards,
Xenova posted an update Aug 6
The next generation of AI-powered websites is going to be WILD! 🤯

In-browser tool calling & MCP are finally here, allowing LLMs to interact with websites programmatically.

To show what's possible, I built a demo using Liquid AI's new LFM2 model, powered by 🤗 Transformers.js: LiquidAI/LFM2-WebGPU

As always, the demo is open source (which you can find under the "Files" tab), so I'm excited to see how the community builds upon this! 🚀
Nymbo posted an update 2 days ago
I have a few Sora-2 invites - 15509N