Building on HF

asisai

asis-ai

https://www.linkedin.com/in/asissr/

asissr

AI & ML interests

Image/video automation, generative AI production workflows

Recent Activity

upvoted a collection about 12 hours ago

Favorites

Collection

1 item • Updated Feb 28, 2025 • 1

upvoted an article about 12 hours ago

Article

This Title Is Already Tokenized (Tokun P.2)

apehex • Sep 19, 2024 • 6

reacted to John6666's post with 👍 about 12 hours ago

Post

40484

I used up my Zero GPU Quota yesterday (about 12 hours ago). At the time, I got a message saying “Retry at 13:45 (approx.)”, but now it's just changed to “Retry at 03:22”.
Anyway, everyone, let's be careful not to use up our Quota...

Related: https://huggingface.co/posts/Keltezaa/754755723533287#67e6ed5e3394f1ed9ca41dbd

1 reply

upvoted an article about 12 hours ago

Article

📄 Research Papers Dataset

tegridydev • 20 days ago • 3

upvoted a paper about 12 hours ago

Moisesdb: A dataset for source separation beyond 4-stems

Paper • 2307.15913 • Published Jul 29, 2023 • 1

upvoted a collection about 12 hours ago

RAG/webRAG/QA

Collection

13 items • Updated 4 days ago • 1

upvoted a paper about 12 hours ago

Histoires Morales: A French Dataset for Assessing Moral Alignment

Paper • 2501.17117 • Published Jan 28, 2025 • 5

upvoted an article about 12 hours ago

Article

MIEB: The Benchmark That Stress-Tests Image-Text Embeddings Like Never Before

isaacchung • Apr 24, 2025 • 18

upvoted a paper 2 days ago

MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 48

reacted to pbhappliedsystems's post with 👀 2 days ago

Post

2278

🚀 **New flagship dataset — and an argument about what a dataset card should be.**

Most synthetic datasets on the Hub ship row counts, a license, and little else — pipeline opaque, rejection criteria unstated, compliance unaudited. We published the opposite.

**SynthEval Cloud — Regulated-Domain Synthetic Instruction Dataset**
👉 pbhappliedsystems/syntheval-cloud-regulated-instruct-1k

**1,116** quality-gated instruction records across **7 regulated domains** (medical, legal, GDPR, privacy, education, e-commerce, transport). Every record cleared a documented cascade, not a vibe check:

- 🧪 **Dual-signal hallucination gate** — rejects only when embedding cosine *and* keyword-overlap both fail; a low score alone never rejects.
- 🔒 **Layered PII masking + independent leak audit** — a separate over-reporting scanner found **0.0% residual leak** across all 1,116 records.
- 📊 **Whole-corpus evaluation, not a sample** — MATTR **0.769**, mean cosine **0.73**, **0%** near-duplicates, **96.9%** yield.
- 🧾 **The 36 rejections ship too**, each tagged with its failing gate. Removal at the gate is the product; we show our work.

Every number on the card is a field in the evaluation_report.json shipped beside the data — full methodology + provenance (Mistral-Nemo AWQ W4A16 · vLLM 0.8.5.post1 · Modal A10G).

One release from **SynthEval**: Studio (local GPU) + Cloud (Modal+vLLM), proving quality parity across substrates.

📄 Whitepaper: https://pbhappliedsystems.com/SynthEval_Studio_and_Cloud_Quality-Gated_Synthetic_Data_Generation.pdf
🔎 Overview: https://pbhappliedsystems.com/synthetic-data.html

**CC BY 4.0** — commercial use welcome, just credit it. Need defensible synthetic data at scale? Let's talk.

— Patrick Hill, PBH Applied Systems
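The dual-signal rule described in the post above (reject only when embedding cosine and keyword overlap both fail) can be illustrated with a minimal sketch. This is not the SynthEval implementation: the function names, the tokenization, and the thresholds `cos_min` and `overlap_min` are all assumptions chosen for illustration, and the post does not state the actual values.

```python
import math
import re


def cosine(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(source, answer):
    """Fraction of the answer's distinct tokens that also occur in the source."""
    src = set(re.findall(r"[a-z0-9]+", source.lower()))
    ans = set(re.findall(r"[a-z0-9]+", answer.lower()))
    return len(src & ans) / len(ans) if ans else 0.0


def gate_rejects(cos_score, overlap_score, cos_min=0.5, overlap_min=0.2):
    """Reject only if BOTH signals fall below their (hypothetical) thresholds."""
    return cos_score < cos_min and overlap_score < overlap_min
```

Under this rule a record with a low cosine score but strong keyword overlap is kept, which matches the post's claim that a low score alone never rejects.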

updated a Space 2 days ago

End Frame Background Lock

🔒

Lock background of a frame to match a reference image

published 2 Spaces 2 days ago

End Frame Background Lock

🔒

Lock background of a frame to match a reference image

Video Background Stabilizer

📽

Video background stabilization and matting.

updated a Space 3 days ago

Video Background Stabilizer

📽

Video background stabilization and matting.
