
Stephen Genusa PRO

StephenGenusa
·

AI & ML interests

LFM, LLM, Quantization, Vision, RAG/Hybrid/Graph, Multimodality, NLP (will take us further down the road with existing LLM tech)

Recent Activity

liked a model 18 days ago
prithivMLmods/Triangulum-10B

Organizations

Social Post Explorers

StephenGenusa's activity

posted an update 5 days ago
I have a Pro account and I am logged in. I duplicated a Space after hitting the error "You have exceeded your GPU quota". My dashboard shows 0 GPU use, yet I still cannot use the Space: "You have exceeded your GPU quota (60s requested vs. 44s left). Create a free account to get more daily usage quota." And "Expert Support" turns out to be a pitch for consulting.
·
reacted to vincentg64's post with 🔥 23 days ago
LLM 2.0, RAG & Non-Standard Gen AI on GitHub https://mltblog.com/3DsyZSq

In this article, I share my latest Gen AI and LLM advances, featuring innovative approaches radically different from both standard AI and classical ML/NLP. The focus is on doing better with less, using efficient architectures, new algorithms, and new evaluation metrics. It originates from research I started long ago that has gained significant momentum in the last two years. See background and history at https://mltblog.com/4g2sKTv.

OpenAI, Perplexity, Anthropic, Llama and others typically follow the trend and implement solutions very similar to mine within 3 to 6 months after I publish new milestones. For instance: multi-tokens, knowledge graph tokens, multi-indexes, real-time fine-tuning, mixtures of experts, LLM routers, small enterprise sub-LLMs, prompt distillation, a relevancy scoring engine, deep contextual retrieval, optimum agentic chunking, and a modern UI instead of the basic prompt box. I keep adding new features all the time, staying ahead of the competition.

โžก๏ธ Read full article with links to GitHub, at https://mltblog.com/3DsyZSq
  • 1 reply
·
reacted to m-ric's post with 🚀 4 months ago
๐—”๐—ฑ๐—ฑ ๐˜€๐—ผ๐˜‚๐—ฟ๐—ฐ๐—ฒ ๐—ต๐—ถ๐—ด๐—ต๐—น๐—ถ๐—ด๐—ต๐˜๐—ถ๐—ป๐—ด ๐˜๐—ผ ๐˜†๐—ผ๐˜‚๐—ฟ ๐—ฅ๐—”๐—š ๐˜€๐˜†๐˜€๐˜๐—ฒ๐—บ! ๐Ÿ“„๐Ÿ’ก

RAG systems are supposed to make your LLM's answer more trustworthy by inserting into the prompt some supporting documents from a knowledge base: we say that we're "adding some context".
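That "adding some context" step is just prompt assembly. A minimal sketch (the template and the `[doc N]` labels are made up for illustration, not from any particular RAG framework):

```python
def build_rag_prompt(question, retrieved_docs):
    """Assemble a RAG prompt: supporting documents are inserted
    before the question so the LLM can ground its answer in them."""
    context = "\n\n".join(
        f"[doc {i}] {doc}" for i, doc in enumerate(retrieved_docs, start=1)
    )
    return (
        "Answer using ONLY the context below, and cite the [doc N] "
        "labels you relied on.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```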

👎 But if you don't know which part of the answer was generated from which input tokens, it's hard to tell whether it was actually grounded in the context knowledge or not!

🤔 I've been working on the question: is it possible to add notes to the answer linking to which part of the context they're generated from?

And I've found a great solution: a technique called Layer-wise Relevance Propagation (LRP), showcased in a paper at ICML '24 by Reduan Achtibat et al., which lets you precisely score how important each input token was in generating your output! They've packaged it into a library called LXT.

📊 For each generated output token, LXT gives you attribution scores for each input token.

โš™๏ธ So I've worked a bit more on aggregating these scores into meaningful spans between successive input and output tokens, and I finally obtained my desired result: RAG with source highlighting!

Try the demo here 👉 m-ric/rag_highlights

Caveats:
- It slows down generation (quite a lot for now; hopefully this can be reduced)
- For now it supports only specific models: Llama models and Mixtral

If there's enough interest in this solution, I can improve it further and spin it off into a specific library for RAG! 🚀
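As a rough sketch of the attribution idea behind the post: each input token's relevance to a chosen output logit can be scored from the gradient of that logit with respect to the token embeddings. This is plain gradient-x-input on a toy model, not LXT's AttnLRP implementation; `TinyLM` and `input_token_relevance` are made-up names for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class TinyLM(nn.Module):
    """Toy stand-in for a causal LM: an embedding layer plus a
    per-position linear head (no attention, just for illustration)."""
    def __init__(self, vocab=10, dim=4):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, inputs_embeds):
        return self.head(inputs_embeds)  # (batch, seq, vocab) logits

def input_token_relevance(model, input_ids, output_pos):
    """Gradient-x-input relevance of every input token for the top
    logit at `output_pos` (a crude cousin of LRP attribution)."""
    embeds = model.emb(input_ids).detach().requires_grad_(True)
    logits = model(inputs_embeds=embeds)
    logits[0, output_pos].max().backward()         # scalar: top logit
    relevance = (embeds.grad * embeds).sum(-1)[0]  # sum over embed dim
    return relevance / relevance.abs().max().clamp_min(1e-9)
```

With a real model, LXT replaces the gradient-x-input step with LRP rules that propagate relevance through the attention layers, which is what makes the scores faithful enough for source highlighting.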
reacted to Wauplin's post with 🤗 4 months ago
What a great milestone to celebrate! The huggingface_hub library is slowly becoming a cornerstone of the Python ML ecosystem when it comes to interacting with the @huggingface Hub. It wouldn't be there without the hundreds of community contributions and feedback! Whether you are loading a model, sharing a dataset, running remote inference or starting jobs on our infra, you are for sure using it! And this is only the beginning, so give it a star if you wanna follow the project 👉 https://github.com/huggingface/huggingface_hub
  • 1 reply
·
New activity in mattshumer/ref_70_e3 4 months ago
replied to m-ric's post 4 months ago

I think there will be a big breakthrough as well, but I'd be surprised if it happens soon. If it does, I'd be happy. While LLM architectures continue to be refined, I don't see any evidence that significant progress is being made, and I personally think the current architectures are too primitive and inherently self-limiting. I also believe that bigger does not necessarily mean better: I think we have reached, or are near, the limits of where size dictates how powerful an LLM is.

Therefore I think that, given the current architectural limitations, external constraints (namely power availability and the many resources and costs of building better LLMs) will slow AI development until a radical change comes along.

We managed to survive without them, and now that we have them they are a great step forward; we'll continue using and improving what we have. There are many improvements that can be made around the LLM, using NLP to improve what we expect from LLMs, and that's where the focus will turn for the time being (xLLM, for example). Better architectures will have to account for the difference between statistical models of representations of the world and the way humans communicate through speech and writing.

replied to vincentg64's post 5 months ago

Vincent, thank you for your time, effort and especially for your willingness to share your expertise. I am really looking forward to this!

reacted to vincentg64's post with โค๏ธ 5 months ago
Hyperfast Contextual Custom LLM with Agents, Multitokens, Explainable AI, and Distillation https://mltblog.com/4dNPSnB

New additions to this ground-breaking system include multi-token distillation when processing prompts, agents to meet user intent, more NLP, and a command prompt menu accepting both standard prompts and various actions.

I also added several illustrations featuring xLLM in action, with a full session and sample commands to fine-tune in real time. All the code, input sources (an anonymized corporate corpus from a Fortune 100 company), and contextual backend tables, including embeddings, are on GitHub. My system has zero weights, no transformer, and no neural network. It relies on explainable AI, does not require training, is fully reproducible, and fits in memory. Yet your prompts can retrieve relevant full-text entities from the corpus with no latency, including URLs, categories, titles, email addresses, and so on, thanks to a well-designed architecture.

Read more, get the code, paper and everything for free, at https://mltblog.com/4dNPSnB
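A weight-free, in-memory retrieval layer of the kind the post describes can be sketched, very loosely, with an inverted index over corpus entities. This is not the actual xLLM code (which is on the author's GitHub); it only illustrates the general idea of retrieval by table lookup rather than by a neural network:

```python
from collections import defaultdict

def build_index(entities):
    """entities: {entity_id: text}. Returns an inverted index
    mapping each lowercase token to the set of entity ids using it."""
    index = defaultdict(set)
    for eid, text in entities.items():
        for token in text.lower().split():
            index[token].add(eid)
    return index

def retrieve(index, prompt, top_k=3):
    """Score entities by how many prompt tokens they share with the
    entity text, then return the top_k entity ids."""
    scores = defaultdict(int)
    for token in prompt.lower().split():
        for eid in index.get(token, ()):
            scores[eid] += 1
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Because both structures are plain in-memory dictionaries, lookups are effectively instant, which is the "no latency" property claimed for this style of architecture.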
  • 2 replies
·
reacted to ybelkada's post with 🔥 5 months ago
reacted to MonsterMMORPG's post with ๐Ÿš€โค๏ธ๐Ÿ”ฅ 6 months ago
FLUX Local & Cloud Tutorial With SwarmUI - FLUX: The Groundbreaking Open Source txt2img Model Outperforms Midjourney & Others - FLUX: The Anticipated Successor to SD3

🔗 Comprehensive Tutorial Video Link ▶️ https://youtu.be/bupRePUOA18

FLUX represents a milestone in open source txt2img technology, delivering superior quality and more accurate prompt adherence than #Midjourney, Adobe Firefly, Leonardo AI, Playground AI, Stable Diffusion, SDXL, SD3, and DALL-E 3. #FLUX, a creation of Black Forest Labs, boasts a team largely comprised of #StableDiffusion's original developers, and its output quality is truly remarkable. This statement is not hyperbole; you'll witness its capabilities in the tutorial. This guide will demonstrate how to effortlessly install and utilize FLUX models on your personal computer and on cloud platforms like Massed Compute, RunPod, and a complimentary Kaggle account.

🔗 FLUX Setup Guide (publicly accessible) ⤵️
▶️ https://www.patreon.com/posts/106135985

🔗 FLUX Models One-Click Robust Automatic Downloader Scripts ⤵️
▶️ https://www.patreon.com/posts/109289967

🔗 Primary Windows SwarmUI Tutorial (Essential for Usage Instructions) ⤵️
▶️ https://youtu.be/HKX8_F1Er_w

🔗 Cloud-based SwarmUI Tutorial (Massed Compute - RunPod - Kaggle) ⤵️
▶️ https://youtu.be/XFUZof6Skkw

🔗 SECourses Discord Server for Comprehensive Support ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 SECourses Reddit Community ⤵️
▶️ https://www.reddit.com/r/SECourses/

🔗 SECourses GitHub Repository ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

🔗 Official FLUX 1 Launch Announcement Blog Post ⤵️
▶️ https://blackforestlabs.ai/announcing-black-forest-labs/

Video Segments

0:00 Introduction to the state-of-the-art open source txt2img model FLUX
5:01 Process for integrating FLUX model into SwarmUI
....