Stephen Genusa PRO

StephenGenusa

AI & ML interests

LFM, LLM, Quantization, Vision, RAG/Hybrid/Graph, Multimodality, NLP (will take us further down the road with existing LLM tech)

Recent Activity

reacted to openfree's post with ➕ 18 days ago

🚀 Llama-4 Model-Based Agentic AI System Released! 🔥 Introducing the Latest Llama-4 Models Hello AI enthusiasts! Today we're excited to introduce our free API service powered by the cutting-edge Llama-4-Maverick-17B and Llama-4-Scout-17B models! These state-of-the-art models will upgrade your AI experience with remarkable stability and speed. Link1: https://huggingface.co/spaces/openfree/Llama-4-Maverick-17B-Research Link2: https://huggingface.co/spaces/openfree/Llama-4-Scout-17B-Research 🧠 The Innovation of Agentic AI: Deep Research Feature The standout feature of our service is the revolutionary "Deep Research" functionality! This innovative Agentic AI system includes: 🔍 Optimized Keyword Extraction: LLM automatically generates the most effective keywords for searches 🌐 Real-time Web Search: Collects the latest information through the SerpHouse API 📊 Intelligent Information Analysis: Precise analysis utilizing the LLM's reasoning capabilities based on collected information 📝 Contextualized Response Generation: Provides accurate answers incorporating the latest information from search results ⚡ Key Advantages 💯 Free API Service: Stable and fast LLM service through Fireworks AI 🧩 Easy Integration: Accessible through a simple Gradio interface 🔄 Streaming Responses: Minimized waiting time with real-time generated responses 🌍 Multilingual Support: Automatic detection and processing of various languages including Korean 🛠️ Technical Features The Llama-4-Maverick-17B model supports a context window of up to 20,480 tokens and automatically integrates web search results to always respond with the most current information. The model analyzes collected information through complex reasoning processes and constructs the most appropriate response to user queries. 🤝 Community Participation For more information and discussions, please join our Discord community (https://discord.gg/openfreeai)! Let's shape the future of AI together! Start now!

reacted to openfree's post with 👍 18 days ago

reacted to openfree's post with 🤯 18 days ago

View all activity

Organizations

StephenGenusa's activity

reacted to openfree's post with ➕👍🤯🤗😎👀🚀🔥❤️ 18 days ago

Post

8106

🚀 Llama-4 Model-Based Agentic AI System Released!

🔥 Introducing the Latest Llama-4 Models
Hello AI enthusiasts! Today we're excited to introduce our free API service powered by the cutting-edge Llama-4-Maverick-17B and Llama-4-Scout-17B models! These state-of-the-art models will upgrade your AI experience with remarkable stability and speed.

Link1: openfree/Llama-4-Maverick-17B-Research
Link2: openfree/Llama-4-Scout-17B-Research

🧠 The Innovation of Agentic AI: Deep Research Feature
The standout feature of our service is the revolutionary "Deep Research" functionality! This innovative Agentic AI system includes:

🔍 Optimized Keyword Extraction: LLM automatically generates the most effective keywords for searches
🌐 Real-time Web Search: Collects the latest information through the SerpHouse API
📊 Intelligent Information Analysis: Precise analysis utilizing the LLM's reasoning capabilities based on collected information
📝 Contextualized Response Generation: Provides accurate answers incorporating the latest information from search results

⚡ Key Advantages

💯 Free API Service: Stable and fast LLM service through Fireworks AI
🧩 Easy Integration: Accessible through a simple Gradio interface
🔄 Streaming Responses: Minimized waiting time with real-time generated responses
🌍 Multilingual Support: Automatic detection and processing of various languages including Korean

🛠️ Technical Features
The Llama-4-Maverick-17B model supports a context window of up to 20,480 tokens and automatically integrates web search results to always respond with the most current information. The model analyzes collected information through complex reasoning processes and constructs the most appropriate response to user queries.

🤝 Community Participation
For more information and discussions, please join our Discord community (https://discord.gg/openfreeai)! Let's shape the future of AI together!

Start now!

5 replies

reacted to bartowski's post with 👍 24 days ago

Post

73075

Switching to author_model-name

I posted a poll on twitter, and others have mentioned the interest in me using the convention of including the author name in the model path when I upload.

It has a couple advantages, first and foremost of course is ensuring clarity of who uploaded the original model (did Qwen upload Qwen2.6? Or did someone fine tune Qwen2.5 and named it 2.6 for fun?)

The second thing is that it avoids collisions, so if multiple people upload the same model and I try to quant them both, I would normally end up colliding and being unable to upload both

I'll be implementing the change next week, there are just two final details I'm unsure about:

First, should the files also inherit the author's name?

Second, what to do in the case that the author name + model name pushes us past the character limit?

Haven't yet decided how to handle either case, so feedback is welcome, but also just providing this as a "heads up"

5 replies

liked a Space 29 days ago

Pangolin

🐨

PangolinGuard Demo

reacted to tomaarsen's post with ❤️ 29 days ago

Post

2449

‼️Sentence Transformers v4.0 is out! You can now train and finetune reranker models with multi-GPU training, bf16 support, loss logging, callbacks & much more. I also prove that finetuning on your domain helps much more than you might think.

1️⃣ Reranker Training Refactor
Reranker models can now be trained using an extensive trainer with a lot of powerful features:
- MultiGPU Training (Data Parallelism (DP) and Distributed Data Parallelism (DDP))
- bf16 training support; loss logging
- Evaluation datasets + evaluation loss
- Improved callback support + an excellent Weights & Biases integration
- Gradient checkpointing, gradient accumulation
- Model card generation
- Resuming from a training checkpoint without performance loss
- Hyperparameter Optimization
and much more!

Read my detailed blogpost to learn about the components that make up this new training approach: https://huggingface.co/blog/train-reranker
Notably, the release is fully backwards compatible: all deprecations are soft, meaning that they still work but emit a warning informing you how to upgrade.

2️⃣ New Reranker Losses
- 11 new losses:
- 2 traditional losses: BinaryCrossEntropy and CrossEntropy
- 2 distillation losses: MSE and MarginMSE
- 2 in-batch negatives losses: MNRL (a.k.a. InfoNCE) and CMNRL
- 5 learning to rank losses: Lambda, p-ListMLE, ListNet, RankNet, ListMLE

3️⃣ New Reranker Documentation
- New Training Overview, Loss Overview, API Reference docs
- 5 new, 1 refactored training examples docs pages
- 13 new, 6 refactored training scripts
- Migration guides (2.x -> 3.x, 3.x -> 4.x)

4️⃣ Blogpost
Alongside the release, I've written a blogpost where I finetune ModernBERT on a generic question-answer dataset. My finetunes easily outperform all general-purpose reranker models, even models 4x as big. Finetuning on your domain is definitely worth it: https://huggingface.co/blog/train-reranker

See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/v4.0.1

liked a model 29 days ago

deepseek-ai/DeepSeek-V3-0324

Text Generation • Updated 28 days ago • 252k • • 2.73k

liked a model 4 months ago

prithivMLmods/Triangulum-10B

Text Generation • Updated Jan 4 • 23 • 4

reacted to vincentg64's post with 🔥 4 months ago

Post

2246

LLM 2.0, RAG & Non-Standard Gen AI on GitHub https://mltblog.com/3DsyZSq

In this article, I share my latest Gen AI and LLM advances, featuring innovative approaches radically different from both standard AI and classical ML/NLP. The focus is on doing better with less, using efficient architectures, new algorithms and evaluation metrics. It originates from research that I started long ago. It gained significant momentum in the last two years. See background and history at https://mltblog.com/4g2sKTv.

OpenAI, Perplexity, Anthropic, Llama and others typically follow the trend and implement solutions very similar to mines within 3 to 6 months after I publish new milestones. For instance, multi-tokens, knowledge graph tokens, multi-indexes, real-time fine-tuning, mixtures of experts, LLM routers, small enterprise sub-LLMs, prompt distillation, relevancy scoring engine, deep contextual retrieval, optimum agentic chunking, and modern UI instead of the basic prompt box. I keep adding new features all the time, staying ahead of competition.

➡️ Read full article with links to GitHub, at https://mltblog.com/3DsyZSq

1 reply

liked a model 4 months ago

bespokelabs/Bespoke-MiniCheck-7B

Text Classification • Updated Dec 20, 2024 • 1.68k • 73

upvoted a paper 6 months ago

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26, 2024 • 74