Tim Dolan

macadeliccc

AI & ML interests

None yet

Posts 10

Post

1082

I have updated my tool calling playground repo again, this time to add image generation with flux1-schnell or flux1-dev. The feature works much like using DALL-E 3 via the @ decorator in ChatGPT. Once the function is selected, the model either extracts your prompt as written or improves it, depending on how you ask.
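To make the flow concrete, here is a minimal sketch of how a tool call for image generation could be declared and dispatched. This is not the code from the repo: the schema layout, the `generate_image` function, the `variant` parameter, and the `dispatch_tool_call` helper are all hypothetical names chosen for illustration, and the backend is a placeholder rather than a real Flux call.

```python
import json

# Hypothetical tool schema in the JSON-schema style used by many
# chat-completion APIs. Names and fields are illustrative assumptions.
GENERATE_IMAGE_TOOL = {
    "type": "function",
    "function": {
        "name": "generate_image",
        "description": "Generate an image from a text prompt with Flux.",
        "parameters": {
            "type": "object",
            "properties": {
                "prompt": {"type": "string"},
                "variant": {"type": "string", "enum": ["schnell", "dev"]},
            },
            "required": ["prompt"],
        },
    },
}


def generate_image(prompt, variant="schnell"):
    """Placeholder backend: a real one would call Flux and return an image."""
    return {"prompt": prompt, "variant": variant}


# Registry mapping tool names to Python callables.
TOOLS = {"generate_image": generate_image}


def dispatch_tool_call(raw_call):
    """Parse a JSON tool call emitted by the model and run the matching function."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# The model either passes the user's prompt through or rewrites it first.
result = dispatch_tool_call(
    '{"name": "generate_image", "arguments": {"prompt": "a red fox in snow"}}'
)
```

In this sketch the prompt "extraction or improvement" happens on the model side: whatever string the model places in `arguments.prompt` is what the image backend receives.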

I have also included two notebooks that cover different ways to access Flux, depending on your use case. The first shows how to access Flux via LitServe from Lightning AI. LitServe is a bare-bones inference engine that focuses on modularity rather than raw performance. It supports text generation models as well as image generation, which is great for some use cases, but it does not provide the caching mechanisms of a dedicated image generation solution.

Since dedicated caching mechanisms are so crucial to performance, I also included an example of how to integrate SwarmUI/ComfyUI, so you can use dedicated infrastructure that may already be running as part of your tech stack. The result is a Llama-3.1 that can use specific ComfyUI JSON configs and many different settings.
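As a rough illustration of what "using a ComfyUI JSON config" can mean, the sketch below patches a prompt and a few sampler settings into a node-graph workflow before it is sent to the server. The trimmed workflow, the node ids, and the `build_payload` helper are hypothetical; the sketch assumes the payload is posted to ComfyUI's `/prompt` endpoint, and a real Flux workflow exported from ComfyUI will contain more nodes than shown here.

```python
import copy
import json

# Hypothetical, heavily trimmed workflow in ComfyUI's API (node-graph) JSON
# format. Node ids and the choice of nodes are illustrative assumptions.
BASE_WORKFLOW = {
    "6": {
        "class_type": "CLIPTextEncode",
        "inputs": {"text": "", "clip": ["4", 1]},
    },
    "3": {
        "class_type": "KSampler",
        "inputs": {"seed": 0, "steps": 4, "positive": ["6", 0]},
    },
}


def build_payload(workflow, prompt, prompt_node="6", sampler_node="3",
                  **sampler_overrides):
    """Return a JSON request body with the prompt and settings filled in."""
    wf = copy.deepcopy(workflow)  # leave the base config untouched
    wf[prompt_node]["inputs"]["text"] = prompt
    wf[sampler_node]["inputs"].update(sampler_overrides)
    return json.dumps({"prompt": wf})


# The tool-calling model would supply the prompt and any setting overrides.
body = build_payload(BASE_WORKFLOW, "a lighthouse at dusk", steps=8, seed=42)
```

Keeping the base workflow immutable and copying it per request means the same config can be reused across tool calls with different settings.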

Lastly, I measured the response times of each method over a small batch of requests as a simple speed test.

It quickly becomes clear how much efficient caching can reduce generation time, even when a second model is called. An average response time of 4.5 seconds is not bad at all when you consider that an 8B model is calling a 12B-parameter model for a secondary generation.
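A speed test like this can be sketched in a few lines. The example below is not the benchmark from the notebooks: `time_requests` and `fake_generate` are hypothetical names, and the stand-in backend simply sleeps, so the numbers it produces say nothing about Flux. To measure a real setup, replace `fake_generate` with a function that sends one request to your server.

```python
import statistics
import time


def time_requests(generate, prompts):
    """Call generate(prompt) once per prompt and return per-request latencies in seconds."""
    latencies = []
    for prompt in prompts:
        start = time.perf_counter()
        generate(prompt)
        latencies.append(time.perf_counter() - start)
    return latencies


def fake_generate(prompt):
    """Stand-in for a real image generation request (assumption for the demo)."""
    time.sleep(0.01)
    return f"image for {prompt}"


latencies = time_requests(fake_generate, ["a cat", "a dog", "a fox"])
mean_latency = statistics.mean(latencies)
print(f"mean latency over {len(latencies)} requests: {mean_latency:.3f}s")
```

With a caching backend the first request usually includes model loading, so it can be worth reporting the first latency separately from the average of the rest.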

Repo: https://github.com/tdolan21/tool-calling-playground
LitServe: https://github.com/Lightning-AI/LitServe
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

Articles 1

Article

3

Deploy hundreds of open source models on one GPU using LoRAX

Collections 11

Spaces 10

Running on Zero

Laser Dolphin MoE Chat

Chat with an advanced AI assistant

SOLAR-math-2x10.7b MoE-Chat

Argilla Space Template

SOLAR Math v0.2 Chat GGUF

Orca SOLAR 4x10.7b Chat GGUF

Piccolo Math MoE Chat GGUF

Models 135

macadeliccc/magistrate-3.2-3b-it-GGUF

Text Generation • Updated Oct 1, 2024 • 95 • 1

macadeliccc/magistrate-3.2-3b-it

Text Generation • Updated Oct 1, 2024 • 2

macadeliccc/magistrate-3.2-3b-base

Text Generation • Updated Oct 1, 2024 • 143 • 1

macadeliccc/distil-gemma-2-2b

Updated Aug 2, 2024 • 2 • 1

macadeliccc/llama-3-8b-instruct-pte

Updated Jul 31, 2024

macadeliccc/gemma-2b-legal-summary-lora

Text Generation • Updated Jul 27, 2024 • 1 • 1

macadeliccc/gemma-2b-openai-content-moderation

Text Generation • Updated Jul 8, 2024 • 10 • 1

macadeliccc/gemma-2b-pubmed-classifier

Text Generation • Updated Jun 30, 2024 • 1 • 1

macadeliccc/gemma-2b-orca-mwp-lora-v0.2

Text Generation • Updated Jun 29, 2024

macadeliccc/Qwen2-7B-AWQ

Text Generation • Updated Jun 28, 2024

Datasets 15

macadeliccc/US-SupremeCourt-RAG

Viewer • Updated Nov 4, 2024 • 1.1k • 30 • 1

macadeliccc/US-LegalKit

Viewer • Updated Aug 4, 2024 • 524k • 136 • 6

macadeliccc/US-FederalLaws

Viewer • Updated Jul 24, 2024 • 111k • 33 • 3

macadeliccc/US-SupremeCourtVerdicts

Viewer • Updated Jul 9, 2024 • 1.1k • 22 • 1

macadeliccc/distilabel-neurology-preferences-2k-cleaner

Viewer • Updated Jul 4, 2024 • 1.99k • 45

macadeliccc/opus_samantha

Viewer • Updated Jun 21, 2024 • 3.19k • 118 • 21

macadeliccc/truthy-dpo-v0.1-orca-format

Viewer • Updated Feb 28, 2024 • 1.02k • 37 • 1

macadeliccc/distilabel-neurology-preferences-2k-orca-format

Viewer • Updated Feb 22, 2024 • 1.99k • 23 • 1

macadeliccc/distilabel-neurology-preferences-2k

Viewer • Updated Feb 18, 2024 • 2k • 39

macadeliccc/distilabel-neurology-instructions

Viewer • Updated Feb 15, 2024 • 4k • 42 • 1