Jan Jakubcik

Johnz86

AI & ML interests

None yet

Recent Activity

liked a dataset about 1 month ago
dyyyyyyyy/ScaleQuest-Math

Organizations

None yet

Johnz86's activity

Reacted to DavidGF's post with πŸ”₯ 6 months ago
The kraken has awakened!
A Game-Changer in LLM Flexibility and Performance!

Over the past few weeks, VAGO solutions teamed up with Cognitive Computations and HyperSpace to develop a groundbreaking architecture that redefines flexibility in combining different LLMs into one model.

@fernandofernandes, @Crystalcareai, @ehartford, and I created the Kraken!

What Can It Do? πŸ™
βœ… Versatile Architecture: Kraken allows the seamless combination of LLMs with varying sizes, quantizations, and model architectures. It currently supports 4-bit, 8-bit, and AWQ quantization, with more on the way, and it runs on Hugging Face Transformers 4.40+.

βœ… Kraken Router: Utilizing a custom sequence classification model with a context length of 32k tokens, the Kraken Router directs inputs to the most suitable Expert based on their characteristics (see the routing sketch after this list).

βœ… Adaptability: Enhanced input formatting supports the model’s adaptability to diverse conversational contexts.

βœ… Extreme Versatility: Easily swap experts within Kraken for your specific use cases without retraining the entire model. For example, if you've built a Kraken for coding in Python, you can upgrade your Python model without retraining the router, or add a C# model by retraining only the router.

βœ… Open Source Pipeline: We’re sharing the entire pipeline, including router creation, training, architecture setup, and Kraken inference, in Jupyter notebooks: https://github.com/cognitivecomputations/kraken
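
To give a feel for how the routing step fits together, here is a minimal, self-contained sketch in plain Hugging Face Transformers: a sequence-classification router picks an expert, and only that expert is loaded for generation. The checkpoint paths, label mapping, and two-step load are illustrative assumptions, not the actual Kraken code; the real pipeline is in the linked notebooks.

```python
# Conceptual sketch of router-to-expert dispatch, NOT the actual Kraken implementation.
# ROUTER_ID and the EXPERTS mapping are hypothetical placeholders.
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

ROUTER_ID = "path/to/router"              # hypothetical sequence-classification router
EXPERTS = {                               # hypothetical router label -> expert checkpoint
    "python": "path/to/python-expert",
    "csharp": "path/to/csharp-expert",
}

def route_and_generate(prompt: str) -> str:
    # 1) Classify the prompt with the router to pick the best-suited expert.
    router_tok = AutoTokenizer.from_pretrained(ROUTER_ID)
    router = AutoModelForSequenceClassification.from_pretrained(ROUTER_ID)
    inputs = router_tok(prompt, return_tensors="pt", truncation=True)
    label_id = router(**inputs).logits.argmax(dim=-1).item()
    expert_id = EXPERTS[router.config.id2label[label_id]]

    # 2) Generate with the selected expert only; the other experts are never loaded.
    expert_tok = AutoTokenizer.from_pretrained(expert_id)
    expert = AutoModelForCausalLM.from_pretrained(expert_id, device_map="auto")
    out = expert.generate(
        **expert_tok(prompt, return_tensors="pt").to(expert.device),
        max_new_tokens=256,
    )
    return expert_tok.decode(out[0], skip_special_tokens=True)
```

Because only the router sees every request, swapping or upgrading an expert only requires updating the mapping (and retraining the router if a new label is added), which is the versatility described above.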

Kraken marks the beginning of an exciting new journey in #OpenSource LLMs. Why? Because it empowers the open-source community to accelerate the catch-up with proprietary LLMs like #GPT and #Claude 🀩

We proudly introduce the very first two Kraken models, which integrate top-tier LLM and multilingual capabilities:
cognitivecomputations/Kraken
VAGOsolutions/Kraken-Multilingual
Right now it's supported by the Hugging Face transformers library. Would love to see integration into VLM and TGWI!
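
As a quick-start illustration (not an official snippet from the repo), loading one of the released checkpoints with transformers would look roughly like the sketch below. Since Kraken is a custom architecture, trust_remote_code=True is assumed to be required, and the exact entry point and generation settings may differ from what the model cards and notebooks show.

```python
# Minimal loading sketch for a released Kraken checkpoint (assumptions noted above).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/Kraken"   # or "VAGOsolutions/Kraken-Multilingual"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,   # custom architecture shipped with the repo, not a stock class
    device_map="auto",
)

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```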