DavidGF committed on
Commit
47e01b4
1 Parent(s): d81f6d7

Create README.md

  1. README.md +165 -0
README.md ADDED
@@ -0,0 +1,165 @@
[![Discord](https://img.shields.io/discord/1156064224225808488?logo=Discord&logoColor=%23ffffff&label=Discord&link=https%3A%2F%2Fdiscord.gg%2FtCMkMDDHwm)](https://discord.gg/cognitivecomputations)
Discord: https://discord.gg/cognitivecomputations
![Kraken](https://vago-solutions.de/wp-content/uploads/2024/05/Kraken_Pic.png "Kraken")

## Overview

**Kraken**, both the model and the architecture, is a **joint effort** between **Cognitive Computations**, **VAGO Solutions** and **Hyperspace.ai**.

Created by **Fernando Fernandes Neto**, **David Golchinfar**, **Lucas Atkins** and **Eric Hartford**.

The Kraken model combines the best Python, SQL, function calling, reasoning and foreign-language models so far.

The Kraken Architecture is a machine learning framework for dynamic text generation tasks. It uses the Hugging Face transformers library to orchestrate multiple causal language models (CLMs) and routes each input to one of them based on the context and content of the input text. A custom configuration class (`KrakenConfig`) handles the integration and management of the components: tokenizers, models, and routing mechanisms.

## Features

- **Dynamic Model Routing:** Uses a sequence classification model to route inputs to the most suitable language model based on the input's characteristics.
- **Multiple Language Models:** Supports integration of various pre-trained causal language models, allowing for flexible, context-appropriate responses.
- **Customizable Templates:** Includes support for input formatting using predefined templates, enhancing the model's adaptability to different conversational contexts.
- **Extensible Configuration:** Leverages a custom configuration setup that can be easily extended and adapted for various use cases involving causal language modeling.

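The routing step described under Dynamic Model Routing can be pictured with the minimal sketch below. It is not the Kraken implementation: the function `pick_expert`, the `EXPERT_KEYS` ordering and the example scores are assumptions made for illustration. It only shows the general idea of turning one classifier score per expert into a choice of expert.

```python
import math

# Hypothetical label order for the router's output classes; the real mapping
# lives in the Kraken router and config and may differ.
EXPERT_KEYS = ["expert1", "expert2", "expert3", "expert4", "expert5"]


def softmax(logits):
    """Convert raw classifier scores into probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_expert(logits, expert_keys=EXPERT_KEYS):
    """Return (expert key, probability) for the highest-scoring class."""
    if len(logits) != len(expert_keys):
        raise ValueError("expected one logit per expert")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return expert_keys[best], probs[best]


# Made-up scores in which the fourth class (index 3) has the highest value.
key, confidence = pick_expert([0.1, -1.2, 0.3, 2.5, 0.0])
print(key, round(confidence, 3))
```

In this sketch the chosen key would then be used to look up the matching model and tokenizer entries in the configuration.
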
## Selected Models as Experts:
```
"Reasoning Expert": "microsoft/Phi-3-medium-128k-instruct",
"Function Calling Expert": "gorilla-llm/gorilla-openfunctions-v2",
"Python Expert": "ise-uiuc/Magicoder-S-DS-6.7B",
"SQL Expert": "defog/llama-3-sqlcoder-8b",
"German Expert": "VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct"
```

**How to load and call the Kraken model:**
```
from transformers import AutoConfig, AutoModelForCausalLM
from configuration_kraken import KrakenConfig
from modeling_kraken import KrakenForCausalLM

# Register the custom Kraken classes so the Auto* loaders can resolve them.
AutoConfig.register("kraken", KrakenConfig)
AutoModelForCausalLM.register(KrakenConfig, KrakenForCausalLM)

device = "cuda:0"  # Use "cuda:0" on NVIDIA GPUs, "mps" on a Mac

# Load the model and config:
config = AutoConfig.from_pretrained("./kraken_model")
model = AutoModelForCausalLM.from_pretrained("./kraken_model", config=config, trust_remote_code=True)
```

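The `device` string in the loading snippet is set by hand. As an optional convenience, it could be chosen at runtime, roughly as sketched below. The helper names `pick_device` and `detect_device` are hypothetical, and the `"cpu"` fallback is an assumption: the README itself only mentions `"cuda:0"` and `"mps"`.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device string from two availability flags."""
    if cuda_available:
        return "cuda:0"
    if mps_available:
        return "mps"
    return "cpu"  # assumed fallback, not mentioned in the README


def detect_device() -> str:
    """Ask PyTorch which backends are available; use CPU if it is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no `torch.backends.mps` module.
    mps = getattr(torch.backends, "mps", None)
    mps_available = mps is not None and mps.is_available()
    return pick_device(torch.cuda.is_available(), mps_available)


device = detect_device()
print(device)
```

The resulting `device` value can be passed to `.to(device)` in the examples that follow.
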
## Call the Reasoning Expert:
```
# Uses `model` and `device` from the loading snippet above.
messages = [
    {'role': 'system', 'content': 'You are a helpful AI Assistant'},
    {'role': 'user', 'content': "Find the mass percentage of Ba in BaO"}
]

tokenizer = model.tokenizer
input_text = tokenizer.apply_chat_template(messages, tokenize=False)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
output_ids = model.generate(input_ids, max_length=250)
# expert_tokenizer(text=...) supplies the tokenizer used for decoding.
print(model.expert_tokenizer(text=input_text).decode(output_ids[0], skip_special_tokens=True))
```

## Call the Function Calling Expert:
```
# Uses `model` and `device` from the loading snippet above.
messages = [
    {'role': 'system', 'content': """You are a helpful assistant with access to the following functions. Use them if required -
{ "name": "calculate_area", "description": "Calculate the area of a shape", "parameters": { "type": "object", "properties": { "shape": { "type": "string", "description": "The type of shape (circle, rectangle, triangle, etc.)" }, "dimensions": { "type": "object", "properties": { "length": { "type": "number", "description": "The length of the shape" }, "width": { "type": "number", "description": "The width of the shape" }, "base": { "type": "number", "description": "The base of the shape" }, "height": { "type": "number", "description": "The height of the shape" } } } }, "required": [ "shape" ] } }"""},
    {'role': 'user', 'content': """I need to calculate the area of a rectangle. The length is 5 and the width is 3."""}
]

tokenizer = model.tokenizer
input_text = tokenizer.apply_chat_template(messages, tokenize=False)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
# Low temperature keeps the generated function call close to deterministic.
output_ids = model.generate(input_ids, temperature=0.1, do_sample=True, top_p=0.9, top_k=20, max_length=500)
print(model.expert_tokenizer(text=input_text).decode(output_ids[0], skip_special_tokens=True))
```

## Call the Python Expert:
```
# Uses `model` and `device` from the loading snippet above.
messages = [
    {'role': 'system', 'content': ''},
    {'role': 'user', 'content': """Create a python function to calculate the sum of a sequence of integers.
[1, 2, 3, 4, 5]"""}
]

tokenizer = model.tokenizer
input_text = tokenizer.apply_chat_template(messages, tokenize=False)
print(input_text)  # Show the formatted prompt before generation
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
output_ids = model.generate(input_ids, temperature=0.6, do_sample=True, top_p=0.9, top_k=20, max_length=400)
print(model.expert_tokenizer(text=input_text).decode(output_ids[0], skip_special_tokens=True))
```

## Call the SQL Expert:
```
# Uses `model` and `device` from the loading snippet above.
messages = [
    {'role': 'system', 'content': 'You are a helpful AI assistant.'},
    {'role': 'user', 'content': """Generate a SQL query to answer this question: What is the total volume of timber sold by each salesperson, sorted by salesperson?

DDL statements:
CREATE TABLE salesperson (salesperson_id INT, name TEXT, region TEXT); INSERT INTO salesperson (salesperson_id, name, region) VALUES (1, 'John Doe', 'North'), (2, 'Jane Smith', 'South'); CREATE TABLE timber_sales (sales_id INT, salesperson_id INT, volume REAL, sale_date DATE); INSERT INTO timber_sales (sales_id, salesperson_id, volume, sale_date) VALUES (1, 1, 120, '2021-01-01'), (2, 1, 150, '2021-02-01'), (3, 2, 180, '2021-01-01');"""}
]

tokenizer = model.tokenizer
input_text = tokenizer.apply_chat_template(messages, tokenize=False)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
output_ids = model.generate(input_ids, temperature=0.6, do_sample=True, top_p=0.9, top_k=20, max_length=500)
print(model.expert_tokenizer(text=input_text).decode(output_ids[0], skip_special_tokens=True))
```

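One way to sanity-check whatever query the SQL expert returns is to load the DDL from the prompt into an in-memory SQLite database and compare against a reference query. The sketch below does that with Python's standard `sqlite3` module. `REFERENCE_QUERY` is hand-written for this illustration and is not output from the model; reading "sorted by salesperson" as sorting by name is likewise an assumption.

```python
import sqlite3

# DDL and sample rows copied from the prompt above.
SCHEMA = """
CREATE TABLE salesperson (salesperson_id INT, name TEXT, region TEXT);
INSERT INTO salesperson (salesperson_id, name, region) VALUES (1, 'John Doe', 'North'), (2, 'Jane Smith', 'South');
CREATE TABLE timber_sales (sales_id INT, salesperson_id INT, volume REAL, sale_date DATE);
INSERT INTO timber_sales (sales_id, salesperson_id, volume, sale_date) VALUES (1, 1, 120, '2021-01-01'), (2, 1, 150, '2021-02-01'), (3, 2, 180, '2021-01-01');
"""

# Hand-written reference query (an assumption about the intended answer,
# not output from the SQL expert).
REFERENCE_QUERY = """
SELECT s.name, SUM(t.volume) AS total_volume
FROM salesperson AS s
JOIN timber_sales AS t ON t.salesperson_id = s.salesperson_id
GROUP BY s.salesperson_id, s.name
ORDER BY s.name;
"""


def run_query(query):
    """Run a query against a fresh in-memory copy of the sample data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


rows = run_query(REFERENCE_QUERY)
print(rows)
```

A generated query can be passed to `run_query` in the same way and its rows compared with `rows`.
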
## Call the German Expert:
```
# Uses `model` and `device` from the loading snippet above.
# The prompts stay in German so that the input is routed to the German expert.
messages = [
    # "You are a friendly and helpful German AI assistant"
    {'role': 'system', 'content': 'Du bist ein freundlicher und hilfreicher deutscher KI-Assistent'},
    # "I hope you are doing well?"
    {'role': 'user', 'content': "Ich hoffe es geht dir gut?"}
]

tokenizer = model.tokenizer
input_text = tokenizer.apply_chat_template(messages, tokenize=False)
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
output_ids = model.generate(input_ids, max_length=150)
print(model.expert_tokenizer(text=input_text).decode(output_ids[0], skip_special_tokens=True))
```

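The five expert examples above repeat the same four steps: format the chat, tokenize, generate, decode. A small wrapper along the following lines could remove the repetition. `ask_kraken` is a hypothetical helper name, not part of Kraken; the sketch assumes `model` exposes `tokenizer`, `generate` and `expert_tokenizer` exactly as they are used in the examples.

```python
def ask_kraken(model, messages, device, **generate_kwargs):
    """Format a chat, generate with Kraken, and return the decoded text.

    Mirrors the calls from the README examples. `model` is assumed to be an
    already loaded Kraken model; `generate_kwargs` are passed to `generate`.
    """
    tokenizer = model.tokenizer
    input_text = tokenizer.apply_chat_template(messages, tokenize=False)
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
    output_ids = model.generate(input_ids, **generate_kwargs)
    # Same decoding call as in the examples above.
    decoder = model.expert_tokenizer(text=input_text)
    return decoder.decode(output_ids[0], skip_special_tokens=True)
```

With a loaded model, the reasoning example would then read `print(ask_kraken(model, messages, device, max_length=250))`.
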
## Switch expert and/or quantization:
Open the config file in the kraken_model folder and adjust the entries below. The `#` comments are annotations for this README only; JSON does not allow comments, so leave them out of the actual file.
```
    "models": {
      "expert1": "microsoft/Phi-3-medium-128k-instruct", # Switch to a Reasoning Expert of your choice
      "expert2": "gorilla-llm/gorilla-openfunctions-v2", # Switch to a Function Calling Expert of your choice
      "expert3": "ise-uiuc/Magicoder-S-DS-6.7B", # Switch to a Python Expert of your choice
      "expert4": "defog/llama-3-sqlcoder-8b", # Switch to a SQL Expert of your choice
      "expert5": "VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct" # Switch to a German Expert of your choice
    },
    # Currently supported: "4bit", "8bit" and "awq"
    "quantization": {
      "expert1": null,
      "expert2": null,
      "expert3": null,
      "expert4": null,
      "expert5": null
    },
    "router": "kraken_router",
    # Adjust the tokenizer to your selected model
    "tokenizers": {
      "expert1": "microsoft/Phi-3-medium-128k-instruct",
      "expert2": "gorilla-llm/gorilla-openfunctions-v2",
      "expert3": "ise-uiuc/Magicoder-S-DS-6.7B",
      "expert4": "defog/llama-3-sqlcoder-8b",
      "expert5": "VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct"
    }
  },
  "model_type": "kraken",
  "torch_dtype": "float32",
  "transformers_version": "4.41.0"
}
```

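When an expert is swapped, three maps in the config fragment above have to stay in step: `models`, `tokenizers` and `quantization`. The sketch below shows one way to apply such a change to an in-memory dict. `switch_expert` is a hypothetical helper, and the `section` dict is a shortened stand-in for the block shown above; the key that wraps this block in the real file is not shown in the README, so the helper works on whatever dict holds these three maps.

```python
import copy

# Values the README lists as currently supported; None means no quantization.
SUPPORTED_QUANTIZATION = {None, "4bit", "8bit", "awq"}


def switch_expert(section, expert_key, model_id, quantization=None, tokenizer_id=None):
    """Return a copy of `section` with one expert swapped.

    `section` must contain the "models", "quantization" and "tokenizers"
    maps. The input dict is left unchanged.
    """
    if quantization not in SUPPORTED_QUANTIZATION:
        raise ValueError(f"unsupported quantization: {quantization!r}")
    if expert_key not in section["models"]:
        raise KeyError(f"unknown expert: {expert_key}")
    updated = copy.deepcopy(section)
    updated["models"][expert_key] = model_id
    # Keep the tokenizer in step with the model unless one is given explicitly.
    updated["tokenizers"][expert_key] = tokenizer_id or model_id
    updated["quantization"][expert_key] = quantization
    return updated


# Shortened stand-in for the config block, with a single expert.
section = {
    "models": {"expert4": "defog/llama-3-sqlcoder-8b"},
    "quantization": {"expert4": None},
    "router": "kraken_router",
    "tokenizers": {"expert4": "defog/llama-3-sqlcoder-8b"},
}

# Keep the same SQL expert but request 4-bit quantization.
patched = switch_expert(section, "expert4", "defog/llama-3-sqlcoder-8b", quantization="4bit")
print(patched["quantization"])
```

Returning a copy instead of editing in place makes it easy to compare the old and new settings before writing anything back to disk.
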
## Cite As

Fernando Fernandes Neto, David Golchinfar, Lucas Atkins, Eric Hartford - [Kraken: An OpenSource Collection of Experts Model, 2024](https://github.com/cognitivecomputations/kraken)