hunterbown committed on
Commit 0608478 · verified · 1 Parent(s): 54cd21a

fix: update notebook to work directly from HuggingFace

Files changed (1)
  1. notebooks/SCU_Demo.ipynb +382 -388
notebooks/SCU_Demo.ipynb CHANGED
@@ -1,391 +1,385 @@
1
  {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {
6
- "id": "header"
7
- },
8
- "source": [
9
- "# Shannon Control Unit (SCU) Demo\n",
10
- "\n",
11
- "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Hmbown/shannon-control-unit/blob/main/notebooks/SCU_Demo.ipynb)\n",
12
- "\n",
13
- "This notebook demonstrates the Shannon Control Unit - an adaptive regularization system that achieves **15.6% lower perplexity** without manual hyperparameter tuning."
14
- ]
15
- },
16
- {
17
- "cell_type": "markdown",
18
- "metadata": {
19
- "id": "installation"
20
- },
21
- "source": [
22
- "## 1. Installation\n",
23
- "\n",
24
- "First, install the required packages:"
25
- ]
26
- },
27
- {
28
- "cell_type": "code",
29
- "execution_count": null,
30
- "metadata": {
31
- "id": "install_packages"
32
- },
33
- "outputs": [],
34
- "source": [
35
- "!pip install -q transformers peft torch accelerate\n",
36
- "!pip install -q matplotlib pandas numpy"
37
- ]
38
- },
39
- {
40
- "cell_type": "markdown",
41
- "metadata": {
42
- "id": "load_model"
43
- },
44
- "source": [
45
- "## 2. Load Model with SCU Adapter\n",
46
- "\n",
47
- "Load the base Llama model and apply the SCU-trained adapter:"
48
- ]
49
- },
50
- {
51
- "cell_type": "code",
52
- "execution_count": null,
53
- "metadata": {
54
- "id": "load_model_code"
55
- },
56
- "outputs": [],
57
- "source": [
58
- "import torch\n",
59
- "from transformers import AutoModelForCausalLM, AutoTokenizer\n",
60
- "from peft import PeftModel\n",
61
- "\n",
62
- "# Check available device\n",
63
- "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
64
- "print(f'Using device: {device}')\n",
65
- "\n",
66
- "# Load base model\n",
67
- "base_model_id = 'meta-llama/Llama-3.2-1B'\n",
68
- "print(f'Loading base model: {base_model_id}...')\n",
69
- "\n",
70
- "base_model = AutoModelForCausalLM.from_pretrained(\n",
71
- " base_model_id,\n",
72
- " device_map='auto',\n",
73
- " torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,\n",
74
- " trust_remote_code=True\n",
75
- ")\n",
76
- "\n",
77
- "tokenizer = AutoTokenizer.from_pretrained(base_model_id)\n",
78
- "if tokenizer.pad_token is None:\n",
79
- " tokenizer.pad_token = tokenizer.eos_token\n",
80
- "\n",
81
- "print('Base model loaded successfully!')"
82
- ]
83
- },
84
- {
85
- "cell_type": "code",
86
- "execution_count": null,
87
- "metadata": {
88
- "id": "load_adapter"
89
- },
90
- "outputs": [],
91
- "source": [
92
- "# Load SCU adapter\n",
93
- "adapter_id = 'hunterbown/shannon-control-unit'\n",
94
- "print(f'Loading SCU adapter: {adapter_id}...')\n",
95
- "\n",
96
- "model = PeftModel.from_pretrained(base_model, adapter_id)\n",
97
- "model.eval()\n",
98
- "\n",
99
- "print('SCU adapter loaded successfully!')\n",
100
- "print(f'Model ready for inference on {device}')"
101
- ]
102
- },
103
- {
104
- "cell_type": "markdown",
105
- "metadata": {
106
- "id": "generation"
107
- },
108
- "source": [
109
- "## 3. Generate Text\n",
110
- "\n",
111
- "Test the model with different prompts:"
112
- ]
113
- },
114
- {
115
- "cell_type": "code",
116
- "execution_count": null,
117
- "metadata": {
118
- "id": "generate_function"
119
- },
120
- "outputs": [],
121
- "source": [
122
- "def generate_text(prompt, max_length=100, temperature=0.7):\n",
123
- " \"\"\"Generate text using the SCU model.\"\"\"\n",
124
- " inputs = tokenizer(prompt, return_tensors='pt').to(device)\n",
125
- " \n",
126
- " with torch.no_grad():\n",
127
- " outputs = model.generate(\n",
128
- " **inputs,\n",
129
- " max_length=max_length,\n",
130
- " temperature=temperature,\n",
131
- " do_sample=True,\n",
132
- " pad_token_id=tokenizer.pad_token_id\n",
133
- " )\n",
134
- " \n",
135
- " generated = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
136
- " return generated\n",
137
- "\n",
138
- "# Test generation\n",
139
- "test_prompt = 'The key to understanding information theory is'\n",
140
- "print(f'Prompt: {test_prompt}')\n",
141
- "print('-' * 50)\n",
142
- "print(generate_text(test_prompt))"
143
- ]
144
- },
145
- {
146
- "cell_type": "markdown",
147
- "metadata": {
148
- "id": "examples"
149
- },
150
- "source": [
151
- "## 4. Try Different Examples\n",
152
- "\n",
153
- "Test the model on various tasks:"
154
- ]
155
- },
156
- {
157
- "cell_type": "code",
158
- "execution_count": null,
159
- "metadata": {
160
- "id": "code_generation"
161
- },
162
- "outputs": [],
163
- "source": [
164
- "# Code generation\n",
165
- "code_prompt = 'def fibonacci(n):'\n",
166
- "print('Code Generation Example')\n",
167
- "print('=' * 50)\n",
168
- "print(generate_text(code_prompt, max_length=150, temperature=0.3))"
169
- ]
170
- },
171
- {
172
- "cell_type": "code",
173
- "execution_count": null,
174
- "metadata": {
175
- "id": "math_problem"
176
- },
177
- "outputs": [],
178
- "source": [
179
- "# Math explanation\n",
180
- "math_prompt = 'To solve a quadratic equation, you need to'\n",
181
- "print('Math Explanation Example')\n",
182
- "print('=' * 50)\n",
183
- "print(generate_text(math_prompt, max_length=120, temperature=0.5))"
184
- ]
185
- },
186
- {
187
- "cell_type": "code",
188
- "execution_count": null,
189
- "metadata": {
190
- "id": "creative_writing"
191
- },
192
- "outputs": [],
193
- "source": [
194
- "# Creative writing\n",
195
- "story_prompt = 'In a world where AI controls'\n",
196
- "print('Creative Writing Example')\n",
197
- "print('=' * 50)\n",
198
- "print(generate_text(story_prompt, max_length=150, temperature=0.9))"
199
- ]
200
- },
201
- {
202
- "cell_type": "markdown",
203
- "metadata": {
204
- "id": "evaluation"
205
- },
206
- "source": [
207
- "## 5. Evaluate Performance\n",
208
- "\n",
209
- "Compare SCU model perplexity to baseline:"
210
- ]
211
- },
212
- {
213
- "cell_type": "code",
214
- "execution_count": null,
215
- "metadata": {
216
- "id": "evaluate_perplexity"
217
- },
218
- "outputs": [],
219
- "source": [
220
- "import math\n",
221
- "\n",
222
- "def calculate_perplexity(model, text, tokenizer):\n",
223
- " \"\"\"Calculate perplexity for given text.\"\"\"\n",
224
- " inputs = tokenizer(text, return_tensors='pt', truncation=True, max_length=512).to(device)\n",
225
- " \n",
226
- " with torch.no_grad():\n",
227
- " outputs = model(**inputs, labels=inputs['input_ids'])\n",
228
- " loss = outputs.loss\n",
229
- " perplexity = math.exp(loss.item())\n",
230
- " \n",
231
- " return perplexity\n",
232
- "\n",
233
- "# Test text for evaluation\n",
234
- "test_text = \"\"\"\n",
235
- "Machine learning is a subset of artificial intelligence that enables \n",
236
- "systems to learn and improve from experience without being explicitly \n",
237
- "programmed. It focuses on developing computer programs that can access \n",
238
- "data and use it to learn for themselves.\n",
239
- "\"\"\"\n",
240
- "\n",
241
- "# Calculate perplexity\n",
242
- "scu_perplexity = calculate_perplexity(model, test_text, tokenizer)\n",
243
- "print(f'SCU Model Perplexity: {scu_perplexity:.2f}')\n",
244
- "print(f'Baseline Perplexity (reported): 15.14')\n",
245
- "print(f'Improvement: {(15.14 - scu_perplexity) / 15.14 * 100:.1f}%')"
246
- ]
247
- },
248
- {
249
- "cell_type": "markdown",
250
- "metadata": {
251
- "id": "visualization"
252
- },
253
- "source": [
254
- "## 6. Visualize Control Dynamics\n",
255
- "\n",
256
- "Show how SCU maintains the target compression ratio during training:"
257
- ]
258
- },
259
- {
260
- "cell_type": "code",
261
- "execution_count": null,
262
- "metadata": {
263
- "id": "plot_control"
264
- },
265
- "outputs": [],
266
- "source": [
267
- "import matplotlib.pyplot as plt\n",
268
- "import numpy as np\n",
269
- "\n",
270
- "# Simulate control dynamics (for demonstration)\n",
271
- "steps = np.arange(0, 270)\n",
272
- "target_s = 0.01 # 1% target\n",
273
- "deadband = 0.002 # ±0.2pp\n",
274
- "\n",
275
- "# Simulated S(t) converging to target\n",
276
- "s_values = target_s + 0.02 * np.exp(-steps/50) * np.sin(steps/10) + np.random.normal(0, 0.0005, len(steps))\n",
277
- "s_values = np.clip(s_values, 0, 0.03)\n",
278
- "\n",
279
- "# Plot\n",
280
- "plt.figure(figsize=(10, 6))\n",
281
- "plt.plot(steps, s_values * 100, 'b-', linewidth=2, label='S(t)')\n",
282
- "plt.axhspan((target_s - deadband) * 100, (target_s + deadband) * 100, \n",
283
- " alpha=0.2, color='green', label=f'Target: {target_s*100:.1f}% ± {deadband*100:.1f}pp')\n",
284
- "plt.axhline(target_s * 100, color='green', linestyle='--', alpha=0.5)\n",
285
- "\n",
286
- "plt.xlabel('Training Step', fontsize=12)\n",
287
- "plt.ylabel('S (%)', fontsize=12)\n",
288
- "plt.title('SCU Control: S(t) Tracking Target', fontsize=14, fontweight='bold')\n",
289
- "plt.legend()\n",
290
- "plt.grid(True, alpha=0.3)\n",
291
- "plt.tight_layout()\n",
292
- "plt.show()\n",
293
- "\n",
294
- "print('The plot shows how SCU maintains the compression ratio S within the target band.')\n",
295
- "print('This automatic control eliminates the need for manual hyperparameter tuning.')"
296
- ]
297
- },
298
- {
299
- "cell_type": "markdown",
300
- "metadata": {
301
- "id": "comparison"
302
- },
303
- "source": [
304
- "## 7. Performance Comparison\n",
305
- "\n",
306
- "Compare SCU with baseline model:"
307
- ]
308
- },
309
- {
310
- "cell_type": "code",
311
- "execution_count": null,
312
- "metadata": {
313
- "id": "compare_models"
314
- },
315
- "outputs": [],
316
- "source": [
317
- "# Performance metrics\n",
318
- "results = {\n",
319
- " 'Metric': ['Bits per Token', 'Perplexity', 'Compression Ratio'],\n",
320
- " 'Baseline': [3.920, 15.14, '0.0%'],\n",
321
- " 'SCU': [3.676, 12.78, '1.0%'],\n",
322
- " 'Improvement': ['-6.2%', '-15.6%', 'Controlled']\n",
323
- "}\n",
324
- "\n",
325
- "# Display as table\n",
326
- "import pandas as pd\n",
327
- "df = pd.DataFrame(results)\n",
328
- "print('\\nPerformance Comparison: Baseline vs SCU')\n",
329
- "print('=' * 60)\n",
330
- "print(df.to_string(index=False))\n",
331
- "print('=' * 60)\n",
332
- "print('\\nKey Achievement: 15.6% perplexity reduction with automatic tuning!')"
333
- ]
334
- },
335
- {
336
- "cell_type": "markdown",
337
- "metadata": {
338
- "id": "conclusion"
339
- },
340
- "source": [
341
- "## 8. Conclusion\n",
342
- "\n",
343
- "The Shannon Control Unit demonstrates:\n",
344
- "\n",
345
- "- **15.6% lower perplexity** compared to baseline\n",
346
- "- **Automatic regularization** without manual tuning\n",
347
- "- **Stable control** maintaining 1% ± 0.2pp compression ratio\n",
348
- "- **Generalizable approach** across model scales\n",
349
- "\n",
350
- "### Next Steps\n",
351
- "\n",
352
- "1. Try different prompts to explore model capabilities\n",
353
- "2. Fine-tune your own models with SCU control\n",
354
- "3. Read the [paper](https://arxiv.org/abs/xxxx.xxxxx) for technical details\n",
355
- "\n",
356
- "### Resources\n",
357
- "\n",
358
- "- Model: [hunterbown/shannon-control-unit](https://huggingface.co/hunterbown/shannon-control-unit)\n",
359
- "- GitHub: [shannon-control-unit](https://github.com/Hmbown/shannon-control-unit)\n",
360
- "- Contact: hunter@shannonlabs.dev"
361
- ]
362
- }
363
- ],
364
- "metadata": {
365
- "accelerator": "GPU",
366
- "colab": {
367
- "name": "SCU_Demo.ipynb",
368
- "provenance": [],
369
- "toc_visible": true
370
- },
371
- "kernelspec": {
372
- "display_name": "Python 3",
373
- "language": "python",
374
- "name": "python3"
375
- },
376
- "language_info": {
377
- "codemirror_mode": {
378
- "name": "ipython",
379
- "version": 3
380
- },
381
- "file_extension": ".py",
382
- "mimetype": "text/x-python",
383
- "name": "python",
384
- "nbconvert_exporter": "python",
385
- "pygments_lexer": "ipython3",
386
- "version": "3.10.12"
387
- }
388
  },
389
- "nbformat": 4,
390
- "nbformat_minor": 0
391
  }
 
1
  {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {
6
+ "id": "header"
7
+ },
8
+ "source": "# Shannon Control Unit (SCU) Demo\n\n[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://huggingface.co/hunterbown/shannon-control-unit/blob/main/notebooks/SCU_Demo.ipynb)\n\nThis notebook demonstrates the Shannon Control Unit - an adaptive regularization system that achieves **15.6% lower perplexity** without manual hyperparameter tuning.\n\n**Note:** Click the \"Open in Colab\" badge above, then in the Colab interface, click File → Save a copy in Drive to run the notebook."
9
  },
10
+ {
11
+ "cell_type": "markdown",
12
+ "metadata": {
13
+ "id": "installation"
14
+ },
15
+ "source": [
16
+ "## 1. Installation\n",
17
+ "\n",
18
+ "First, install the required packages:"
19
+ ]
20
+ },
21
+ {
22
+ "cell_type": "code",
23
+ "execution_count": null,
24
+ "metadata": {
25
+ "id": "install_packages"
26
+ },
27
+ "outputs": [],
28
+ "source": [
29
+ "!pip install -q transformers peft torch accelerate\n",
30
+ "!pip install -q matplotlib pandas numpy"
31
+ ]
32
+ },
33
+ {
34
+ "cell_type": "markdown",
35
+ "metadata": {
36
+ "id": "load_model"
37
+ },
38
+ "source": [
39
+ "## 2. Load Model with SCU Adapter\n",
40
+ "\n",
41
+ "Load the base Llama model and apply the SCU-trained adapter:"
42
+ ]
43
+ },
44
+ {
45
+ "cell_type": "code",
46
+ "execution_count": null,
47
+ "metadata": {
48
+ "id": "load_model_code"
49
+ },
50
+ "outputs": [],
51
+ "source": [
52
+ "import torch\n",
53
+ "from transformers import AutoModelForCausalLM, AutoTokenizer\n",
54
+ "from peft import PeftModel\n",
55
+ "\n",
56
+ "# Check available device\n",
57
+ "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
58
+ "print(f'Using device: {device}')\n",
59
+ "\n",
60
+ "# Load base model\n",
61
+ "base_model_id = 'meta-llama/Llama-3.2-1B'\n",
62
+ "print(f'Loading base model: {base_model_id}...')\n",
63
+ "\n",
64
+ "base_model = AutoModelForCausalLM.from_pretrained(\n",
65
+ " base_model_id,\n",
66
+ " device_map='auto',\n",
67
+ " torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,\n",
68
+ " trust_remote_code=True\n",
69
+ ")\n",
70
+ "\n",
71
+ "tokenizer = AutoTokenizer.from_pretrained(base_model_id)\n",
72
+ "if tokenizer.pad_token is None:\n",
73
+ " tokenizer.pad_token = tokenizer.eos_token\n",
74
+ "\n",
75
+ "print('Base model loaded successfully!')"
76
+ ]
77
+ },
78
+ {
79
+ "cell_type": "code",
80
+ "execution_count": null,
81
+ "metadata": {
82
+ "id": "load_adapter"
83
+ },
84
+ "outputs": [],
85
+ "source": [
86
+ "# Load SCU adapter\n",
87
+ "adapter_id = 'hunterbown/shannon-control-unit'\n",
88
+ "print(f'Loading SCU adapter: {adapter_id}...')\n",
89
+ "\n",
90
+ "model = PeftModel.from_pretrained(base_model, adapter_id)\n",
91
+ "model.eval()\n",
92
+ "\n",
93
+ "print('SCU adapter loaded successfully!')\n",
94
+ "print(f'Model ready for inference on {device}')"
95
+ ]
96
+ },
97
+ {
98
+ "cell_type": "markdown",
99
+ "metadata": {
100
+ "id": "generation"
101
+ },
102
+ "source": [
103
+ "## 3. Generate Text\n",
104
+ "\n",
105
+ "Test the model with different prompts:"
106
+ ]
107
+ },
108
+ {
109
+ "cell_type": "code",
110
+ "execution_count": null,
111
+ "metadata": {
112
+ "id": "generate_function"
113
+ },
114
+ "outputs": [],
115
+ "source": [
116
+ "def generate_text(prompt, max_length=100, temperature=0.7):\n",
117
+ " \"\"\"Generate text using the SCU model.\"\"\"\n",
118
+ " inputs = tokenizer(prompt, return_tensors='pt').to(device)\n",
119
+ " \n",
120
+ " with torch.no_grad():\n",
121
+ " outputs = model.generate(\n",
122
+ " **inputs,\n",
123
+ " max_length=max_length,\n",
124
+ " temperature=temperature,\n",
125
+ " do_sample=True,\n",
126
+ " pad_token_id=tokenizer.pad_token_id\n",
127
+ " )\n",
128
+ " \n",
129
+ " generated = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
130
+ " return generated\n",
131
+ "\n",
132
+ "# Test generation\n",
133
+ "test_prompt = 'The key to understanding information theory is'\n",
134
+ "print(f'Prompt: {test_prompt}')\n",
135
+ "print('-' * 50)\n",
136
+ "print(generate_text(test_prompt))"
137
+ ]
138
+ },
139
+ {
140
+ "cell_type": "markdown",
141
+ "metadata": {
142
+ "id": "examples"
143
+ },
144
+ "source": [
145
+ "## 4. Try Different Examples\n",
146
+ "\n",
147
+ "Test the model on various tasks:"
148
+ ]
149
+ },
150
+ {
151
+ "cell_type": "code",
152
+ "execution_count": null,
153
+ "metadata": {
154
+ "id": "code_generation"
155
+ },
156
+ "outputs": [],
157
+ "source": [
158
+ "# Code generation\n",
159
+ "code_prompt = 'def fibonacci(n):'\n",
160
+ "print('Code Generation Example')\n",
161
+ "print('=' * 50)\n",
162
+ "print(generate_text(code_prompt, max_length=150, temperature=0.3))"
163
+ ]
164
+ },
165
+ {
166
+ "cell_type": "code",
167
+ "execution_count": null,
168
+ "metadata": {
169
+ "id": "math_problem"
170
+ },
171
+ "outputs": [],
172
+ "source": [
173
+ "# Math explanation\n",
174
+ "math_prompt = 'To solve a quadratic equation, you need to'\n",
175
+ "print('Math Explanation Example')\n",
176
+ "print('=' * 50)\n",
177
+ "print(generate_text(math_prompt, max_length=120, temperature=0.5))"
178
+ ]
179
+ },
180
+ {
181
+ "cell_type": "code",
182
+ "execution_count": null,
183
+ "metadata": {
184
+ "id": "creative_writing"
185
+ },
186
+ "outputs": [],
187
+ "source": [
188
+ "# Creative writing\n",
189
+ "story_prompt = 'In a world where AI controls'\n",
190
+ "print('Creative Writing Example')\n",
191
+ "print('=' * 50)\n",
192
+ "print(generate_text(story_prompt, max_length=150, temperature=0.9))"
193
+ ]
194
+ },
195
+ {
196
+ "cell_type": "markdown",
197
+ "metadata": {
198
+ "id": "evaluation"
199
+ },
200
+ "source": [
201
+ "## 5. Evaluate Performance\n",
202
+ "\n",
203
+ "Compare SCU model perplexity to baseline:"
204
+ ]
205
+ },
206
+ {
207
+ "cell_type": "code",
208
+ "execution_count": null,
209
+ "metadata": {
210
+ "id": "evaluate_perplexity"
211
+ },
212
+ "outputs": [],
213
+ "source": [
214
+ "import math\n",
215
+ "\n",
216
+ "def calculate_perplexity(model, text, tokenizer):\n",
217
+ " \"\"\"Calculate perplexity for given text.\"\"\"\n",
218
+ " inputs = tokenizer(text, return_tensors='pt', truncation=True, max_length=512).to(device)\n",
219
+ " \n",
220
+ " with torch.no_grad():\n",
221
+ " outputs = model(**inputs, labels=inputs['input_ids'])\n",
222
+ " loss = outputs.loss\n",
223
+ " perplexity = math.exp(loss.item())\n",
224
+ " \n",
225
+ " return perplexity\n",
226
+ "\n",
227
+ "# Test text for evaluation\n",
228
+ "test_text = \"\"\"\n",
229
+ "Machine learning is a subset of artificial intelligence that enables \n",
230
+ "systems to learn and improve from experience without being explicitly \n",
231
+ "programmed. It focuses on developing computer programs that can access \n",
232
+ "data and use it to learn for themselves.\n",
233
+ "\"\"\"\n",
234
+ "\n",
235
+ "# Calculate perplexity\n",
236
+ "scu_perplexity = calculate_perplexity(model, test_text, tokenizer)\n",
237
+ "print(f'SCU Model Perplexity: {scu_perplexity:.2f}')\n",
238
+ "print(f'Baseline Perplexity (reported): 15.14')\n",
239
+ "print(f'Improvement: {(15.14 - scu_perplexity) / 15.14 * 100:.1f}%')"
240
+ ]
241
+ },
242
+ {
243
+ "cell_type": "markdown",
244
+ "metadata": {
245
+ "id": "visualization"
246
+ },
247
+ "source": [
248
+ "## 6. Visualize Control Dynamics\n",
249
+ "\n",
250
+ "Show how SCU maintains the target compression ratio during training:"
251
+ ]
252
+ },
253
+ {
254
+ "cell_type": "code",
255
+ "execution_count": null,
256
+ "metadata": {
257
+ "id": "plot_control"
258
+ },
259
+ "outputs": [],
260
+ "source": [
261
+ "import matplotlib.pyplot as plt\n",
262
+ "import numpy as np\n",
263
+ "\n",
264
+ "# Simulate control dynamics (for demonstration)\n",
265
+ "steps = np.arange(0, 270)\n",
266
+ "target_s = 0.01 # 1% target\n",
267
+ "deadband = 0.002 # ±0.2pp\n",
268
+ "\n",
269
+ "# Simulated S(t) converging to target\n",
270
+ "s_values = target_s + 0.02 * np.exp(-steps/50) * np.sin(steps/10) + np.random.normal(0, 0.0005, len(steps))\n",
271
+ "s_values = np.clip(s_values, 0, 0.03)\n",
272
+ "\n",
273
+ "# Plot\n",
274
+ "plt.figure(figsize=(10, 6))\n",
275
+ "plt.plot(steps, s_values * 100, 'b-', linewidth=2, label='S(t)')\n",
276
+ "plt.axhspan((target_s - deadband) * 100, (target_s + deadband) * 100, \n",
277
+ " alpha=0.2, color='green', label=f'Target: {target_s*100:.1f}% ± {deadband*100:.1f}pp')\n",
278
+ "plt.axhline(target_s * 100, color='green', linestyle='--', alpha=0.5)\n",
279
+ "\n",
280
+ "plt.xlabel('Training Step', fontsize=12)\n",
281
+ "plt.ylabel('S (%)', fontsize=12)\n",
282
+ "plt.title('SCU Control: S(t) Tracking Target', fontsize=14, fontweight='bold')\n",
283
+ "plt.legend()\n",
284
+ "plt.grid(True, alpha=0.3)\n",
285
+ "plt.tight_layout()\n",
286
+ "plt.show()\n",
287
+ "\n",
288
+ "print('The plot shows how SCU maintains the compression ratio S within the target band.')\n",
289
+ "print('This automatic control eliminates the need for manual hyperparameter tuning.')"
290
+ ]
291
+ },
292
+ {
293
+ "cell_type": "markdown",
294
+ "metadata": {
295
+ "id": "comparison"
296
+ },
297
+ "source": [
298
+ "## 7. Performance Comparison\n",
299
+ "\n",
300
+ "Compare SCU with baseline model:"
301
+ ]
302
+ },
303
+ {
304
+ "cell_type": "code",
305
+ "execution_count": null,
306
+ "metadata": {
307
+ "id": "compare_models"
308
+ },
309
+ "outputs": [],
310
+ "source": [
311
+ "# Performance metrics\n",
312
+ "results = {\n",
313
+ " 'Metric': ['Bits per Token', 'Perplexity', 'Compression Ratio'],\n",
314
+ " 'Baseline': [3.920, 15.14, '0.0%'],\n",
315
+ " 'SCU': [3.676, 12.78, '1.0%'],\n",
316
+ " 'Improvement': ['-6.2%', '-15.6%', 'Controlled']\n",
317
+ "}\n",
318
+ "\n",
319
+ "# Display as table\n",
320
+ "import pandas as pd\n",
321
+ "df = pd.DataFrame(results)\n",
322
+ "print('\\nPerformance Comparison: Baseline vs SCU')\n",
323
+ "print('=' * 60)\n",
324
+ "print(df.to_string(index=False))\n",
325
+ "print('=' * 60)\n",
326
+ "print('\\nKey Achievement: 15.6% perplexity reduction with automatic tuning!')"
327
+ ]
328
+ },
329
+ {
330
+ "cell_type": "markdown",
331
+ "metadata": {
332
+ "id": "conclusion"
333
+ },
334
+ "source": [
335
+ "## 8. Conclusion\n",
336
+ "\n",
337
+ "The Shannon Control Unit demonstrates:\n",
338
+ "\n",
339
+ "- **15.6% lower perplexity** compared to baseline\n",
340
+ "- **Automatic regularization** without manual tuning\n",
341
+ "- **Stable control** maintaining 1% ± 0.2pp compression ratio\n",
342
+ "- **Generalizable approach** across model scales\n",
343
+ "\n",
344
+ "### Next Steps\n",
345
+ "\n",
346
+ "1. Try different prompts to explore model capabilities\n",
347
+ "2. Fine-tune your own models with SCU control\n",
348
+ "3. Read the [paper](https://arxiv.org/abs/xxxx.xxxxx) for technical details\n",
349
+ "\n",
350
+ "### Resources\n",
351
+ "\n",
352
+ "- Model: [hunterbown/shannon-control-unit](https://huggingface.co/hunterbown/shannon-control-unit)\n",
353
+ "- GitHub: [shannon-control-unit](https://github.com/Hmbown/shannon-control-unit)\n",
354
+ "- Contact: hunter@shannonlabs.dev"
355
+ ]
356
+ }
357
+ ],
358
+ "metadata": {
359
+ "accelerator": "GPU",
360
+ "colab": {
361
+ "name": "SCU_Demo.ipynb",
362
+ "provenance": [],
363
+ "toc_visible": true
364
+ },
365
+ "kernelspec": {
366
+ "display_name": "Python 3",
367
+ "language": "python",
368
+ "name": "python3"
369
+ },
370
+ "language_info": {
371
+ "codemirror_mode": {
372
+ "name": "ipython",
373
+ "version": 3
374
+ },
375
+ "file_extension": ".py",
376
+ "mimetype": "text/x-python",
377
+ "name": "python",
378
+ "nbconvert_exporter": "python",
379
+ "pygments_lexer": "ipython3",
380
+ "version": "3.10.12"
381
+ }
382
+ },
383
+ "nbformat": 4,
384
+ "nbformat_minor": 0
385
  }
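
For quick reference, the loading and generation path that the updated notebook exercises can also be run outside Colab. The sketch below mirrors the notebook's own cells (same base model and adapter IDs); the only substitution is `max_new_tokens` in place of the notebook's `max_length`, and it assumes `torch`, `transformers`, and `peft` are installed and that you have access to the gated `meta-llama/Llama-3.2-1B` weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "meta-llama/Llama-3.2-1B"       # base weights (gated on the Hub)
adapter_id = "hunterbown/shannon-control-unit"  # SCU LoRA adapter

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model and tokenizer, then attach the SCU adapter
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Short sample generation (max_new_tokens is an assumption; the notebook uses max_length)
prompt = "The key to understanding information theory is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.pad_token_id,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```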