Sébastien De Greef committed
Commit e3bf489
1 Parent(s): c6ed4d4

chore: Update colorFrom in README.md and index.qmd

README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 title: Quarto Template
 emoji: 🌖
-colorFrom: green
+colorFrom: blue
 colorTo: pink
 sdk: docker
 pinned: false
_quarto.yml ADDED
File without changes
src/_quarto.yml CHANGED
@@ -1,7 +1,7 @@
 project:
   type: website
 website:
-  title: "Open-Source AI Cookbook"
+  title: "My AI Cookbook"
   sidebar:
     style: "docked"
     search: true
@@ -9,13 +9,32 @@ website:
     contents:
       - section: "About"
         contents:
-          - href: index.qmd
-            text: About Quarto
-      - section: "LLM's"
+          - href: about.qmd
+            text: About this Cookbook
+
+      - section: "Theory"
+        contents:
+          - href: theory/activations.qmd
+            text: "Activation Functions"
+          - href: theory/architectures.qmd
+            text: "Network Architectures"
+          - href: theory/layers.qmd
+            text: "Layer Types"
+          - href: theory/metrics.qmd
+            text: "Metric Types"
+
+
+
+      - section: "Large Language Models"
         contents:
+          - href: llms/prompting.qmd
+            text: "Prompting"
+          - href: theory/chainoftoughts.qmd
+            text: "Chain of Thoughts"
           - href: llms/index.qmd
             text: "LLM'xs"
-      - section: "Open-Source AI Cookbook"
+
+      - section: "Retrieval Augmented Generation"
         contents:
           - section: "RAG Techniques"
             contents:
src/about.qmd CHANGED
@@ -2,11 +2,33 @@
 title: "About"
 ---
 
-About this site
-```{mermaid}
-flowchart LR
-  A[Hard edge] --> B(Round edge)
-  B --> C{Decision}
-  C --> D[Result one]
-  C --> E[Result two]
-```
+**Welcome to My AI Cookbook**
+
+This repository is my personal collection of recipes and notebooks, documenting my journey of learning and exploring various aspects of Artificial Intelligence (AI). As a self-taught AI enthusiast, I created this cookbook to serve as a knowledge base, a "how-to" guide, and a reference point for my own projects and experiments.
+
+**The Story Behind**
+
+Over the past year, I've been fascinated by the rapidly evolving field of AI and its endless possibilities. To deepen my understanding and skills, I embarked on a self-learning journey, diving into various AI-related projects and topics. As I progressed, I realized the importance of documenting my learnings, successes, and failures. This cookbook is the culmination of that effort: a centralized hub where I can quickly find and revisit previous projects, takeaways, and insights.
+
+**What You'll Find Here**
+
+This cookbook is a living repository of my AI-related projects, experiments, and learnings. You'll find a diverse range of topics, including:
+
+* **Recipes**: Step-by-step guides for implementing various AI concepts, models, and techniques using popular libraries and frameworks.
+* **Notebooks**: Interactive Jupyter notebooks containing code, explanations, and visualizations for AI-related projects and experiments.
+* **Project Write-ups**: Detailed descriptions of my projects, including goals, approaches, challenges, and outcomes.
+* **Takeaways and Insights**: Key learnings, best practices, and lessons learned from my AI journey.
+
+**Goals and Objectives**
+
+This cookbook serves several purposes:
+
+* **Personal Knowledge Base**: A centralized hub for my AI-related knowledge, allowing me to quickly recall and build upon previous projects and learnings.
+* **Self-Learning Platform**: A platform for continuous learning, experimentation, and improvement in AI.
+* **Community Sharing**: A resource for others to learn from, providing a glimpse into my AI journey and experiences.
+
+**Stay Tuned**
+
+As I continue to explore and learn, this cookbook will evolve, incorporating new projects, recipes, and insights. I hope you find this resource helpful, and I look forward to sharing my AI journey with you.
+
+Best regards,
src/index.qmd CHANGED
@@ -1,10 +1,16 @@
 ---
 title: "About Quarto"
 ---
 
 [Quarto](https://quarto.org/) is a Markdown-based documentation system that lets you write documents in Markdown or Jupyter Notebooks, and render them to a variety of formats including HTML, PDF, PowerPoint, and more.
 You can also use Quarto to write [books](https://quarto.org/docs/books/), create [dashboards](https://quarto.org/docs/dashboards/), and embed web applications with [Observable](https://quarto.org/docs/interactive/ojs/) and [Shinylive](https://quarto.org/docs/blog/posts/2022-10-25-shinylive-extension/).
-
+```{mermaid}
+flowchart LR
+  A[Hard edge] --> B(Round edge)
+  B --> C{Decision}
+  C --> D[Result one]
+  C --> E[Result two]
+```
 ## Getting started with Quarto
 
 Once you've created the space, click on the `Files` tab in the top right to take a look at the files which make up this Space.
src/llms/index.qmd CHANGED
@@ -1,9 +1,19 @@
 ---
 title: "Habits"
 author: "John Doe"
-revealjs:
+format: revealjs
 ---
 
+## Getting up
+
+- Turn off alarm
+- Get out of bed
+
+## Going to sleep
+
+- Get in bed
+- Count sheep
+
 # In the morning
 
 ## Getting up
src/llms/llms.qmd ADDED
---
title: "Habits"
author: "John Doe"
format: revealjs
---

## Getting up

- Turn off alarm
- Get out of bed

## Going to sleep

- Get in bed
- Count sheep

## Quarto

Quarto enables you to weave together content and executable code into a finished document. To learn more about Quarto see <https://quarto.org>.
src/llms/prompting.qmd ADDED
# Prompting LLMs

## **I. Clarity and Specificity**

1. **Be clear and concise**: Use simple, straightforward language to convey your request.
   * Good sample: "Write a short story about a character who discovers a hidden treasure."
   * Bad sample: "Create a narrative that revolves around an individual who stumbles upon a concealed riches repository."
2. **Define specific tasks**: Clearly outline what you want the model to do.
   * Good sample: "Summarize the main points of the article in 50 words."
   * Bad sample: "Do something with the article, maybe summarize it or something."
3. **Avoid ambiguity**: Use specific terms and phrases to avoid confusion.
   * Good sample: "Generate a recipe for vegan chocolate cake."
   * Bad sample: "Make a dessert that's healthy and yummy."

## **II. Context and Framing**

1. **Provide context**: Give the model a clear understanding of the topic, tone, and style you're aiming for.
   * Good sample: "Write a humorous article about the benefits of procrastination, in the style of The Onion."
   * Bad sample: "Write something funny about procrastination."
2. **Frame the task**: Use language that sets the tone and direction for the response.
   * Good sample: "Imagine you're a travel blogger; write a review of a fictional restaurant in Paris."
   * Bad sample: "Write a review of a restaurant."
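The pattern behind the good samples above is simply making the task, context, and constraints explicit. As a rough illustration in plain Python (no particular LLM API assumed; `build_prompt` and its parameters are hypothetical names, not part of any library):

```python
def build_prompt(task: str, context: str = "", constraints: str = "") -> str:
    """Assemble a clear, specific prompt from an explicit task, context, and constraints."""
    parts = [task.strip()]                                   # Section I: a concrete task
    if context:
        parts.append(f"Context: {context.strip()}")          # Section II: framing
    if constraints:
        parts.append(f"Constraints: {constraints.strip()}")  # Section IV below: limits
    return "\n".join(parts)

prompt = build_prompt(
    task="Write a humorous article about the benefits of procrastination.",
    context="Match the tone of a satirical outlet such as The Onion.",
    constraints="Keep it under 300 words.",
)
print(prompt)
```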
## **III. Tone and Style**

1. **Specify tone and style**: Use adjectives to describe the tone and style you're aiming for.
   * Good sample: "Write a formal, technical report on the benefits of AI in healthcare."
   * Bad sample: "Write something about AI in healthcare."
2. **Use emotional cues**: Incorporate emotional language to evoke a specific tone or atmosphere.
   * Good sample: "Write a heartfelt letter to a friend who's going through a tough time."
   * Bad sample: "Write a letter to a friend."

## **IV. Constraints and Guidelines**

1. **Set constraints**: Provide specific guidelines on format, length, or structure.
   * Good sample: "Write a sonnet about the beauty of nature, with a specific rhyme scheme and 14 lines."
   * Bad sample: "Write a poem about nature."
2. **Specify formats and structures**: Use specific formats, such as lists or tables, to guide the response.
   * Good sample: "Create a table comparing the features of three different smartphones."
   * Bad sample: "Write something about smartphones."

## **V. Avoiding Bias and Assumptions**

1. **Avoid leading language**: Phrases that imply a specific answer or perspective can influence the model's response.
   * Good sample: "What are the benefits and drawbacks of using AI in healthcare?"
   * Bad sample: "Why is AI the best thing to happen to healthcare?"
2. **Use neutral language**: Avoid language that implies a particular perspective or bias.
   * Good sample: "Discuss the impact of climate change on global ecosystems."
   * Bad sample: "Explain why climate change is a hoax."

## **VI. Providing Examples and References**

1. **Provide examples**: Offer concrete examples to illustrate the desired output.
   * Good sample: "Write a product description in the style of this example: [insert example]."
   * Bad sample: "Write a product description."
2. **Reference external sources**: Include references to external sources, such as books or articles, to provide context and guidance.
   * Good sample: "Summarize the main points of 'The Hitchhiker's Guide to the Galaxy' in 100 words."
   * Bad sample: "Write a summary of a book."

## **VII. Feedback and Iteration**

1. **Provide feedback**: Give the model feedback on its responses to improve future output.
   * Good sample: "The previous response was too formal; can you make it more conversational?"
   * Bad sample: "That was bad, try again."
2. **Iterate and refine**: Refine your prompts based on the model's responses to achieve the desired outcome, as in the sketch below.
   * Good sample: "Let's try rewriting the prompt to focus on a specific aspect of the topic."
   * Bad sample: "Just try again, maybe it'll work this time."
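In practice, the feedback-and-iteration loop of Section VII is just a conversation in which each follow-up turn carries specific, actionable feedback. A minimal sketch (the message format is illustrative, not tied to any particular API):

```python
# Each refinement turn names what to change, rather than "try again".
conversation = [
    {"role": "user", "content": "Summarize the main points of the article in 50 words."},
    {"role": "assistant", "content": "<first draft of the summary>"},
    {"role": "user", "content": "The previous response was too formal; can you make it more conversational?"},
]
```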
src/theory/activations.qmd ADDED
## **1. Sigmoid (Logistic)**

**Formula:** σ(x) = 1 / (1 + exp(-x))

**Strengths:** Maps any real-valued number to a value between 0 and 1, making it suitable for binary classification problems.

**Weaknesses:** Saturates (i.e., output values approach 0 or 1) for large inputs, leading to vanishing gradients during backpropagation.

**Usage:** Binary classification, logistic regression.

## **2. Hyperbolic Tangent (Tanh)**

**Formula:** tanh(x) = 2 / (1 + exp(-2x)) - 1

**Strengths:** Similar to sigmoid, but maps to (-1, 1), so its outputs are zero-centered, which can be beneficial for some models.

**Weaknesses:** Also saturates, leading to vanishing gradients.

**Usage:** Similar to sigmoid, but with a larger output range.

## **3. Rectified Linear Unit (ReLU)**

**Formula:** f(x) = max(0, x)

**Strengths:** Computationally efficient, non-saturating for positive inputs, and easy to compute.

**Weaknesses:** Not differentiable at x = 0, and neurons can "die" (output zero permanently) if they settle into the negative regime.

**Usage:** Default activation function in many deep learning frameworks, suitable for most neural networks.

## **4. Leaky ReLU**

**Formula:** f(x) = max(αx, x), where α is a small constant (e.g., 0.01)

**Strengths:** Similar to ReLU, but allows a small fraction of negative inputs to pass through, helping with dying neurons.

**Weaknesses:** Still non-differentiable at x = 0.

**Usage:** Alternative to ReLU, especially when dealing with dying neurons.
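The four activations above are one-liners in NumPy. A quick sketch of the formulas, assuming only `numpy`:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))       # σ(x) = 1 / (1 + exp(-x))

def tanh(x):
    return np.tanh(x)                     # equivalent to 2 / (1 + exp(-2x)) - 1

def relu(x):
    return np.maximum(0.0, x)             # max(0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)  # max(αx, x) for a small α

x = np.linspace(-3, 3, 7)
print(sigmoid(x).round(3), relu(x), leaky_relu(x), sep="\n")
```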
## **5. Swish**

**Formula:** f(x) = x · sigmoid(βx), where β is a fixed constant or a learnable parameter

**Strengths:** Self-gated, adaptive, and non-saturating for positive inputs.

**Weaknesses:** More expensive to compute than ReLU; a learnable β adds parameters.

**Usage:** Can be used in place of ReLU or other activations, but may not always outperform them.

## **6. Softmax**

**Formula:** softmax(x)_i = exp(x_i) / Σ_j exp(x_j)

**Strengths:** Normalizes the output into a probability distribution that sums to 1, making it suitable for multi-class classification.

**Weaknesses:** Only suitable for output layers with multiple classes.

**Usage:** Output layer activation for multi-class classification problems.
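One practical note on softmax: exponentiating large logits overflows, so implementations subtract the maximum first, which cancels in the ratio and leaves the result unchanged. A minimal NumPy sketch:

```python
import numpy as np

def softmax(x):
    # Subtracting the max is mathematically a no-op but keeps exp() finite.
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))  # ≈ [0.659, 0.242, 0.099]; sums to 1
```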
## **7. Softsign**

**Formula:** f(x) = x / (1 + |x|)

**Strengths:** Similar to sigmoid, but with a more gradual slope.

**Weaknesses:** Not commonly used; may not provide significant benefits over sigmoid or tanh.

**Usage:** Alternative to sigmoid or tanh in certain situations.

## **8. ArcTan**

**Formula:** f(x) = arctan(x)

**Strengths:** Smooth and continuous, with a gentler approach to saturation than sigmoid or tanh.

**Weaknesses:** Still bounded, so it saturates for large |x|; not commonly used.

**Usage:** Experimental or niche applications.

## **9. SoftPlus**

**Formula:** f(x) = log(1 + exp(x))

**Strengths:** Smooth, continuous approximation of ReLU.

**Weaknesses:** Not commonly used; more expensive to compute than ReLU.

**Usage:** Experimental or niche applications.

## **10. Gaussian Error Linear Unit (GELU)**

**Formula:** f(x) = x · Φ(x), where Φ is the cumulative distribution function of the standard normal distribution

**Strengths:** Non-saturating for positive inputs, smooth.

**Weaknesses:** More expensive to compute exactly than ReLU.

**Usage:** Alternative to ReLU; the default in many Transformer models such as BERT and GPT.

## **11. Mish**

**Formula:** f(x) = x · tanh(softplus(x))

**Strengths:** Non-saturating for positive inputs, smooth.

**Weaknesses:** Not as well-studied as ReLU; more expensive to compute.

**Usage:** Alternative to ReLU, especially in computer vision tasks.

## **12. SiLU (Sigmoid Linear Unit)**

**Formula:** f(x) = x · sigmoid(x) (identical to Swish with β = 1)

**Strengths:** Non-saturating for positive inputs, smooth, and cheap to compute.

**Weaknesses:** Not as well-studied as ReLU.

**Usage:** Alternative to ReLU, especially in computer vision tasks.

## **13. GELU Approximation (GELU Approx.)**

**Formula:** f(x) ≈ 0.5 · x · (1 + tanh(√(2/π) · (x + 0.044715 · x³)))

**Strengths:** Fast, non-saturating for positive inputs, and smooth.

**Weaknesses:** An approximation, not exactly equal to GELU.

**Usage:** Alternative to exact GELU, especially when computational efficiency is crucial.

## **14. SELU (Scaled Exponential Linear Unit)**

**Formula:** f(x) = λ·x if x > 0, λ·α·(exp(x) − 1) if x ≤ 0

**Strengths:** Self-normalizing, non-saturating for positive inputs, and computationally efficient.

**Weaknesses:** Requires specific weight initialization (LeCun normal) to keep its self-normalizing property; α ≈ 1.6733 and λ ≈ 1.0507 are fixed constants, not tunable.

**Usage:** Alternative to ReLU, especially in deep fully connected networks.

When choosing an activation function, consider the following (a reference sketch of the modern activations follows this list):

* **Non-saturation:** Avoid activations that saturate (e.g., sigmoid, tanh) in deep hidden layers to prevent vanishing gradients.

* **Computational efficiency:** Choose activations that are computationally efficient (e.g., ReLU) for large models or real-time applications.

* **Smoothness:** Smooth activations (e.g., GELU, Mish) can help with optimization and convergence.

* **Domain knowledge:** Select activations based on the problem domain and desired output (e.g., softmax for multi-class classification).

* **Experimentation:** Try different activations and evaluate their performance on your specific task.
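As referenced in the list above, here is a rough NumPy sketch of the smoother modern activations (the SELU constants are the standard published values; softplus is written in a numerically stable form):

```python
import numpy as np

SELU_ALPHA = 1.6732632423543772   # standard published SELU constants
SELU_LAMBDA = 1.0507009873554805

def silu(x):
    return x / (1.0 + np.exp(-x))  # x * sigmoid(x); Swish with β = 1

def softplus(x):
    # log(1 + exp(x)) rewritten to avoid overflow for large |x|
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def mish(x):
    return x * np.tanh(softplus(x))

def gelu_approx(x):
    # 0.5 · x · (1 + tanh(√(2/π) · (x + 0.044715 · x³)))
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def selu(x):
    return SELU_LAMBDA * np.where(x > 0, x, SELU_ALPHA * (np.exp(x) - 1.0))
```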
src/theory/architectures.qmd ADDED
## **1. Feedforward Neural Networks (FNNs)**

* Usage: Image classification, regression, function approximation
* Description: A basic neural network architecture where data flows in one direction only, from input layer to output layer, without any feedback loops; a minimal sketch follows this list.
* Strengths: Simple to implement, computationally efficient
* Caveats: Limited capacity to model complex relationships, prone to overfitting
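To make this concrete, here is a minimal feedforward classifier sketch in PyTorch (assuming `torch` is installed; the layer sizes are arbitrary placeholders, not a recommendation):

```python
import torch
import torch.nn as nn

# Data flows strictly input -> hidden -> output, with no feedback loops.
model = nn.Sequential(
    nn.Linear(784, 128),  # input layer -> hidden layer
    nn.ReLU(),
    nn.Linear(128, 10),   # hidden layer -> class logits
)

x = torch.randn(32, 784)  # a batch of 32 flattened 28x28 inputs
print(model(x).shape)     # torch.Size([32, 10])
```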
## **2. Convolutional Neural Networks (CNNs)**

* Usage: Image classification, object detection, image segmentation
* Description: A neural network architecture that uses convolutional and pooling layers to extract features from images.
* Strengths: Excellent performance on image-related tasks, robust to image transformations
* Caveats: Computationally expensive, requires large datasets

## **3. Recurrent Neural Networks (RNNs)**

* Usage: Natural Language Processing (NLP), sequence prediction, time series forecasting
* Description: A neural network architecture that uses feedback connections to model sequential data.
* Strengths: Excellent performance on sequential data, can model long-term dependencies
* Caveats: Suffers from vanishing gradients, difficult to train

## **4. Long Short-Term Memory (LSTM) Networks**

* Usage: NLP, sequence prediction, time series forecasting
* Description: A type of RNN that uses memory cells to learn long-term dependencies.
* Strengths: Excellent performance on sequential data, can model long-term dependencies
* Caveats: Computationally expensive, requires large datasets

## **5. Transformers**

* Usage: NLP, machine translation, language modeling
* Description: A neural network architecture that uses self-attention mechanisms to model relationships between input sequences; see the attention sketch below.
* Strengths: Excellent performance on sequential data, parallelizable, can handle long-range dependencies
* Caveats: Computationally expensive, requires large datasets
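Self-attention is easy to poke at directly. A small sketch using PyTorch's built-in multi-head attention (the dimensions are placeholders):

```python
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)

x = torch.randn(2, 10, 64)       # (batch, sequence length, embedding dim)
out, weights = attn(x, x, x)     # self-attention: query = key = value = x
print(out.shape, weights.shape)  # (2, 10, 64) and (2, 10, 10)
```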
## **6. Autoencoders**

* Usage: Dimensionality reduction, anomaly detection, generative modeling
* Description: A neural network architecture that learns to compress and reconstruct input data.
* Strengths: Excellent performance on dimensionality reduction, can learn robust representations
* Caveats: May not perform well on complex data distributions

## **7. Generative Adversarial Networks (GANs)**

* Usage: Generative modeling, data augmentation, style transfer
* Description: A neural network architecture that consists of a generator and a discriminator, which compete to generate realistic data.
* Strengths: Excellent performance on generative tasks, can generate realistic data
* Caveats: Training can be unstable, requires careful tuning of hyperparameters

## **8. Residual Networks (ResNets)**

* Usage: Image classification, object detection
* Description: A neural network architecture that uses residual (skip) connections to ease the training of very deep networks.
* Strengths: Excellent performance on image-related tasks, ease of training
* Caveats: May not perform well on sequential data

## **9. U-Net**

* Usage: Image segmentation, object detection
* Description: A neural network architecture that uses an encoder-decoder structure with skip connections.
* Strengths: Excellent performance on image segmentation tasks, fast training
* Caveats: May not perform well on sequential data

## **10. Attention-based Models**

* Usage: NLP, machine translation, question answering
* Description: A neural network architecture that uses attention mechanisms to focus on relevant input regions.
* Strengths: Excellent performance on sequential data, can model long-range dependencies
* Caveats: Requires careful tuning of hyperparameters

## **11. Graph Neural Networks (GNNs)**

* Usage: Graph-based data, social network analysis, recommendation systems
* Description: A neural network architecture that uses graph structures to model relationships between nodes.
* Strengths: Excellent performance on graph-based data, can model complex relationships
* Caveats: Computationally expensive, requires large datasets

## **12. Reinforcement Learning (RL) Architectures**

* Usage: Game playing, robotics, autonomous systems
* Description: A neural network architecture that uses reinforcement learning to learn from interactions with an environment.
* Strengths: Excellent performance on sequential decision-making tasks, can learn complex policies
* Caveats: Requires many environment interactions, can be slow to train

## **13. Evolutionary Neural Networks**

* Usage: Neuroevolution, optimization problems
* Description: A neural network architecture that uses evolutionary principles to evolve neural networks.
* Strengths: Excellent performance on optimization problems, can learn complex policies
* Caveats: Computationally expensive, requires many candidate evaluations

## **14. Spiking Neural Networks (SNNs)**

* Usage: Neuromorphic computing, edge AI
* Description: A neural network architecture that uses spiking neurons to process data.
* Strengths: Excellent performance on edge AI applications, energy-efficient
* Caveats: Limited software support, requires specialized hardware

## **15. Conditional Random Fields (CRFs)**

* Usage: NLP, sequence labeling, information extraction
* Description: A probabilistic graphical model for structured prediction over sequential data (not a neural network itself, though often used on top of one).
* Strengths: Excellent performance on sequential data, can model complex relationships
* Caveats: Computationally expensive, requires large datasets
src/theory/chainoftoughts.qmd ADDED
# **Chain of Thoughts**

The Chain of Thoughts is a powerful technique used in artificial intelligence and cognitive architectures to model human-like reasoning and decision-making. It's a method for generating a sequence of thoughts, ideas, or concepts that are linked together to form a coherent narrative or argument.

## **How it works:**

1. **Seed thought**: The process starts with a seed thought, which is an initial idea or concept.
2. **Associative thinking**: The AI system uses associative thinking to generate a new thought that is related to the seed thought.
3. **Contextualization**: The system contextualizes the new thought within the existing knowledge graph or semantic network.
4. **Inference**: The system draws inferences from the new thought, generating a new set of related ideas or concepts.
5. **Iteration**: Steps 2-4 are repeated, creating a chain of thoughts that are linked together through associations, inferences, and contextualization.

## **Benefits:**

* **Human-like reasoning**: The Chain of Thoughts technique enables AI systems to reason and think in a way that's similar to humans, making them more relatable and interactive.
* **Creative problem-solving**: By generating a chain of thoughts, AI systems can explore different solutions to complex problems, fostering creative problem-solving.
* **Natural language understanding**: The Chain of Thoughts technique can be used to improve natural language understanding by generating coherent narratives or arguments.

## **Other similar techniques:**

1. **Mind Mapping**: A visual technique used to organize and connect ideas, concepts, and information.
2. **Concept Mapping**: A method for visually representing relationships between concepts, ideas, and information.
3. **Causal Chain Analysis**: A technique used to identify cause-and-effect relationships between events or variables.
4. **Influence Diagrams**: A graphical representation of uncertain relationships between variables, used in decision analysis and Bayesian networks.
5. **Cognitive Maps**: A visual representation of an individual's thought processes, beliefs, and attitudes.

These techniques share similarities with the Chain of Thoughts in that they:

* Use associations and relationships to connect ideas and concepts
* Foster creative problem-solving and critical thinking
* Can be used to model human-like reasoning and decision-making
* Enable AI systems to generate coherent narratives or arguments
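In LLM practice, the most common way to elicit this behavior is chain-of-thought prompting: demonstrating or requesting the intermediate steps before the final answer. A minimal sketch (the arithmetic example follows the classic grade-school style; the exact wording is illustrative):

```python
# The prompt asks the model to externalize intermediate reasoning steps.
prompt = """Q: A cafeteria had 23 apples. They used 20 to make lunch and bought 6 more.
How many apples do they have?
A: Let's think step by step.
1. Start with 23 apples.
2. Using 20 leaves 23 - 20 = 3.
3. Buying 6 more gives 3 + 6 = 9.
The answer is 9."""
```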
src/theory/layers.qmd ADDED
## **1. Input Layers**

* Usage: Receive input data, propagate it to subsequent layers
* Description: The first layer in a neural network, which receives the input data
* Strengths: Essential for processing input data, easy to implement
* Weaknesses: Limited functionality, no learning occurs in this layer

## **2. Dense Layers (Fully Connected Layers)**

* Usage: Feature extraction, classification, regression
* Description: A layer where every input is connected to every output, using a weighted sum
* Strengths: Excellent for feature extraction, easy to implement, fast computation
* Weaknesses: Can be prone to overfitting, computationally expensive for large inputs

## **3. Convolutional Layers (Conv Layers)**

* Usage: Image classification, object detection, image segmentation
* Description: A layer that applies filters to small regions of the input data, scanning the input horizontally and vertically
* Strengths: Excellent for image processing, reduces spatial dimensions, retains spatial hierarchy
* Weaknesses: Computationally expensive, requires large datasets

## **4. Pooling Layers (Downsampling Layers)**

* Usage: Image classification, object detection, image segmentation
* Description: A layer that reduces spatial dimensions by taking the maximum or average value across a region; see the sketch below for how these layers chain together
* Strengths: Reduces spatial dimensions, reduces number of parameters, retains important features
* Weaknesses: Loses some information, can be sensitive to hyperparameters
+ ## **5. Recurrent Layers (RNNs)**
31
+
32
+ * Usage: Natural Language Processing (NLP), sequence prediction, time series forecasting
33
+ * Description: A layer that processes sequential data, using hidden state to capture temporal dependencies
34
+ * Strengths: Excellent for sequential data, can model long-term dependencies
35
+ * Weaknesses: Suffers from vanishing gradients, difficult to train, computationally expensive
36
+
37
+ ## **6. Long Short-Term Memory (LSTM) Layers**
38
+
39
+ * Usage: NLP, sequence prediction, time series forecasting
40
+ * Description: A type of RNN that uses memory cells to learn long-term dependencies
41
+ * Strengths: Excellent for sequential data, can model long-term dependencies, mitigates vanishing gradients
42
+ * Weaknesses: Computationally expensive, require large datasets
43
+
44
+ ## **7. Gated Recurrent Unit (GRU) Layers**
45
+
46
+ * Usage: NLP, sequence prediction, time series forecasting
47
+ * Description: A simpler alternative to LSTM, using gates to control the flow of information
48
+ * Strengths: Faster computation, simpler than LSTM, easier to train
49
+ * Weaknesses: May not perform as well as LSTM, limited capacity to model long-term dependencies
50
+
51
+ ## **8. Batch Normalization Layers**
52
+
53
+ * Usage: Normalizing inputs, stabilizing training, improving performance
54
+ * Description: A layer that normalizes inputs, reducing internal covariate shift
55
+ * Strengths: Improves training stability, accelerates training, improves performance
56
+ * Weaknesses: Requires careful tuning of hyperparameters, can be computationally expensive
57
+
58
+ ## **9. Dropout Layers**
59
+
60
+ * Usage: Regularization, preventing overfitting
61
+ * Description: A layer that randomly drops out neurons during training, reducing overfitting
62
+ * Strengths: Effective regularization technique, reduces overfitting, improves generalization
63
+ * Weaknesses: Can slow down training, requires careful tuning of hyperparameters
64
+
65
+ ## **10. Flatten Layers**
66
+
67
+ * Usage: Reshaping data, preparing data for dense layers
68
+ * Description: A layer that flattens input data into a one-dimensional array
69
+ * Strengths: Essential for preparing data for dense layers, easy to implement
70
+ * Weaknesses: Limited functionality, no learning occurs in this layer
71
+
72
+ ## **11. Embedding Layers**
73
+
74
+ * Usage: NLP, word embeddings, language modeling
75
+ * Description: A layer that converts categorical data into dense vectors
76
+ * Strengths: Excellent for NLP tasks, reduces dimensionality, captures semantic relationships
77
+ * Weaknesses: Require large datasets, can be computationally expensive
78
+
79
+ ## **12. Attention Layers**
80
+
81
+ * Usage: NLP, machine translation, question answering
82
+ * Description: A layer that computes weighted sums of input data, focusing on relevant regions
83
+ * Strengths: Excellent for sequential data, can model long-range dependencies, improves performance
84
+ * Weaknesses: Computationally expensive, require careful tuning of hyperparameters
85
+
86
+ ## **13. Upsampling Layers**
87
+
88
+ * Usage: Image segmentation, object detection, image generation
89
+ * Description: A layer that increases spatial dimensions, using interpolation or learned upsampling filters
90
+ * Strengths: Excellent for image processing, improves spatial resolution, enables image generation
91
+ * Weaknesses: Computationally expensive, require careful tuning of hyperparameters
92
+
93
+ ## **14. Normalization Layers**
94
+
95
+ * Usage: Normalizing inputs, stabilizing training, improving performance
96
+ * Description: A layer that normalizes inputs, reducing internal covariate shift
97
+ * Strengths: Improves training stability, accelerates training, improves performance
98
+ * Weaknesses: Requires careful tuning of hyperparameters, can be computationally expensive
99
+
100
+ ## **15. Activation Functions**
101
+
102
+ * Usage: Introducing non-linearity, enhancing model capacity
103
+ * Description: A function that introduces non-linearity into the model, enabling complex representations
104
+ * Strengths: Enables complex representations, improves model capacity, enhances performance
105
+ * Weaknesses: Requires careful tuning of hyperparameters, can be computationally expensive
src/theory/metrics.qmd ADDED
# **Metrics for Model Performance Monitoring and Validation**

In machine learning, it's essential to evaluate the performance of a model to ensure it's accurate, reliable, and effective. There are various metrics to measure model performance, each with its strengths and limitations. Here's an overview of popular metrics, their pros and cons, and examples of tasks that apply to each.

## **1. Mean Squared Error (MSE)**

MSE measures the average squared difference between predicted and actual values.

Pros:

* Easy to calculate
* Penalizes large errors more strongly than small ones

Cons:

* Can be heavily influenced by outliers, since errors are squared

Example tasks:

* Regression tasks, such as predicting house prices or stock prices
* Time series forecasting

## **2. Mean Absolute Error (MAE)**

MAE measures the average absolute difference between predicted and actual values.

Pros:

* Robust to outliers
* Easy to interpret

Cons:

* Can be sensitive to skewness in the data

Example tasks:

* Regression tasks, such as predicting house prices or stock prices
* Time series forecasting

## **3. Mean Absolute Percentage Error (MAPE)**

MAPE measures the average absolute percentage difference between predicted and actual values.

Pros:

* Easy to interpret
* Sensitive to relative errors

Cons:

* Undefined when actual values are zero, and unstable when they are close to zero

Example tasks:

* Regression tasks, such as predicting house prices or stock prices
* Time series forecasting

## **4. R-Squared (R²)**

R² measures the proportion of variance in the dependent variable that's explained by the independent variables.

Pros:

* Easy to interpret
* Sensitive to the strength of the relationship

Cons:

* Can be sensitive to outliers
* Can be misleading for non-linear relationships

Example tasks:

* Regression tasks, such as predicting house prices or stock prices
* Feature selection
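All four regression metrics above take only a line or two of NumPy. A quick sketch with toy numbers:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

mse = np.mean((y_true - y_pred) ** 2)
mae = np.mean(np.abs(y_true - y_pred))
mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100  # assumes y_true has no zeros
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"MSE={mse:.3f}  MAE={mae:.3f}  MAPE={mape:.1f}%  R2={r2:.3f}")
```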
## **5. Brier Score**

The Brier Score measures the average squared difference between predicted probabilities and actual outcomes.

Pros:

* Sensitive to the quality (calibration) of the predicted probabilities
* Can be generalized to multi-class classification tasks

Cons:

* Less intuitive to interpret than accuracy; requires probabilistic predictions

Example tasks:

* Probabilistic binary classification, such as risk prediction
* Multi-class classification tasks

## **6. F1 Score**

The F1 Score measures the harmonic mean of precision and recall.

Pros:

* Sensitive to the balance between precision and recall
* Can handle imbalanced datasets

Cons:

* Can be sensitive to the choice of threshold

Example tasks:

* Binary classification tasks, such as spam detection
* Multi-class classification tasks

## **7. Matthews Correlation Coefficient (MCC)**

MCC measures the correlation between predicted and actual labels.

Pros:

* Sensitive to the quality of the predictions
* Can handle imbalanced datasets

Cons:

* Can be sensitive to the choice of threshold

Example tasks:

* Binary classification tasks, such as spam detection
* Multi-class classification tasks

## **8. Log Loss**

Log Loss (cross-entropy) measures the negative log-likelihood of the true labels under the model's predicted probabilities.

Pros:

* Sensitive to the quality of the predicted probabilities
* Can handle multi-class classification tasks

Cons:

* Heavily penalizes confident but wrong predictions, and is unbounded as predicted probabilities approach 0 or 1

Example tasks:

* Multi-class classification tasks, such as image classification
* Multi-label classification tasks
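The classification metrics above are one-liners in scikit-learn (assuming it is installed). A quick sketch:

```python
from sklearn.metrics import brier_score_loss, f1_score, log_loss, matthews_corrcoef

y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]              # hard labels, for F1 and MCC
y_prob = [0.1, 0.8, 0.4, 0.2, 0.9, 0.6]  # predicted P(class = 1), for Brier and log loss

print("F1      :", f1_score(y_true, y_pred))
print("MCC     :", matthews_corrcoef(y_true, y_pred))
print("Brier   :", brier_score_loss(y_true, y_prob))
print("Log loss:", log_loss(y_true, y_prob))
```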
When choosing a metric, consider the specific task, data characteristics, and desired outcome. It's essential to understand the strengths and limitations of each metric to ensure accurate model evaluation.