






BSc: Introduction To Machine Learning
=====================================






Contents
--------


* [1 Introduction to Machine Learning](#Introduction_to_Machine_Learning)
	+ [1.1 Short Description](#Short_Description)
	+ [1.2 Prerequisites](#Prerequisites)
		- [1.2.1 Prerequisite subjects](#Prerequisite_subjects)
		- [1.2.2 Prerequisite topics](#Prerequisite_topics)
	+ [1.3 Course Topics](#Course_Topics)
	+ [1.4 Intended Learning Outcomes (ILOs)](#Intended_Learning_Outcomes_.28ILOs.29)
		- [1.4.1 What is the main purpose of this course?](#What_is_the_main_purpose_of_this_course.3F)
		- [1.4.2 ILOs defined at three levels](#ILOs_defined_at_three_levels)
			* [1.4.2.1 Level 1: What concepts should a student know/remember/explain?](#Level_1:_What_concepts_should_a_student_know.2Fremember.2Fexplain.3F)
			* [1.4.2.2 Level 2: What basic practical skills should a student be able to perform?](#Level_2:_What_basic_practical_skills_should_a_student_be_able_to_perform.3F)
			* [1.4.2.3 Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios?](#Level_3:_What_complex_comprehensive_skills_should_a_student_be_able_to_apply_in_real-life_scenarios.3F)
	+ [1.5 Grading](#Grading)
		- [1.5.1 Course grading range](#Course_grading_range)
		- [1.5.2 Course activities and grading breakdown](#Course_activities_and_grading_breakdown)
		- [1.5.3 Recommendations for students on how to succeed in the course](#Recommendations_for_students_on_how_to_succeed_in_the_course)
	+ [1.6 Resources, literature and reference materials](#Resources.2C_literature_and_reference_materials)
		- [1.6.1 Open access resources](#Open_access_resources)
		- [1.6.2 Closed access resources](#Closed_access_resources)
		- [1.6.3 Software and tools used within the course](#Software_and_tools_used_within_the_course)
* [2 Teaching Methodology: Methods, techniques, & activities](#Teaching_Methodology:_Methods.2C_techniques.2C_.26_activities)
	+ [2.1 Activities and Teaching Methods](#Activities_and_Teaching_Methods)
	+ [2.2 Formative Assessment and Course Activities](#Formative_Assessment_and_Course_Activities)
		- [2.2.1 Ongoing performance assessment](#Ongoing_performance_assessment)
			* [2.2.1.1 Section 1](#Section_1)
			* [2.2.1.2 Section 2](#Section_2)
			* [2.2.1.3 Section 3](#Section_3)
			* [2.2.1.4 Section 4](#Section_4)
		- [2.2.2 Final assessment](#Final_assessment)
		- [2.2.3 The retake exam](#The_retake_exam)



Introduction to Machine Learning
================================


* **Course name**: Introduction to Machine Learning
* **Code discipline**: R-01
* **Subject area**:


Short Description
-----------------


This course covers the following concepts: machine learning paradigms, approaches, and algorithms.



Prerequisites
-------------


### Prerequisite subjects


* CSE202 — Analytical Geometry and Linear Algebra I
* CSE204 — Analytical Geometry and Linear Algebra II
* CSE201 — Mathematical Analysis I
* CSE203 — Mathematical Analysis II
* CSE206 — Probability And Statistics
* CSE117 — Data Structures and Algorithms: Python, NumPy, basic object-oriented concepts, memory management.


### Prerequisite topics


Course Topics
-------------




Course Sections and Topics
| Section | Topics within the section |
| --- | --- |
| Supervised Learning | 1. Introduction to Machine Learning; 2. Derivatives and Cost Function; 3. Data Pre-processing; 4. Linear Regression; 5. Multiple Linear Regression; 6. Gradient Descent; 7. Polynomial Regression; 8. Bias-variance Tradeoff; 9. Difference between classification and regression; 10. Logistic Regression; 11. Naive Bayes; 12. KNN; 13. Confusion Matrix; 14. Performance Metrics; 15. Regularization; 16. Hyperplane Based Classification; 17. Perceptron Learning Algorithm; 18. Max-Margin Classification; 19. Support Vector Machines; 20. Slack Variables; 21. Lagrangian Support Vector Machines; 22. Kernel Trick |
| Decision Trees and Ensemble Methods | 1. Decision Trees; 2. Bagging; 3. Boosting; 4. Random Forest; 5. Adaboost |
| Unsupervised Learning | 1. K-means Clustering; 2. K-means++; 3. Hierarchical Clustering; 4. DBSCAN; 5. Mean-shift |
| Deep Learning | 1. Artificial Neural Networks; 2. Back-propagation; 3. Convolutional Neural Networks; 4. Autoencoder; 5. Variational Autoencoder; 6. Generative Adversarial Networks |


Intended Learning Outcomes (ILOs)
---------------------------------


### What is the main purpose of this course?


There is a growing business need for individuals skilled in artificial intelligence, data analytics, and machine learning. The purpose of this course is therefore to give students an intensive treatment of a cross-section of the key elements of machine learning, with an emphasis on implementing them in modern programming environments and using them to solve real-world data science problems.



### ILOs defined at three levels


#### Level 1: What concepts should a student know/remember/explain?


By the end of the course, the students should be able to ...



* Different learning paradigms
* A wide variety of learning approaches and algorithms
* Various learning settings
* Performance metrics
* Popular machine learning software tools


#### Level 2: What basic practical skills should a student be able to perform?


By the end of the course, the students should be able to ...



* Difference between different learning paradigms
* Difference between classification and regression
* Concept of learning theory (bias/variance tradeoffs and large margins etc.)
* Kernel methods
* Regularization
* Ensemble Learning
* Neural or Deep Learning


#### Level 3: What complex comprehensive skills should a student be able to apply in real-life scenarios?


By the end of the course, the students should be able to ...



* Classification approaches to solve supervised learning problems
* Clustering approaches to solve unsupervised learning problems
* Ensemble learning to improve a model’s performance
* Regularization to improve a model’s generalization
* Deep learning algorithms to solve real-world problems


Grading
-------


### Course grading range





| Grade | Range | Description of performance |
| --- | --- | --- |
| A. Excellent | 90-100 | - |
| B. Good | 75-89 | - |
| C. Satisfactory | 60-74 | - |
| D. Poor | 0-59 | - |


### Course activities and grading breakdown





| Activity Type | Percentage of the overall course grade |
| --- | --- |
| Labs/seminar classes | 0 |
| Interim performance assessment | 40 |
| Exams | 60 |


### Recommendations for students on how to succeed in the course


Resources, literature and reference materials
---------------------------------------------


### Open access resources


* T. Hastie, R. Tibshirani, D. Witten and G. James. An Introduction to Statistical Learning. Springer 2013.
* T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer 2011.
* Tom M. Mitchell. Machine Learning. McGraw Hill.
* Christopher M. Bishop. Pattern Recognition and Machine Learning, Springer


### Closed access resources


### Software and tools used within the course


Teaching Methodology: Methods, techniques, & activities
=======================================================


Activities and Teaching Methods
-------------------------------




Activities within each section
| Learning Activities | Section 1 | Section 2 | Section 3 | Section 4 |
| --- | --- | --- | --- | --- |
| Development of individual parts of software product code | 1 | 1 | 1 | 1 |
| Homework and group projects | 1 | 1 | 1 | 1 |
| Midterm evaluation | 1 | 1 | 1 | 1 |
| Testing (written or computer based) | 1 | 1 | 1 | 1 |
| Discussions | 1 | 1 | 1 | 1 |


Formative Assessment and Course Activities
------------------------------------------


### Ongoing performance assessment


#### Section 1





| Activity Type | Content | Is Graded? |
| --- | --- | --- |
| Question | Is it true that in simple linear regression $R^{2}$ and the squared correlation between X and Y are identical? | 1 |
| Question | What are the two assumptions that the linear regression model makes about the error terms? | 1 |
| Question | Fit a regression model to a given data problem, and support your choice of the model. | 1 |
| Question | In a list of given tasks, choose which are regression and which are classification tasks. | 1 |
| Question | In a given graphical model of binary random variables, how many parameters are needed to define the Conditional Probability Distributions for this Bayes Net? | 1 |
| Question | Write the mathematical form of the minimization objective of Rosenblatt’s perceptron learning algorithm for a two-dimensional case. | 1 |
| Question | What is the perceptron learning algorithm? | 1 |
| Question | Write the mathematical form of its minimization objective for a two-dimensional case. | 1 |
| Question | What is a max-margin classifier? | 1 |
| Question | Explain the role of the slack variable in SVM. | 1 |
| Question | How to implement various regression models to solve different regression problems? | 0 |
| Question | Describe the difference between different types of regression models, their pros and cons, etc. | 0 |
| Question | Implement various classification models to solve different classification problems. | 0 |
| Question | Describe the difference between logistic regression and Naive Bayes. | 0 |
| Question | Implement the perceptron learning algorithm, SVMs, and their variants to solve different classification problems. | 0 |
| Question | Solve a given optimization problem using the Lagrange multiplier method. | 0 |
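
As an illustration of the first question in this table, the following is a minimal sketch (not part of the official course materials) that fits a simple linear regression by least squares on synthetic data and compares $R^{2}$ with the squared sample correlation between X and Y. The data-generating parameters and all variable names are assumptions chosen for the example.

```python
import random

# Synthetic data: y depends linearly on x plus Gaussian noise (assumed setup).
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [2.0 * xi + 1.0 + random.gauss(0.0, 0.5) for xi in x]


def mean(values):
    return sum(values) / len(values)


mx, my = mean(x), mean(y)
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

# Closed-form least squares estimates for y = intercept + slope * x.
slope = sxy / sxx
intercept = my - slope * mx

# R^2 = 1 - (residual sum of squares) / (total sum of squares).
ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1.0 - ss_res / syy

# Squared Pearson correlation between x and y.
corr_squared = sxy ** 2 / (sxx * syy)

print(r_squared, corr_squared)
```

The two printed values agree up to floating-point error. This identity is specific to simple linear regression with an intercept, where the residual sum of squares equals `syy - sxy**2 / sxx`.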


#### Section 2





| Activity Type | Content | Is Graded? |
| --- | --- | --- |
| Question | What are the pros and cons of decision trees compared with other classification models? | 1 |
| Question | Explain how tree-pruning works. | 1 |
| Question | What is the purpose of ensemble learning? | 1 |
| Question | What is a bootstrap, and what is its role in ensemble learning? | 1 |
| Question | Explain the role of the slack variable in SVM. | 1 |
| Question | Implement different variants of decision trees to solve different classification problems. | 0 |
| Question | Solve a given classification problem using an ensemble classifier. | 0 |
| Question | Implement Adaboost for a given problem. | 0 |


#### Section 3





| Activity Type | Content | Is Graded? |
| --- | --- | --- |
| Question | Which implicit or explicit objective function does K-means implement? | 1 |
| Question | Explain the difference between k-means and k-means++. | 1 |
| Question | What is single-linkage and what are its pros and cons? | 1 |
| Question | Explain how DBSCAN works. | 1 |
| Question | Implement different clustering algorithms to solve different clustering problems. | 0 |
| Question | Implement Mean-shift for video tracking. | 0 |
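
To make the first question in this table concrete, here is a minimal sketch of Lloyd's iteration for K-means that records the within-cluster sum of squared distances after every step. The helper names (`assign`, `update`, `wcss`), the synthetic data, and the random initialisation are assumptions made for illustration; they are not taken from the course materials.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign(points, centers):
    """Assign each point to the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]


def update(points, labels, centers):
    """Move each center to the mean of its members (unchanged if empty)."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centers.append(old)
            continue
        new_centers.append(tuple(sum(m[d] for m in members) / len(members)
                                 for d in range(len(old))))
    return new_centers


def wcss(points, labels, centers):
    """Within-cluster sum of squares: the quantity K-means decreases."""
    return sum(sq_dist(p, centers[label]) for p, label in zip(points, labels))


random.seed(1)
blobs = [(0.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
points = [(random.gauss(cx, 0.5), random.gauss(cy, 0.5))
          for cx, cy in blobs for _ in range(30)]
centers = random.sample(points, 3)  # plain random initialisation

history = []
for _ in range(10):
    labels = assign(points, centers)
    centers = update(points, labels, centers)
    history.append(wcss(points, labels, centers))

print(history)
```

Neither the assignment step nor the update step can increase the within-cluster sum of squares, so the recorded values never go up. That is the sense in which K-means implicitly minimises this objective, although it may stop at a local minimum.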


#### Section 4





| Activity Type | Content | Is Graded? |
| --- | --- | --- |
| Question | What is a fully connected feed-forward ANN? | 1 |
| Question | Explain different hyperparameters of CNNs. | 1 |
| Question | Calculate KL-divergence between two probability distributions. | 1 |
| Question | What is a generative model and how is it different from a discriminative model? | 1 |
| Question | Implement different types of ANNs to solve different classification problems. | 0 |
| Question | Calculate KL-divergence between two probability distributions. | 0 |
| Question | Implement different generative models for different problems. | 0 |
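
For the KL-divergence questions in this table, a small worked example may help. The sketch below computes $D_{KL}(P \| Q) = \sum_i p_i \ln(p_i / q_i)$ for two discrete distributions over the same outcomes. The function name `kl_divergence` and the example distributions are assumptions chosen for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue          # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0.0:
            return math.inf   # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


p = [0.5, 0.5]
q = [0.9, 0.1]

forward = kl_divergence(p, q)   # 0.5*ln(0.5/0.9) + 0.5*ln(0.5/0.1)
reverse = kl_divergence(q, p)   # 0.9*ln(0.9/0.5) + 0.1*ln(0.1/0.5)

print(forward, reverse)
```

The two directions give different values, which shows that KL-divergence is not symmetric, and `kl_divergence(p, p)` is zero.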


### Final assessment


**Section 1**



1. What does it mean for the standard least squares coefficient estimates of linear regression to be scale equivariant?
2. Given a fitted regression model to a dataset, interpret its coefficients.
3. Explain which regression model would be a better fit to model the relationship between response and predictor in a given dataset.
4. If the number of training examples goes to infinity, how will it affect the bias and variance of a classification model?
5. Given a two dimensional classification problem, determine if by using Logistic regression and regularization, a linear boundary can be estimated or not.
6. Explain which classification model would be a better fit for a given classification problem.
7. Consider the Leave-one-out-CV error of standard two-class SVM. Argue that under a given value of slack variable, a given mathematical statement is either correct or incorrect.
8. How does the choice of slack variable affect the bias-variance tradeoff in SVM?
9. Explain which kernel would be a better fit to be used in SVM for a given dataset.
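
The scale equivariance in question 1 can be checked numerically. The sketch below is an illustration under assumed data, not an official solution: it fits a simple least squares line, rescales the predictor by a constant `c`, and refits. The helper `fit_line` and the numbers are hypothetical.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least squares line y = b0 + b1 * x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
c = 1000.0  # e.g. switching the predictor's unit from kilometres to metres

b0, b1 = fit_line(x, y)
b0_scaled, b1_scaled = fit_line([c * xi for xi in x], y)

# Fitted values before and after rescaling the predictor.
pred = [b0 + b1 * xi for xi in x]
pred_scaled = [b0_scaled + b1_scaled * (c * xi) for xi in x]

print(b1, b1_scaled * c)
```

Multiplying the predictor by `c` divides its coefficient by `c` and leaves the intercept and the fitted values unchanged. Penalised methods such as ridge regression do not have this property in general.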


**Section 2**



1. When a decision tree is grown to full depth, how does it affect the tree’s bias and variance, and its response to noisy data?
2. Argue if an ensemble model would be a better choice for a given classification problem or not.
3. Given a particular iteration of boosting and other important information, calculate the weights of the Adaboost classifier.
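
For question 3, the sketch below works through one boosting round. It assumes the common discrete AdaBoost formulation with labels in {-1, +1}, classifier weight $\alpha = \tfrac{1}{2}\ln\frac{1-\varepsilon}{\varepsilon}$, and sample weights renormalised after each round; the course may use a different but equivalent convention. The function name `adaboost_round` and the toy labels are hypothetical.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One round of discrete AdaBoost (assumed formulation, labels in {-1, +1}).

    Returns the weak classifier's weight alpha and the renormalised
    sample weights for the next round. Requires 0 < weighted error < 1.
    """
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified points (t * p == -1) are up-weighted, correct ones down-weighted.
    updated = [w * math.exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    z = sum(updated)
    return alpha, [u / z for u in updated]


# Five points with uniform weights; the weak classifier gets the last one wrong.
weights = [0.2] * 5
y_true = [+1, +1, -1, -1, +1]
y_pred = [+1, +1, -1, -1, -1]

alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(alpha, new_weights)
```

With a weighted error of 0.2, the classifier weight is $\tfrac{1}{2}\ln 4 = \ln 2$. After renormalisation the single misclassified point carries half of the total weight, and each correctly classified point carries 0.125.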


**Section 3**



1. K-Means does not explicitly use a fitness function. What are the characteristics of the solutions that K-Means finds? Which fitness function does it implicitly minimize?
2. Suppose we clustered a set of N data points using two different specified clustering algorithms. In both cases we obtained 5 clusters and in both cases the centers of the clusters are exactly the same. Can 3 points that are assigned to different clusters in one method be assigned to the same cluster in the other method?
3. What are the characteristics of noise points in DBSCAN?


**Section 4**



1. Explain what ReLU is, what its different variants are, and what their pros and cons are.
2. Calculate the number of parameters to be learned during training in a CNN, given all important information.
3. Explain how a VAE can be used as a generative model.
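
Question 2 comes down to simple arithmetic once the layer shapes are known. The sketch below counts trainable parameters for a small, made-up network. It assumes one bias per convolutional filter and per dense unit, and no parameters in pooling or activation layers. The helper names and the layer sizes are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Each filter has kernel_h * kernel_w * in_channels weights plus an optional bias."""
    per_filter = kernel_h * kernel_w * in_channels + (1 if bias else 0)
    return per_filter * out_channels


def dense_params(in_features, out_features, bias=True):
    """Each output unit has in_features weights plus an optional bias."""
    return (in_features + (1 if bias else 0)) * out_features


# Hypothetical network: two 3x3 convolutions followed by a dense classifier.
layers = [
    conv2d_params(3, 3, 3, 16),    # (3*3*3 + 1) * 16
    conv2d_params(3, 3, 16, 32),   # (3*3*16 + 1) * 32
    dense_params(128, 10),         # (128 + 1) * 10
]
total = sum(layers)

print(layers, total)
```

Under these assumptions, the number of parameters in a convolutional layer does not depend on the spatial size of its input. The input size affects the count only through the number of features that reach the first dense layer.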


### The retake exam


**Section 1**


**Section 2**


**Section 3**


**Section 4**