Data Science and Machine Learning in Python and R – Course Outline (August 1, 2019)


This course kicks off on August 1, 2019 and will run for 6 to 8 weeks.

Download Course Outline in Excel

You will find plenty of practicals and labs to work through, so take some time to go through the course outline below and let us know what you think.

Note: Some of the sections may overlap and that is perfectly fine.

 

PART 1 – INTRODUCTION
1. Introduction to AI and Machine Learning. Covers: introductory notes; computer specification; how the course is arranged and its prerequisites; relevant applications to be covered; aspects and branches of AI; AI vs ML vs DL; machine learning vs data science vs statistics; relevance of mathematics. Outcome: set expectations for the course.
2. Review of Relevant Math Topics. Covers: applicable mathematics topics including set theory, matrices, calculus (differentiation and integration), linear and non-linear algebra, and statistics. Outcome: ability to solve math problems by hand.
3. Overview of Machine Learning and Some Basic Terms. Covers: what machine learning is; how machine learning is applied in problem solving; conventional analysis and programming approaches versus machine learning; illustration using the handwritten-digits example; key terms explained (training set, target vector, test set, validation set, feature set, etc.); the training or learning process; preprocessing; feature extraction. Outcome: understand the application areas of machine learning and its terminology.
4. Practical 1. Covers: Python installation and setup using Anaconda; setup of RStudio; introduction to Jupyter Notebook; using Python's IDLE; getting around the Jupyter Notebook IDE; performing mathematical operations in Python and R (see the short sketch after this part's outline). Outcome: understand how to set up the relevant IDEs. Hands-on: Yes.
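As a taste of Practical 1, here is a minimal sketch of basic mathematical operations in Python; the exact exercises used in class may differ.

# Basic mathematical operations in Python (illustrative only)
import math

a, b = 7, 3
print(a + b, a - b, a * b)    # addition, subtraction, multiplication
print(a / b)                  # true division -> 2.333...
print(a // b, a % b)          # integer division and remainder
print(a ** b)                 # exponentiation
print(math.sqrt(2), math.pi)  # square root and the constant pi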

 

PART 2 – PROGRAMMING
5. Practical 2. Covers: installation and setup of PyCharm (optional); programming basics with Python: data types, variables, loops, conditional statements, lists, sets, tuples and dictionaries; introduction to modules (see the sketch below). Outcome: be able to write and run Python programs in an IDE. Hands-on: Yes.
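A minimal sketch of the Python basics listed above (data types, the built-in collections, loops and conditionals); the course exercises may go further than this.

# Python programming basics: variables, collections, loops and conditionals
scores = [65, 82, 90, 47]                # list
unique = {1, 2, 2, 3}                    # set (duplicates removed)
point = (3.0, 4.0)                       # tuple
student = {"name": "Ada", "score": 82}   # dictionary

total = 0
for s in scores:                         # loop over a list
    if s >= 50:                          # conditional statement
        total += s

print(f"{student['name']} scored {student['score']}")
print("Sum of passing scores:", total)
print("Unique values:", unique, "Point:", point)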

 

PART 3 – SUPERVISED LEARNING (REGRESSION)
6. Classes of Machine Learning Problems. Covers: supervised, unsupervised and reinforcement learning; further subdivisions; examples of each. Outcome: ability to differentiate between the various classes of ML problems.
7. Practical 3. Covers: dataset acquisition; exploring online dataset repositories; importing tabular data (txt, csv, xlsx) into Python; exporting data from Python to spreadsheets; getting datasets from R libraries. Outcome: scholar should be able to acquire and manipulate datasets. Hands-on: Yes.
8. Approach to Simple Linear Regression. Covers: review of linear regression; illustrating regression using the marketing ads dataset; deducing the equation of a regression line by a parametric approach; types of regression; the concepts of inference and prediction. Outcome: be able to perform regression analysis.
9. Practical 4. Covers: plotting in Python and the main types of plots (see the plotting sketch after this part's outline). Outcome: scholar should be able to create different types of plots in Python. Hands-on: Yes.
10. Analyzing the Equation of a Regression Line. Covers: analyzing the equation of a linear regression line (y = mx + c); slope and intercept.
11. Practical 5. Covers: linear regression demo in Python (see the regression sketch after this part's outline). Hands-on: Yes.
12. Polynomial Curve Fitting. Covers: linear regression review; the polynomial function and its coefficients; determination of the weight vector w; the error function. Outcome: ability to fit a polynomial curve.
13. Overfitting and Underfitting. Covers: polynomial curve fitting review; the concept of model complexity; what happens when the polynomial order is too low or too high; illustration using graphs. Outcome: understand model complexity.
14. Practical 6. Covers: hands-on overfitting and underfitting in Python. Hands-on: Yes.
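A minimal sketch in the spirit of Practicals 5 and 6: a simple linear fit followed by polynomial fits of increasing order on synthetic data. The data and the orders shown are illustrative assumptions, not the course datasets.

# Simple linear regression and polynomial fits of increasing order
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)  # noisy target

# Linear fit (y = m*x + c) via least squares
m, c = np.polyfit(x, y, deg=1)
print(f"slope m = {m:.3f}, intercept c = {c:.3f}")

# Polynomial fits: a low order tends to underfit, a high order fits the noise
for order in (1, 3, 9):
    coeffs = np.polyfit(x, y, deg=order)
    pred = np.polyval(coeffs, x)
    rmse = np.sqrt(np.mean((y - pred) ** 2))
    print(f"order {order}: training RMSE = {rmse:.3f}")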

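And a matching sketch for Practical 4's plotting topic, using matplotlib; the plot types shown are just a small sample of what the practical may cover.

# A few common plot types in matplotlib
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
fig, axes = plt.subplots(1, 3, figsize=(12, 3))

axes[0].plot(x, np.sin(x))                               # line plot
axes[0].set_title("Line plot")

axes[1].scatter(np.random.rand(50), np.random.rand(50))  # scatter plot
axes[1].set_title("Scatter plot")

axes[2].hist(np.random.randn(500), bins=20)              # histogram
axes[2].set_title("Histogram")

plt.tight_layout()
plt.show()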
 

PART 4 – PROBABILITY THEORY
15. Introduction to Probability Theory. Covers: review of the concept of probability; simple examples; random experiments and random variables; some common notation; take-home quiz. Outcome: be able to explain basic probability theory. Assessment: Quiz.
16. Rules of Probability and Bayes’ Theorem. Covers: derivation of the rules of probability; marginal, joint and conditional probability; the sum rule and the product rule; derivation of Bayes’ theorem; the key formulas. Outcome: derive Bayes’ theorem from scratch.
17. Application of Bayes’ Theorem. Covers: problem scenarios applying Bayes’ theorem; finding marginal, joint and conditional probabilities; the concepts of posterior and prior; the concept of independence; the confusion matrix (see the worked sketch after this part's outline). Outcome: be able to apply Bayes’ theorem in real scenarios.
18. Understanding Probability Density. Covers: continuous versus discrete random variables; probability density and density functions; probability distributions. Outcome: be able to explain probability distributions.
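A worked sketch of Bayes' theorem for Topic 17, using an illustrative diagnostic-test scenario; the numbers are assumptions chosen for the example, not course data.

# Bayes' theorem: P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
p_disease = 0.01             # prior: 1% of the population has the disease (assumption)
p_pos_given_disease = 0.95   # sensitivity (true positive rate)
p_pos_given_healthy = 0.05   # false positive rate

# Marginal probability of a positive test (sum rule + product rule)
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

# Posterior probability of disease given a positive result
p_disease_given_pos = p_pos_given_disease * p_disease / p_positive
print(f"P(disease | positive) = {p_disease_given_pos:.3f}")  # ~0.161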

 

PART 5 – SUPERVISED LEARNING (CLASSIFICATION)
19. Practical 7. Covers: building a simple model for prediction using the Advertising dataset. Hands-on: Yes.
20. Bias/Variance Trade-off. Covers: model bias and variance; the trade-off; analysing the mean squared error (MSE) function; decomposition of the MSE; model complexity versus flexibility; analysis of the bias/variance trade-off graph; review of underfitting and overfitting. Outcome: scholar should be able to analyse the bias/variance trade-off graph.
21. Quiz on Modules 1 to 20. Covers: quiz on Topics 1 to 20. Assessment: Quiz.
22. Introduction to Classification. Covers: what classification is; review of mean squared error in classification; classification versus regression; the error rate in classification; a typical classification problem; the cancer diagnosis example; the concepts of inference and decision. Outcome: illustrate and apply the concept of classification.
23. Practical 8. Covers: building a machine learning model for classification using the Iris dataset (see the classification sketch after this part's outline). Hands-on: Yes.
24. The Bayes Classifier and How It Works. Covers: the Bayes classifier and how it works. Outcome: a clear understanding of the Bayes classifier.
25. K-Nearest Neighbors Classifier. Covers: overview of the k-nearest neighbors (KNN) classifier and how it works; compare and contrast with the Bayes classifier; the KNN algorithm; illustration/animation. Outcome: gain knowledge of other classifiers (KNN).
26. Classification Rate Minimization, ROC and AUC. Covers: misclassification in the Bayes classifier and how to minimize it; decision boundaries; review of the rules of probability; false positives and false negatives; the receiver operating characteristic (ROC) curve and the area under the curve (AUC), with a short sketch after this part's outline. Outcome: learn how to improve model performance; understand ROC and AUC.
27. Practical 9. Covers: an attempt at enterprise data analysis (stock market data, tweets, etc.). Hands-on: Yes.
28. Introduction to Multiple Linear Regression. Covers: examining the Advertising dataset; importing it into Jupyter Notebook; determining how each predictor variable affects the response variable; calculating regression coefficients; how multiple linear regression works; some Python practicals. Hands-on: Yes.
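A minimal sketch along the lines of Practical 8 and Topic 25: a KNN classifier on the Iris dataset with a held-out test set and a confusion matrix. scikit-learn is assumed to be available; the course may use a different split or value of k.

# KNN classification on the Iris dataset with a train/test split
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

knn = KNeighborsClassifier(n_neighbors=5)   # k = 5 neighbours (an assumption)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))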

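For Topic 26, a small sketch of an ROC curve and AUC on a synthetic binary problem; logistic regression is used here only as a convenient probabilistic classifier (it is covered formally in Part 6), and the dataset is an assumption.

# ROC curve and AUC for a binary classifier on synthetic data
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = clf.predict_proba(X_test)[:, 1]          # probability of the positive class

fpr, tpr, thresholds = roc_curve(y_test, scores)  # false/true positive rates
print("AUC:", roc_auc_score(y_test, scores))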
 

PART 6 – UNSUPERVISED LEARNING
29. Introduction to Principal Components Analysis. Covers: the concept of unlabeled data; approaches to dimensionality reduction; theoretical overview of PCA; review of matrix operations (multiplication); determining the number of components; the scree plot; basics of singular value decomposition (SVD). Outcome: be able to explain the concept of dimensionality reduction and perform PCA.
30. Practical 10. Covers: principal components analysis of the Wine dataset (see the PCA sketch after this part's outline). Hands-on: Yes.
31. Introduction to Factor Analysis. Covers: approaches to factor analysis; tests for sampling adequacy; the KMO statistic; scores and loadings; interpreting factor analysis results. Outcome: gain a good knowledge of factor analysis.
32. Practical 11. Covers: factor analysis in Python/R. Hands-on: Yes.
33. Basics of Logistic Regression. Covers: review of the classes of ML problems; basics of logistic regression; probability distributions; the logistic function; odds and the odds ratio; the log-odds. Outcome: be able to differentiate between the step function and the sigmoid function.
34. Introduction to Artificial Neural Networks. Covers: background of artificial neural networks; review of calculus (differentiation and partial derivatives); the sigmoid neuron; weights and biases; network activation and activation functions; hidden layers; the perceptron and the multilayer perceptron; network training; backpropagation and gradient descent; network training algorithms. Outcome: gain in-depth knowledge of neural networks.
35. Practical 12. Covers: solving a neural network by hand. Hands-on: Yes.
36. Introduction to Clustering and Cluster Analysis. Covers: review of the KNN classifier; types of clustering: hierarchical and agglomerative; k-means clustering; k-nearest neighbor; the algorithms and the trade-offs (see the clustering sketch after this part's outline).
37. Practical 13. Covers: clustering demo in Python; building a dendrogram. Hands-on: Yes.
38. Practical 14. Covers: non-linear modelling using the Wage dataset. Hands-on: Yes.
39. Introduction to Tree-Based Modelling. Covers: background of decision trees; decision trees for classification and regression; pruning; assumptions; splitting. Outcome: scholar should be able to build a decision tree for classification.
40. Review of all topics.
41. Next steps.
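A minimal sketch for Practical 10: PCA on the Wine dataset bundled with scikit-learn, printing the explained-variance ratios that a scree plot would show. The choice of two components is an illustrative assumption.

# PCA on the Wine dataset: standardize, project, inspect explained variance
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X, y = load_wine(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scale

pca = PCA(n_components=2)                      # keep two components (assumption)
X_pca = pca.fit_transform(X_scaled)

print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Shape before/after:", X.shape, "->", X_pca.shape)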

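And a clustering sketch in the spirit of Topic 36 and Practical 13: k-means plus a hierarchical linkage drawn as a dendrogram. The Iris features and the choice of three clusters are assumptions; the course demo may use different data.

# K-means clustering and a hierarchical dendrogram on the Iris features
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

X, _ = load_iris(return_X_y=True)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)  # 3 clusters (assumption)
print("Cluster sizes:", [int((kmeans.labels_ == k).sum()) for k in range(3)])

Z = linkage(X, method="ward")        # agglomerative (Ward) clustering linkage
dendrogram(Z, no_labels=True)        # draw the dendrogram
plt.title("Hierarchical clustering dendrogram")
plt.show()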