Data Science and Machine Learning in Python and R – Course Outline (August 1, 2019)

This course kicks off August 1, 2019. It will run for 6 to 8 weeks.

Download Course Outline in Excel

You will find many practicals and labs to work through, so take some time to go through the course outline below and let us know what you think.

Note: Some of the sections may overlap and that is perfectly fine.

 

PART 1 – INTRODUCTION
1 | Introduction to AI and Machine Learning | Introductory notes. Computer specification. How the course is arranged and its prerequisites. Relevant applications to be covered. Aspects and branches of AI. AI vs ML vs DL. Machine Learning vs Data Science vs Statistics. Relevance of Mathematics | Set expectations for the course
2 | Review of Relevant Math Topics | Applicable mathematics topics including: Set Theory, Matrices, Calculus (differentiation and integration), Linear and Non-linear algebra, Statistics | Ability to solve Math problems by hand
3 | Overview of Machine Learning and Some Basic Terms | What is Machine Learning? How Machine Learning is applied in problem solving. Conventional analysis and programming approaches versus machine learning. Illustration using the handwritten digits example. Some terms explained: training set, target vector, test set, validation set, feature set, etc. Training or learning process. Preprocessing. Feature extraction | Understand the application areas of machine learning and machine learning terms
4 | Practical 1 | Python installation and setup using Anaconda. Setup of RStudio. Introduction to Jupyter Notebook. Using Python IDLE. Getting around in the Jupyter Notebook IDE. Performing mathematical operations in Python and R. | Understand how to set up relevant IDEs | Yes

 

PART 2 – PROGRAMMING
5 | Practical 2 | Installation and setup of PyCharm (optional). Programming basics with Python: Data Types, Variables, Loops, Conditional Statements, Lists, Sets, Tuples and Dictionaries. Introduction to Modules | Be able to write and run Python programs in an IDE | Yes

 

PART 3 – SUPERVISED LEARNING (REGRESSION)
6 | Classes of Machine Learning Problems | Supervised, Unsupervised and Reinforcement Learning. Further subdivisions. Examples of each | Ability to differentiate between various classes of ML problems
7 | Practical 3 | Dataset acquisition. Explore online dataset repositories. Import tabular data (txt, csv, xlsx) into Python. Export data from Python to spreadsheets. Getting datasets from R libraries | Scholar should be able to acquire and manipulate datasets | Yes
8 | Approach to Simple Linear Regression | Review of Linear Regression. Illustrate regression using the Marketing Ads dataset. Deduce the equation of a regression by the parametric approach. State the types of regression. Concept of Inference and Prediction | Be able to perform regression analysis
9 | Practical 4 | Plotting in Python: Types of Plots. | Scholar should be able to create different types of plots in Python | Yes
10 | Analyzing the Equation of a Regression Line | Analyzing the equation of a linear regression line (y = mx + c). Slope and Intercept
11 | Practical 5 | Linear Regression demo in Python | - | Yes
12 | Polynomial Curve Fitting | Linear Regression review. Polynomial function. Coefficient. Determination of w. Error function | Ability to fit a polynomial curve
13 | Overfitting and Underfitting | Polynomial curve fitting review. Concept of model complexity. Order of polynomial being too low or too high. Illustrate using graphs | Understand model complexity
14 | Practical 6 | Hands-on with Overfitting and Underfitting in Python | - | Yes
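
Modules 10 and 11 revolve around the line y = mx + c, its slope and its intercept. As a rough preview of what the Linear Regression demo might look like, here is a minimal sketch that fits m and c by ordinary least squares using only the Python standard library. The function names `fit_line` and `mean_squared_error` and the data points are made up for illustration; they are not taken from the course material.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    c = mean_y - m * mean_x
    return m, c


def mean_squared_error(xs, ys, m, c):
    """Average squared gap between observed y and predicted m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data, invented for this sketch.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

m, c = fit_line(xs, ys)
print("slope m:", round(m, 3))
print("intercept c:", round(c, 3))
print("MSE:", round(mean_squared_error(xs, ys, m, c), 4))
```

The slope tells you how much y changes per unit of x, and the intercept is the predicted y when x is 0. The mean squared error is the same error function that comes up again in the polynomial curve fitting and bias/variance modules.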

 

PART 4 – PROBABILITY THEORY
15 | Introduction to Probability Theory | Review of the concept of Probability. Simple examples. Random experiments and random variables. Some common notations. Take-home quiz | Be able to explain basic probability theory | Quiz
16 | Rules of Probability and Bayes’ Theorem | Derivation of the rules of Probability. Marginal Probability. Joint Probability and Conditional Probability. (1) Sum Rule (2) Product Rule. Derivation of Bayes’ Theorem. Formulas | Derive Bayes’ theorem from scratch
17 | Application of Bayes’ Theorem | Problem scenarios applying Bayes’ Theorem. Finding Marginal, Joint and Conditional probabilities. Concept of Posterior and Prior. Concept of Independence. The confusion matrix | Be able to apply Bayes’ theorem in real scenarios
18 | Understanding Probability Density | Continuous random variable versus discrete random variable. Probability density and density functions. Probability distributions. | Be able to explain probability distributions
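
Modules 16 and 17 build Bayes' theorem from the sum and product rules and then apply it to problem scenarios. A minimal sketch of that calculation is shown here, assuming a two-outcome diagnostic test. The function name `posterior` and all the probabilities are hypothetical values chosen for illustration, not figures from the course.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem.

    prior               -- P(condition)
    likelihood          -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    """
    # Product rule gives each joint probability; sum rule adds them
    # to get the marginal probability of a positive test.
    joint_true_positive = likelihood * prior
    joint_false_positive = false_positive_rate * (1.0 - prior)
    evidence = joint_true_positive + joint_false_positive
    if evidence == 0:
        raise ValueError("a positive test has zero probability")
    return joint_true_positive / evidence


# Hypothetical numbers: 1% of people have the condition, the test detects
# 90% of true cases and wrongly flags 5% of healthy people.
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print("P(condition | positive) =", round(p, 4))
```

With these made-up inputs the posterior is 2/13, roughly 0.154: even after a positive result the condition is still unlikely, because the prior is small. That gap between prior and posterior is the main idea behind the "Concept of Posterior and Prior" item.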

 

PART 5 – SUPERVISED LEARNING (CLASSIFICATION)
19 | Practical 7 | Building a simple model for prediction using the Advertising Dataset | - | Yes
20 | Bias/Variance Trade-off | Clear explanation of model Bias and Variance. The Trade-off. Analysing the Mean Squared Error (MSE) function. Decomposition of the MSE. Model complexity vs flexibility. Analysis of the Bias/Variance trade-off graph. Review of Underfitting and Overfitting | Scholar should be able to analyse the Bias/Variance trade-off graph
21 | Quiz on Modules 1 to 20 | Quiz on Topics 1 to 20 | - | Quiz
22 | Introduction to Classification | What is classification? Review of Mean Squared Error in classification. Classification versus Regression. Error rate in classification. Typical classification problem. Cancer diagnosis example. Concept of Inference and Decision | Illustrate and apply the concept of classification
23 | Practical 8 | Building a Machine Learning model for classification using the Iris Dataset. | - | Yes
24 | The Bayes’ Classifier and How It Works | The Bayes’ Classifier and how it works | Clear understanding of Bayes’ classifier
25 | K-Nearest Neighbor Classifier | Overview of the K-Nearest Neighbors Classifier. How it works. Compare and contrast with the Bayes Classifier. Algorithm of the K-Nearest Neighbors classifier (KNN). Illustration/Animation | Gain knowledge of other classifiers (KNN)
26 | Classification Rate Minimization, ROC and AUC | Misclassification in Bayes’ classifier. How to minimize misclassification in Bayes’ classifier. Decision Boundaries. Review of Rules of Probability. Concept of False Positives and False Negatives. Receiver Operating Characteristic (ROC) Curve and Area Under the Curve (AUC) | Learn how to improve model performance. Understand ROC and AUC
27 | Practical 9 | Enterprise Data Analysis Attempt (stock market data, tweets, etc.) | - | Yes
28 | Introduction to Multiple Linear Regression | Examining the Advertising dataset. Importing into Jupyter Notebook. Determination of how each predictor variable affects the response variable. Calculating regression coefficients. How multiple linear regression works. Some Python practicals | - | Yes
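
Module 25 covers the algorithm of the K-Nearest Neighbors classifier. As a minimal sketch of how it works, assuming Euclidean distance and a simple majority vote, the idea fits in a few lines of standard-library Python (3.8 or later for `math.dist`). The function name `knn_predict` and the toy points are hypothetical; the course practicals may well use a library implementation instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training points, using Euclidean distance."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training point's distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent label among the k neighbours wins the vote.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data invented for this sketch: two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, query=(2, 2), k=3))
print(knn_predict(points, labels, query=(7, 7), k=3))
```

There is no training step here: KNN stores the data and does all of its work at prediction time, which is one point of contrast with the Bayes classifier. The choice of k controls how smooth the decision boundary is.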

 

PART 6 – UNSUPERVISED LEARNING
29 | Introduction to Principal Components Analysis | Concept of unlabeled data. Approaches to dimensionality reduction. Theoretical overview of PCA. Review of matrix operations (multiplication). Determination of the number of components. Scree Plot. Basics of Singular Value Decomposition (SVD) | Be able to explain the concept of dimensionality reduction and perform PCA
30 | Practical 10 | Principal Components Analysis of the Wine Dataset | - | Yes
31 | Introduction to Factor Analysis | Approaches to Factor Analysis. Test for Sampling Adequacy. KMO Statistics. Scores and Loadings. Interpreting Factor Analysis results | Gain good knowledge of Factor Analysis
32 | Practical 11 | Factor Analysis in Python/R | - | Yes
33 | Basics of Logistic Regression | Review of classes of ML problems. Basics of Logistic Regression. Probability Distribution. The Logistic Function. Odds and Odds Ratio. Explanation. The log-odds | Be able to differentiate between step function and sigmoid function.
34 | Introduction to Artificial Neural Networks | Background of Artificial Neural Networks. Review of Calculus (differentiation and partial derivatives). The Sigmoid Neuron. Concept of weights and biases. Network activation and activation function. Hidden layers. The Perceptron. Multilayer Perceptron. Network training. Backpropagation and Gradient Descent. Network training algorithms. | Gain in-depth knowledge of Neural Networks.
35 | Practical 12 | Solving a Neural Network by hand | - | Yes
36 | Introduction to Clustering and Cluster Analysis | Review of KNN classifier. Types of Clustering: Hierarchical and agglomerative. K-Means clustering. K-nearest neighbor. The algorithms and the tradeoffs.
37 | Practical 13 | Clustering demo in Python. Building a dendrogram | - | Yes
38 | Practical 14 | Non-linear modelling using the Wage dataset | - | Yes
39 | Introduction to Tree-based Modelling | Background of decision trees. Decision trees for classification and regression. Pruning. Assumptions. Splitting | Scholar should be able to build a decision tree for classification
40 | Review of all topics
41 | Next Steps
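
Module 33 aims at telling the step function apart from the sigmoid (logistic) function, and at relating probabilities to odds and log-odds. The sketch below illustrates those ideas with the standard library only. The function names `sigmoid`, `log_odds` and `step` are chosen for this example and are not taken from the course material.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z to a probability in (0, 1)."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_odds(p):
    """Log of the odds p / (1 - p); the inverse of the sigmoid."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def step(z):
    """Hard threshold: 0 below zero, 1 at or above zero."""
    return 1 if z >= 0 else 0


for z in (-4.0, -0.5, 0.0, 0.5, 4.0):
    p = sigmoid(z)
    print(f"z={z:+.1f}  step={step(z)}  sigmoid={p:.4f}  log-odds={log_odds(p):+.4f}")
```

The step function jumps straight from 0 to 1, while the sigmoid changes smoothly, which is what makes it differentiable and usable with gradient descent in the neural network modules. Applying `log_odds` to the sigmoid output recovers z, so logistic regression can be read as a linear model of the log-odds.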
kindsonthegenius

Kindson Munonye is currently completing his doctoral program in Software Engineering at the Budapest University of Technology and Economics.
