Data Science with Lean Six Sigma Black Belt

3 Months

● Python for Data Science
● Data Analytics using R
● Statistics and Mathematics for Machine Learning
● Machine Learning in Python
● Supervised Learning
● Unsupervised Learning
● Data Mining
● Association Rules
● Recommendation Engines
● Mature Learning Process
● Six Sigma Overview
● Project Definition
● Project Scoping Tools
● Minitab Intro
● Basic Statistics
● Rolled Throughput Yield
● Process Mapping
● Intro to Lean & Value Stream Mapping
● Process C&E
● Capability Analysis
● Process FMEA
● Graphical Data Analysis
● Correlation and Regression
● Central Limit Theorem
● Confidence Intervals
● Hypothesis Testing
● Sample Size Selection
● One-way ANOVA
● Project Planning & Deliverables
● Introduction to DOE
● Full Factorial
● 2K Factorials
● DOE Sample Size Selection
● Fractional Factorials
● Catapult Exercise
● Lean Tools for Improvement
● DOE Review
● Multiple Regression
● Logistic Regression
● Survey Design & Analysis
● Intro to Control Methods
● Intro to SPC
● Process Control Plans
● Project Planning & Deliverables

Course Outline

Introduction

  • Python for Data Science

Introduction to Python

Python installation & configuration

Python Features

Basic Python Syntax with implementation

Statements, Indentation, and Comments

  • Data Analytics using R 

Introduction to R

RStudio installation & configuration

Basic R Syntax

Basic visualization and data analysis

  • Statistics and Mathematics for Machine Learning

Statistical Inference

Descriptive Statistics

Introduction to Probability, Conditional probability, Bayes theorem

Probability Distribution

Introduction to inferential statistics

Normality, Normal Distribution

Measures of Central Tendencies

Hypothesis Testing

Data visualization using Python

  • Machine Learning in Python

Machine Learning introduction

Machine Learning applications & use-cases

Machine Learning Flow

Machine Learning categories

Exploratory data analysis

Data cleaning and Imputation Techniques 

Linear regression

Gradient descent

Model evaluation

  • Supervised Learning 

What is Supervised Learning?

Logistic Regression in Python

Classification & implementations

Decision Tree

Different algorithms for Decision Tree Induction

How to create a Perfect Decision Tree

Confusion Matrix

Random Forest

Tree-based Ensemble

Hyper-parameter tuning

Evaluating model output

Naive Bayes Classifier

Support Vector Machine

  • Unsupervised Learning

What is Unsupervised Learning?

Clustering

K-means Clustering

Hierarchical Clustering

  • Data Mining
  • Association Rules
  • Recommendation Engines
  • Mature Learning Process
  • Six Sigma Overview
  • Project Definition
  • Project Scoping Tools
  • Minitab Intro
  • Basic Statistics
  • Rolled Throughput Yield
  • Process Mapping
  • Intro to Lean & Value Stream Mapping
  • Process C&E
  • Capability Analysis
  • Process FMEA
  • Graphical Data Analysis
  • Correlation and Regression
  • Central Limit Theorem
  • Confidence Intervals
  • Hypothesis Testing
  • Sample Size Selection
  • One-way ANOVA
  • Project Planning & Deliverables
  • Introduction to DOE
  • Full Factorial
  • 2K Factorials
  • DOE Sample Size Selection
  • Fractional Factorials
  • Catapult Exercise
  • Lean Tools for Improvement
  • DOE Review
  • Multiple Regression
  • Logistic Regression
  • Survey Design & Analysis
  • Intro to Control Methods
  • Intro to SPC
  • Process Control Plans
  • Project Planning & Deliverables