6 Months

- Statistics
- Data visualization in python
- EDA
- Regression
- Supervised Machine Learning
- Unsupervised Machine Learning
- Ensemble Techniques
- Association Rule
- Recommendation system
- Artificial Neural Network
- CNN
- Time Series Analysis
- Trend and seasonality
- Decomposition
- Smoothing
- SES, Holt & Holt-Winter Model
- AR, Lag Series, ACF, PACF
- Sequences and Prediction
- Deep Neural Networks for Time Series
- Recurrent Neural Networks for Time Series

- Stationarity and Time Series Smoothing
- ARMA and ARIMA Models
- Survival Analysis Forecasts
- Assignments for assessment
- Projects
- Internship

Course Outline

**Statistical Foundations**

In this module, you will learn the statistical methods used for decision making throughout this Data Science course.

- **Probability distribution –** Binomial, Poisson, and Normal distributions in Python.
- **Bayes’ theorem –** Bayes’ theorem is a mathematical formula, named after Thomas Bayes, for computing conditional probability. Conditional probability is the probability of an outcome occurring given that another outcome has already occurred.
- **Central limit theorem –** This module will teach you how the Central Limit Theorem (CLT) lets you approximate the distribution of a sample mean with a normal distribution.
- **Hypothesis testing –** This module will teach you about hypothesis testing in statistics, including the one-sample t-test, ANOVA, and the chi-square test.
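As a small illustration of the conditional-probability idea behind Bayes’ theorem, the sketch below computes the probability of having a condition given a positive test. The function name and all the numbers (prevalence, sensitivity, false-positive rate) are hypothetical and chosen only for the example.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(condition | positive test) using Bayes' theorem."""
    # Total probability of a positive test: true positives plus false positives.
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / p_positive


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(condition | positive) = {p:.3f}")  # prints 0.161
```

Even with a fairly accurate test, the posterior stays low because the condition is rare, which is the kind of result conditional probability makes explicit.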

**Exploratory Data Analysis (EDA)**

This module of the 6-month Data Science course covers Exploratory Data Analysis with Pandas, Seaborn, and Matplotlib, along with summary statistics.

- **Pandas –** Pandas is one of the most widely used Python libraries for analyzing and manipulating data. This module will give you a deep understanding of exploring data sets using Pandas.
- **Summary statistics (mean, median, mode, variance, standard deviation) –** In this module, you will learn about various statistical formulas and implement them using Python.
- **Seaborn –** Seaborn is a widely used, Matplotlib-based data visualization library in Python. This module will give you a deep understanding of exploring data sets using Seaborn.
- **Matplotlib –** Matplotlib is another widely used Python library for creating static, animated, and interactive visualizations. This module will give you a deep understanding of exploring data sets using Matplotlib.
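The summary statistics listed above can be sketched with Python's standard-library `statistics` module. It is used here only so the example runs without extra installs; the course itself works with Pandas. The `orders` sample is hypothetical.

```python
import statistics

# Hypothetical sample: daily order counts for one week.
orders = [12, 15, 15, 18, 20, 22, 24]

summary = {
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "mode": statistics.mode(orders),
    "variance": statistics.variance(orders),  # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(orders),      # sample standard deviation
}

for name, value in summary.items():
    print(f"{name:>8}: {value:.2f}")
```

Note that `variance` and `stdev` are the sample versions; `statistics.pvariance` and `statistics.pstdev` give the population versions.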

**Regression – Linear Regression**

This module will get you comfortable with the techniques used in Linear and Logistic Regression.

- **Multiple linear regression –** Multiple linear regression predicts one dependent variable from several independent variables.
- **Fitted regression lines –** A fitted regression line is the estimated regression equation drawn over a plot of your data.
- **AIC, BIC, Model Fitting, Training and Test Data –** In this module, you will learn how to fit models, compare them with selection criteria such as AIC and BIC, and evaluate them by splitting data into training and test sets.
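The following is a minimal sketch of fitting a regression line on training data, scoring it on held-out test data, and computing AIC and BIC. It assumes a single predictor (simple rather than multiple regression) to keep the arithmetic visible, and it uses the Gaussian-likelihood forms of AIC and BIC up to an additive constant. The data and helper names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rss(xs, ys, intercept, slope):
    """Residual sum of squares of the fitted line on (xs, ys)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: roughly y = 2x + 1 with small deviations.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 16.9]

# Hold out the last two points as test data.
train_x, test_x = xs[:6], xs[6:]
train_y, test_y = ys[:6], ys[6:]

intercept, slope = fit_line(train_x, train_y)

n, k = len(train_x), 2  # k = number of fitted parameters (intercept, slope)
train_rss = rss(train_x, train_y, intercept, slope)
aic = n * math.log(train_rss / n) + 2 * k
bic = n * math.log(train_rss / n) + k * math.log(n)

test_mse = rss(test_x, test_y, intercept, slope) / len(test_x)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"AIC={aic:.2f} BIC={bic:.2f} test MSE={test_mse:.4f}")
```

Lower AIC or BIC indicates a better trade-off between fit and model size when comparing candidate models fitted to the same data.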

**Regression – Logistic Regression**

- **Introduction to Logistic regression, interpretation, odds ratio –** Logistic regression is a simple classification algorithm that predicts a categorical dependent variable from independent variables.
- **Misclassification, Probability, AUC, R-Square –** This module will teach you how to work with misclassification, probability, AUC, and R-Square.
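To make misclassification and AUC concrete, here is a small sketch that scores a set of predicted probabilities against true labels. The labels and probabilities are hypothetical and stand in for the output of some fitted classifier; AUC is computed by its pairwise definition rather than with a library call.

```python
def misclassification_rate(y_true, y_prob, threshold=0.5):
    """Share of observations whose thresholded prediction is wrong."""
    wrong = sum((p >= threshold) != bool(y) for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


def auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    # A tie between a positive and a negative counts as half a win.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9]

error_rate = misclassification_rate(y_true, y_prob)
area = auc(y_true, y_prob)
print(f"misclassification rate = {error_rate:.3f}")
print(f"AUC = {area:.3f}")
```

The misclassification rate depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds.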

**Supervised Machine Learning**

In the next module, you will learn all the Supervised Learning techniques used in Machine Learning.

- **CART –** CART (Classification and Regression Trees) is a predictive machine learning model that describes how an outcome variable’s value can be predicted from the values of other variables.
- **KNN –** KNN (k-nearest neighbours) is one of the most straightforward machine learning algorithms for solving regression and classification problems.
- **Decision Trees –** A Decision Tree is a Supervised Machine Learning algorithm used for both classification and regression problems. It is a hierarchical structure where internal nodes represent the dataset features, branches represent the decision rules, and each leaf node represents the result.
- **Naive Bayes –** The Naive Bayes algorithm is used to solve classification problems using Bayes’ theorem.
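As an example of how simple KNN classification is, the sketch below labels a query point by majority vote among its nearest training points. The data, labels, and function name are hypothetical, and Euclidean distance is an assumed choice of distance measure.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D data: two well-separated groups.
points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.8), (4.9, 5.0)]
labels = ["low", "low", "low", "high", "high", "high"]

near_low = knn_predict(points, labels, (1.1, 1.0))
near_high = knn_predict(points, labels, (4.8, 5.1))
print(near_low, near_high)  # prints: low high
```

There is no training step beyond storing the data; all the work happens at prediction time.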

**Unsupervised Learning**

In the next module, you will learn all the Unsupervised Learning techniques used in Machine Learning.

- **Clustering – K-Means & Hierarchical –** Clustering is an unsupervised learning technique that groups similar data points together. In this module, you will learn about the method and its types, such as K-means clustering and hierarchical clustering.
- **Distance methods –** This module will teach you how to work with distance measures such as Euclidean, Manhattan, and Cosine.
- **Features of a Cluster – Labels, Centroids, Inertia –** This module will walk you through the features of a cluster: labels, centroids, and inertia.
- **Eigenvectors and Eigenvalues –** In this module, you will learn how to compute the eigenvectors and eigenvalues of a matrix.
- **Principal component analysis –** Principal Component Analysis is a technique to reduce the complexity of a model, for example by reducing the number of input variables for a predictive model to avoid overfitting.
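The three distance measures named above can be sketched in a few lines. The function names and the two example points are hypothetical; cosine distance is taken here as one minus cosine similarity, which is one common convention.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


a, b = (1.0, 0.0), (4.0, 4.0)
print(f"Euclidean: {euclidean(a, b):.3f}")        # 5.000
print(f"Manhattan: {manhattan(a, b):.3f}")        # 7.000
print(f"Cosine:    {cosine_distance(a, b):.3f}")  # 0.293
```

Euclidean and Manhattan distance depend on magnitude, while cosine distance depends only on direction, so the right choice depends on the data.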

**Ensemble Techniques**

In this Machine Learning module, we discuss the shortcomings of standalone supervised models and learn a few techniques, such as Ensemble techniques, to overcome these shortcomings.

- **Bagging & Boosting –** Bagging is a meta-algorithm in machine learning used for enhancing the stability and accuracy of machine learning algorithms used in statistical classification and regression. Boosting is a meta-algorithm in machine learning that combines several weak classifiers into a strong classifier.
- **Random Forest –** Random Forest builds several decision trees on different subsets of the provided dataset, then averages their predictions (or takes a majority vote for classification) to improve predictive accuracy.
- **AdaBoost & Gradient boosting –** Boosting can be further classified into Gradient boosting and AdaBoost (Adaptive boosting). This module will teach you about both.
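The bagging idea can be sketched as follows: fit one weak learner per bootstrap sample, then combine them by majority vote. The weak learner here is a one-feature threshold rule (a decision stump), which is an assumption made to keep the example short; it is not a full Random Forest. All names and data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Weak learner: the threshold that minimizes training errors,
    predicting 1 for x > threshold and 0 otherwise."""
    best_threshold, best_errors = None, len(xs) + 1
    for threshold in sorted(set(xs)):
        errors = sum((x > threshold) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def bagging_predict(thresholds, x):
    """Aggregate the stumps by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Hypothetical 1-D data: class 0 for small x, class 1 for large x.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

model = bagging_fit(xs, ys)
print(bagging_predict(model, 1), bagging_predict(model, 10))
```

Each stump sees a slightly different sample, so individual thresholds vary; voting averages out that variation, which is where the stability gain comes from.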

**Association Rules Mining & Recommendation Systems**

Association rule mining is the data mining process of finding rules that describe associations, such as frequent co-occurrence, between sets of items.
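A rule such as "bread -> milk" is usually judged by its support, confidence, and lift. The sketch below computes the three measures on a handful of hypothetical market-basket transactions; the function names are illustrative.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "jam"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    both = support(antecedent | consequent)
    confidence = both / support(antecedent)
    lift = confidence / support(consequent)
    return both, confidence, lift


sup, conf, lift = rule_metrics({"bread"}, {"milk"})
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.4f}")
```

A lift below 1, as in this toy data, means the two items appear together slightly less often than independence would predict, so a high confidence alone does not make a rule interesting.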

Recommendation engines are a class of machine learning systems that generally deal with ranking or rating products or users. Loosely defined, a recommender system predicts the rating a user might give to a specific item. These predictions are then ranked and returned to the user.

**Introduction to Deep Learning – Single Layer Perceptron**

Artificial neural networks, usually simply called neural networks or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain.
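A single layer perceptron is the simplest such network: one artificial neuron with a step activation. The sketch below trains one on the logical AND function using the classic perceptron update rule. The learning rate, epoch count, and function names are assumptions for illustration.

```python
def step(z):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def predict(weights, bias, x1, x2):
    return step(weights[0] * x1 + weights[1] * x2 + bias)


def train_perceptron(samples, targets, epochs=20, lr=1.0):
    """Perceptron rule: nudge weights by lr * error * input after each sample."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in zip(samples, targets):
            error = target - predict(weights, bias, x1, x2)
            weights[0] += lr * error * x1
            weights[1] += lr * error * x2
            bias += lr * error
    return weights, bias


# Truth table for logical AND.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, targets)
outputs = [predict(weights, bias, x1, x2) for x1, x2 in samples]
print(outputs)  # prints [0, 0, 0, 1]
```

AND is linearly separable, so the rule converges; a single perceptron cannot learn XOR, which is one motivation for multi-layer networks.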

**Convolutional Neural Network**

A convolutional neural network is a feed-forward neural network that is generally used to analyze visual images by processing data with grid-like topology. It’s also known as a ConvNet. A convolutional neural network is used to detect and classify objects in an image.
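The core operation of a convolutional layer can be sketched without any library: slide a small kernel over the grid and take a weighted sum at each position. The sketch assumes no padding and a stride of 1, and the image and kernel are hypothetical.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2-D cross-correlation, as CNN layers compute it."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4x4 "image" with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A simple vertical-edge kernel (right column minus left column).
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
for row in feature_map:
    print(row)  # each row prints [0, 2, 0]
```

The feature map is non-zero only where the edge sits, which is how convolutional filters localize patterns in an image. In a real network the kernel values are learned rather than hand-written.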

**Time Series Analysis**

In this lesson, you will learn –

- **Trend and seasonality –** Trend is a systematic linear or non-linear component of a time series that changes over time and does not repeat. Seasonality is a systematic linear or non-linear component that changes over time and repeats at regular intervals.
- **Decomposition –** This module will teach you how to decompose time series data into trend and seasonality components.
- **Smoothing (moving average) –** This module will teach you how to apply moving-average smoothing to univariate data.
- **SES, Holt & Holt-Winter Model –** SES (simple exponential smoothing), Holt, and Holt-Winter are smoothing models, and this module covers each of them.
- **AR, Lag Series, ACF, PACF –** In this module, you will learn about autoregressive (AR) models, lag series, and the autocorrelation (ACF) and partial autocorrelation (PACF) functions used in Time Series.
- **ADF, Random walk and Auto Arima –** In this module, you will learn about the ADF (Augmented Dickey-Fuller) test, random walks, and Auto ARIMA techniques used in Time Series.
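Two of the smoothing methods above, the moving average and simple exponential smoothing (SES), can be sketched directly. The series, window size, and `alpha` value are hypothetical, and seeding SES with the first observation is one common convention among several.

```python
def moving_average(series, window):
    """Trailing moving average; the first window - 1 points have no value."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def simple_exp_smoothing(series, alpha):
    """SES: s_t = alpha * x_t + (1 - alpha) * s_{t-1}, seeded with s_0 = x_0."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


series = [10, 12, 11, 15, 14, 18]

ma = moving_average(series, window=3)
ses = simple_exp_smoothing(series, alpha=0.5)
print(ma)
print(ses)  # prints [10, 11.0, 11.0, 13.0, 13.5, 15.75]
```

The last SES value serves as the one-step-ahead forecast. A larger `alpha` tracks recent observations more closely, while a smaller one smooths more heavily.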

**Sequences and Prediction**

Sequence prediction is a problem that involves using historical sequence information to predict the next value or values in the sequence.
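Before a model can learn from a sequence, the series is typically cut into input windows paired with the value that follows each window. A minimal sketch, with a hypothetical helper name and toy series:

```python
def make_windows(series, window_size):
    """Split a series into (input window, next value) training pairs."""
    return [
        (series[i : i + window_size], series[i + window_size])
        for i in range(len(series) - window_size)
    ]


series = [1, 2, 3, 4, 5, 6]
pairs = make_windows(series, window_size=3)
for window, target in pairs:
    print(window, "->", target)
```

Each pair becomes one training example: the window is the historical information, and the target is the next value to predict.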

**Deep Neural Networks for Time Series**

Neural networks have been successfully used for forecasting data series. Neural Networks have the advantage that they can approximate nonlinear functions.

**Recurrent Neural Networks for Time Series**

In RNNs, the signals passing through recurrent connections constitute an effective memory for the network, which can then use the information in memory to better predict the future time series values.
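The memory effect described above can be seen in a one-unit recurrent cell, where each hidden state depends on the current input and the previous hidden state. The weights below are hypothetical placeholders; a trained network would learn them.

```python
import math


def rnn_forward(inputs, w_x, w_h, bias):
    """Run a one-unit recurrent cell over a sequence and return all hidden states."""
    hidden = 0.0
    states = []
    for x in inputs:
        # The recurrent term w_h * hidden carries information from earlier steps.
        hidden = math.tanh(w_x * x + w_h * hidden + bias)
        states.append(hidden)
    return states


# A single pulse followed by zeros: later states are non-zero only through memory.
states = rnn_forward([1.0, 0.0, 0.0], w_x=1.0, w_h=0.5, bias=0.0)
print([round(s, 4) for s in states])
```

The input is zero after the first step, yet the hidden state stays positive and decays gradually, which shows the recurrent connection retaining a trace of the earlier input.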

**Stationarity and Time Series Smoothing**

Smoothing is a technique applied to time series to remove the fine-grained variation between time steps. The hope of smoothing is to remove noise and better expose the signal of the underlying causal processes.
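On the stationarity side of this topic, a common first step for a trending series is differencing: replacing each value with its change from the previous value. The sketch below is illustrative only; the series is hypothetical, and comparing the means of the two ends of the series is a rough check, not a formal stationarity test.

```python
import statistics


def difference(series, lag=1):
    """Return the lag-differenced series: x_t - x_{t-lag}."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Hypothetical series with a clear upward trend.
series = [5, 7, 10, 12, 15, 17, 20, 22]
diffed = difference(series)

# The raw series has a drifting mean; the differenced series does not.
print(statistics.mean(series[:4]), statistics.mean(series[4:]))   # 8.5 18.5
print(statistics.mean(diffed[:3]), statistics.mean(diffed[-3:]))
```

After differencing, the level of the series no longer depends on time, which is the property that models such as ARMA rely on.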

**ARMA and ARIMA Models**

ARMA is a forecasting model that applies both autoregression (AR) and moving average (MA) methods to time-series data that is well behaved (stationary). ARIMA is a class of statistical models for analyzing and forecasting time series data; it extends ARMA with differencing so that non-stationary series can also be modeled.
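The autoregressive part of these models can be sketched with the simplest case, AR(1), where each value is a multiple of the previous one plus noise. The example below estimates that multiple by least squares and forecasts ahead. It assumes a zero-mean series with no MA or differencing terms, so it is far short of a full ARIMA fit, for which a library such as statsmodels would normally be used. Names and data are hypothetical.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + noise (zero-mean series)."""
    num = sum(prev * cur for prev, cur in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den


def forecast_ar1(last_value, phi, steps):
    """Iterate the AR(1) recursion forward with the noise term set to zero."""
    forecasts = []
    for _ in range(steps):
        last_value = phi * last_value
        forecasts.append(last_value)
    return forecasts


# Hypothetical noise-free series that halves at every step.
series = [8.0, 4.0, 2.0, 1.0, 0.5]

phi = fit_ar1(series)
forecasts = forecast_ar1(series[-1], phi, steps=2)
print(phi, forecasts)  # prints 0.5 [0.25, 0.125]
```

With real, noisy data the estimate would only approximate the true coefficient, and forecasts would carry uncertainty that grows with the horizon.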

**Survival Analysis Forecasts**

The goal of survival analysis is to predict time until an event happens and estimate the survival probability.
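One standard way to estimate survival probability is the Kaplan-Meier estimator, sketched below in plain Python. The source does not name a specific method, so treating Kaplan-Meier as the example is an assumption; the durations and censoring flags are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each time with at least one event.

    durations: time until the event or until censoring, per subject.
    observed:  1 if the event happened, 0 if the subject was censored.
    """
    survival, curve = 1.0, []
    event_times = sorted({d for d, e in zip(durations, observed) if e})
    for t in event_times:
        at_risk = sum(d >= t for d in durations)
        events = sum(d == t and bool(e) for d, e in zip(durations, observed))
        # Multiply by the share of at-risk subjects who survive past time t.
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical study: five subjects, two of them censored (observed = 0).
durations = [2, 3, 3, 5, 8]
observed = [1, 1, 0, 1, 0]

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored subjects still count as at risk until they drop out, which is what lets survival analysis use incomplete observations instead of discarding them.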

- Assignments for assessment
- Projects
- Internship