6 Months
- Statistics
- Data visualization in python
- EDA
- Regression
- Supervised Machine Learning
- Unsupervised Machine Learning
- Ensemble Techniques
- Association Rule
- Recommendation system
- Artificial Neural Network
- Artificial Intelligence Basics
- Neural Networks
- Deep Learning
- Text Mining, Cleaning, and Pre-processing
- Text classification, NLTK, sentiment analysis, etc
- Sentence Structure, Sequence Tagging, Sequence Tasks, and Language Modeling
- RBM and DBNs & Variational AutoEncoder
- Object Detection using Convolutional Neural Net
- Generating images with Neural Style and Working with Deep Generative Models
- Parallel Training, Distributed vs Parallel Computing
- Reinforcement Learning
- Data Visualization with Tableau
- Business Case Studies
- Assignments for assessment
- Projects
- Internship
Course Outline
Statistical Foundations
In this module, you will learn the statistical methods used for decision making throughout this Data Science course.
- Probability distribution – Binomial, Poisson, and Normal Distribution in Python.
- Bayes’ theorem – Bayes’ theorem is a mathematical formula, named after Thomas Bayes, for computing conditional probability. Conditional probability is the probability of an outcome given that another outcome has already occurred.
- Central limit theorem – This module will teach you how to use the Central Limit Theorem (CLT) to approximate the distribution of a sample mean with a normal distribution.
- Hypothesis testing – This module will teach you about hypothesis testing in statistics, including the one-sample t-test, ANOVA, and the chi-square test.
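As a rough illustration of two topics from the list above, the sketch below applies Bayes' theorem to a screening-test scenario and then simulates the Central Limit Theorem. The prevalence, sensitivity, and false-positive figures, the exponential population, and the function names are all illustrative assumptions rather than course material. Only the Python standard library is used.

```python
import random
import statistics


def posterior(prior, likelihood, false_positive_rate):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    # P(B) is expanded with the law of total probability.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical screening test: 1% prevalence, 95% sensitivity, 5% false positives.
p_condition_given_positive = posterior(0.01, 0.95, 0.05)


def sample_means(sample_size, trials, rng):
    """Means of repeated samples drawn from a skewed (exponential) population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(trials)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(sample_size=50, trials=2000, rng=rng)

print(f"P(condition | positive test) = {p_condition_given_positive:.3f}")
print(f"mean of sample means  = {statistics.fmean(means):.3f} (population mean is 1)")
print(f"stdev of sample means = {statistics.stdev(means):.3f} (CLT predicts 1/sqrt(50))")
```

Under these assumed numbers the posterior stays well below the test's sensitivity because the condition is rare, and the sample means cluster around the population mean even though the population itself is skewed.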
Exploratory Data Analysis (EDA)
This module will teach you Exploratory Data Analysis using Pandas, Seaborn, Matplotlib, and summary statistics.
- Pandas – Pandas is one of the most widely used Python libraries. Pandas is used to analyze and manipulate data. This module will give you a deep understanding of exploring data sets using Pandas.
- Summary statistics (mean, median, mode, variance, standard deviation) – In this module, you will learn about various statistical formulas and implement them using Python.
- Seaborn – Seaborn is also one of the most widely used Python libraries. Seaborn is a Matplotlib-based data visualization library in Python. This module will give you a deep understanding of exploring data sets using Seaborn.
- Matplotlib – Matplotlib is another widely used Python library. Matplotlib is a library for creating static, animated, and interactive visualizations. This module will give you a deep understanding of exploring data sets using Matplotlib.
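The summary statistics named above can be tried out with a minimal sketch like the following. It uses Python's built-in `statistics` module rather than Pandas so that it runs without extra packages, and the eight data values are made up for illustration.

```python
import statistics

# Hypothetical sample of eight observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
population_variance = statistics.pvariance(data)  # divides by n
sample_variance = statistics.variance(data)       # divides by n - 1
population_std = statistics.pstdev(data)          # square root of pvariance

print("mean:", mean, "median:", median, "mode:", mode)
print("population variance:", population_variance)
print("sample variance:", round(sample_variance, 3))
print("population standard deviation:", population_std)
```

The difference between the two variance functions is the divisor: `pvariance` treats the data as the whole population, while `variance` treats it as a sample.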
Regression- Linear Regression
This module will get you comfortable with the techniques used in Linear and Logistic Regression.
- Multiple linear regression – Multiple Linear Regression is used for predicting one dependent variable using various independent variables.
- Fitted regression lines – A fitted regression line is the line defined by the estimated regression equation, drawn on a graph of your data.
- AIC, BIC, Model Fitting, Training and Test Data – In this module, you will learn about model fitting, splitting data into training and test sets, and comparing models with selection criteria such as AIC and BIC.
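The ideas in this list can be sketched for the simplest case, a single predictor, using ordinary least squares in plain Python. The data points, the first-six/last-two split, and the function names are assumptions made for illustration; a real split would normally be random. The AIC and BIC values use the common Gaussian-error form with additive constants dropped, and `k=2` counts only the slope and intercept.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residual_sum_of_squares(xs, ys, slope, intercept):
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def aic_bic(rss, n, k):
    """AIC and BIC for least squares, up to an additive constant."""
    base = n * math.log(rss / n)
    return base + 2 * k, base + k * math.log(n)


# Hypothetical data that roughly follows y = 2x + 1.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9, 15.2, 16.8]

# Hold out the last two points as test data.
train_x, test_x = xs[:6], xs[6:]
train_y, test_y = ys[:6], ys[6:]

slope, intercept = fit_line(train_x, train_y)
train_rss = residual_sum_of_squares(train_x, train_y, slope, intercept)
test_rss = residual_sum_of_squares(test_x, test_y, slope, intercept)
aic, bic = aic_bic(train_rss, n=len(train_x), k=2)

print(f"fitted line: y = {intercept:.3f} + {slope:.3f} * x")
print(f"training RSS = {train_rss:.4f}, test RSS = {test_rss:.4f}")
print(f"AIC = {aic:.3f}, BIC = {bic:.3f}")
```

Lower AIC or BIC indicates a better trade-off between fit and number of parameters when comparing candidate models fitted to the same data.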
Regression- Logistic Regression
- Introduction to Logistic regression, interpretation, odds ratio – Logistic regression is a simple classification algorithm that predicts a categorical dependent variable from independent variables.
- Misclassification, Probability, AUC, R-Square – This module will teach you how to evaluate a model using Misclassification, Probability, AUC, and R-Square.
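To make the odds-ratio interpretation concrete, here is a small sketch with a hand-picked logistic model. The intercept, coefficient, inputs, and labels are hypothetical values chosen for illustration, not a fitted model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    return p / (1.0 - p)


# Hypothetical model: log-odds = INTERCEPT + COEF * x
INTERCEPT, COEF = -1.0, 0.7


def predict_proba(x):
    return sigmoid(INTERCEPT + COEF * x)


# Raising x by one unit multiplies the odds by exp(COEF).
odds_ratio = odds(predict_proba(3.0)) / odds(predict_proba(2.0))

# Misclassification rate at a 0.5 probability threshold on made-up labels.
xs = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]
labels = [0, 0, 1, 1, 1, 1]
predictions = [1 if predict_proba(x) >= 0.5 else 0 for x in xs]
errors = sum(pred != label for pred, label in zip(predictions, labels))
misclassification_rate = errors / len(labels)

print(f"odds ratio per unit of x: {odds_ratio:.4f}")
print(f"exp(coefficient):         {math.exp(COEF):.4f}")
print(f"misclassification rate:   {misclassification_rate:.3f}")
```

The first two printed values agree, which is why the exponentiated coefficient of a logistic regression is read as an odds ratio.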
Supervised Machine Learning
In the next module, you will learn all the Supervised Learning techniques used in Machine Learning.
- CART – CART (Classification and Regression Trees) is a predictive machine learning model that predicts the value of an outcome variable based on the values of other variables.
- KNN – KNN is one of the most straightforward machine learning algorithms for solving regression and classification problems.
- Decision Trees – Decision Tree is a Supervised Machine Learning algorithm used for both classification and regression problems. It is a hierarchical structure where internal nodes indicate the dataset features, branches represent the decision rules, and each leaf node indicates the result.
- Naive Bayes – The Naive Bayes algorithm is used to solve classification problems using Bayes’ theorem.
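Of the algorithms listed above, KNN is short enough to sketch in a few lines. The version below is a minimal classifier using Euclidean distance and majority voting. The training points, labels, and the `knn_predict` name are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical labelled points: (features, label).
training_data = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.5), "B"), ((7.0, 6.0), "B"), ((6.5, 7.5), "B"),
]


def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k nearest neighbours."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict(training_data, (1.2, 1.4)))
print(knn_predict(training_data, (6.4, 6.8)))
```

An odd `k` avoids tied votes in two-class problems. For regression, the vote would be replaced by an average of the neighbours' values.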
Unsupervised Learning
In the next module, you will learn all the Unsupervised Learning techniques used in Machine Learning.
- Clustering – K-Means & Hierarchical – Clustering is an unsupervised learning technique involving the grouping of data. In this module, you will learn everything you need to know about the method and its types, like K-means clustering and hierarchical clustering.
- Distance methods – This module will teach you how to work with distance measures such as Euclidean, Manhattan, and Cosine.
- Features of a Cluster – Labels, Centroids, Inertia – This module will walk you through the features of a cluster: Labels, Centroids, and Inertia.
- Eigenvectors and Eigenvalues – In this module, you will learn how to compute the eigenvectors and eigenvalues of a matrix.
- Principal component analysis – Principal Component Analysis is a technique for reducing the complexity of a model, for example by reducing the number of input variables for a predictive model to avoid overfitting.
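The distance measures and cluster features listed above can be sketched together in plain Python. The block defines the three distances and a bare-bones K-means loop that returns labels, centroids, and inertia. The points, starting centroids, and function names are hypothetical, and the fixed iteration count is a simplification of the usual convergence check.

```python
import math


def euclidean(a, b):
    return math.dist(a, b)


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return 1 - dot / (math.hypot(*a) * math.hypot(*b))


def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda i: euclidean(p, centroids[i]))
        for p in points
    ]


def kmeans(points, centroids, iterations=10):
    centroids = list(centroids)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    labels = assign(points, centroids)
    # Inertia: sum of squared distances from each point to its centroid.
    inertia = sum(euclidean(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Hypothetical points forming two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centroids, inertia = kmeans(points, centroids=[(1, 1), (8, 8)])

print("labels:", labels)
print("centroids:", centroids)
print("inertia:", round(inertia, 4))
```

Lower inertia means tighter clusters, but it always decreases as the number of clusters grows, so it is usually compared across several values of K rather than minimized outright.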
Ensemble Techniques
In this Machine Learning module, we discuss the shortcomings of standalone supervised models and learn techniques, such as ensemble techniques, to overcome them.
- Bagging & Boosting – Bagging is a meta-algorithm in machine learning used for enhancing the stability and accuracy of machine learning algorithms, which are used in statistical classification and regression. Boosting is a meta-algorithm in machine learning that builds a strong classifier from several weak classifiers.
- Random Forest – A Random Forest comprises several decision trees built on several subsets of the provided dataset. It then averages their outputs to improve predictive accuracy.
- AdaBoost & Gradient boosting – Boosting can be further classified into Gradient boosting and AdaBoost (Adaptive Boosting). This module will teach you about both.
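A minimal sketch of the bagging idea follows: each model is trained on a bootstrap sample (drawn with replacement), and predictions are combined by majority vote. The weak learner here is a one-feature threshold "stump", which is a simplifying assumption, as are the noisy data points and all function names.

```python
import random
from collections import Counter


def fit_stump(sample):
    """Choose the threshold t that misclassifies the fewest points (predict 1 if x > t)."""
    best_threshold, best_errors = None, None
    for t in sorted({x for x, _ in sample}):
        errors = sum((x > t) != bool(y) for x, y in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def bagging_fit(data, n_models, rng):
    """Fit one stump per bootstrap sample of the data."""
    return [
        fit_stump([rng.choice(data) for _ in data])
        for _ in range(n_models)
    ]


def bagging_predict(thresholds, x):
    """Majority vote across all stumps."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


# Hypothetical noisy data: label is mostly 1 when x > 5.
data = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0),
        (6, 1), (7, 1), (8, 0), (9, 1), (10, 1)]

rng = random.Random(0)  # fixed seed so the run is repeatable
models = bagging_fit(data, n_models=25, rng=rng)

print("thresholds:", models)
print("prediction for x=0: ", bagging_predict(models, 0))
print("prediction for x=11:", bagging_predict(models, 11))
```

Because each stump sees a slightly different sample, individual thresholds vary, and the vote smooths out that variation. Using an odd number of models avoids tied votes.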
Association Rules Mining & Recommendation Systems
Association rule mining is the data mining process of finding the rules that may govern associations and causal structures between sets of items.
Recommendation engines are a subclass of machine learning systems that generally deal with ranking or rating products or users. Loosely defined, a recommender system predicts the ratings a user might give to a specific item. These predictions are then ranked and returned to the user.
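Association rules are usually scored with support, confidence, and lift. The sketch below computes these three measures for one rule over a handful of made-up shopping baskets. The transactions and function names are illustrative assumptions.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline support."""
    return confidence(antecedent, consequent) / support(consequent)


rule = ({"bread"}, {"milk"})
print("support:   ", support(rule[0] | rule[1]))
print("confidence:", confidence(*rule))
print("lift:      ", lift(*rule))
```

A lift above 1 suggests the items appear together more often than independence would predict, and a lift below 1 suggests the opposite.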
· Deep Learning Using TensorFlow
Artificial Intelligence Basics
Introduction to the Keras API and TensorFlow
Neural Networks
Neural networks
Multi-layered Neural Networks
Artificial Neural Networks
Deep Learning
Deep neural networks
Convolutional Neural Networks
Recurrent Neural Networks
GPU in deep learning
Autoencoders, Restricted Boltzmann Machine
· Natural Language Processing
Text Mining, Cleaning, and Pre-processing
Various Tokenizers, Tokenization, Frequency Distribution, Stemming, POS Tagging, Lemmatization, Bigrams, Trigrams & N-grams, Entity Recognition.
Text classification, NLTK, sentiment analysis, etc
Overview of Machine Learning, Words, Term Frequency, CountVectorizer, Inverse Document Frequency, Text conversion, Confusion Matrix, Naive Bayes Classifier.
Sentence Structure, Sequence Tagging, Sequence Tasks, and Language Modeling
Language Modeling, Sequence Tagging, Sequence Tasks, Predicting Sequence of Tags, Syntax Trees, Context-Free Grammars, Chunking, Automatic Paraphrasing of Texts, Chinking.
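Term frequency and inverse document frequency, listed under text classification above, can be sketched without any NLP library. The three documents are made up, tokenization is a plain whitespace split, and the unsmoothed `log(N / df)` form of IDF is an assumption. Library implementations often use smoothed variants, so their numbers will differ.

```python
import math
from collections import Counter

# Hypothetical mini-corpus.
documents = [
    "the movie was great",
    "the movie was terrible",
    "great acting and great plot",
]
corpus = [doc.split() for doc in documents]  # naive whitespace tokenization


def term_frequency(tokens):
    """Share of the document taken up by each term."""
    counts = Counter(tokens)
    return {term: count / len(tokens) for term, count in counts.items()}


def inverse_document_frequency(corpus):
    """log(N / df): terms found in fewer documents get larger weights."""
    n_docs = len(corpus)
    doc_freq = Counter(term for tokens in corpus for term in set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(corpus):
    idf = inverse_document_frequency(corpus)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(tokens).items()}
        for tokens in corpus
    ]


scores = tf_idf(corpus)
for doc, weights in zip(documents, scores):
    top_term = max(weights, key=weights.get)
    print(f"{doc!r}: highest-weighted term is {top_term!r}")
```

Words that occur in most documents, such as "the", receive low weights, while words that single out one document receive high weights.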
· Computer Vision
RBM and DBNs & Variational AutoEncoder
Introduction to RBM and autoencoders
Deploying RBM for deep neural networks, using RBM for collaborative filtering
Features and applications of autoencoders.
Object Detection using Convolutional Neural Net
Constructing a convolutional neural network using TensorFlow
Convolutional, dense, and pooling layers of CNNs
Filtering images based on user queries
Generating images with Neural Style and Working with Deep Generative Models
Automated conversation bots leveraging generative models and the sequence-to-sequence model (LSTM).
Distributed & Parallel Computing for Deep Learning Models
Parallel Training, Distributed vs Parallel Computing
Distributed computing in TensorFlow, Introduction to tf.distribute
Distributed training across multiple CPUs, Distributed Training
Distributed training across multiple GPUs, Federated Learning
Parallel computing in TensorFlow
Reinforcement Learning
Mapping the human mind with deep neural networks (DNNs)
Several building blocks of artificial neural networks (ANNs)
The architecture of DNNs and their building blocks
Reinforcement learning in DNNs: concepts, various parameters, layers, optimization algorithms, and activation functions.
· Data Visualization with Tableau
Data Preparation using Tableau Prep
Data Connection with Tableau Desktop
Basic Visual Analytics
Calculations in Tableau
Advanced Visual Analytics
Level Of Detail (LOD) Expressions in Tableau
Geographic Visualizations in Tableau
Advanced Charts in Tableau
Dashboards and Stories
· Business Case Studies
- Collect satellite images to segment the buildings in them.
- Analyse data on soil conditions, including moisture level, temperature, and chemical makeup, to predict crops.
- Assignments for assessment
- Projects
- Internship