391B Orchard Road #23-01 Ngee Ann City Tower B, Singapore 238874
+ 65 66381203

Certificate Associate in Data Science - Machine Learning Basics


CADS Machine Learning Basics

Machine Learning is a name that is gaining popularity as an umbrella for methods that have been studied and developed for many decades in different scientific communities and under different names, such as Statistical Learning, Statistical Signal Processing, Pattern Recognition, Adaptive Signal Processing, Image Processing and Analysis, System Identification and Control, Data Mining and Information Retrieval, Computer Vision, and Computational Learning. The name “Machine Learning” indicates what all these disciplines have in common, that is, to learn from data, and then make predictions. What one tries to learn from data is their underlying structure and regularities, via the development of a model, which can then be used to provide predictions.
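The "learn from data, then predict" idea can be illustrated with a minimal sketch. It assumes the simplest possible model, a straight line fitted by ordinary least squares, and uses made-up data; the function names `fit_line` and `predict` are hypothetical and not part of the course material.

```python
# Minimal sketch: learn a linear model from data, then use it to predict.
# The data and the function names are illustrative assumptions.

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data generated from y = 2x + 1 (no noise)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)       # the "learning" step
print(model)                   # learned (slope, intercept)
print(predict(model, 10.0))    # prediction for an input not in the data
```

The model recovers the regularity hidden in the data (the slope and intercept) and reuses it for inputs it has never seen, which is the pattern shared by the disciplines listed above.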

The goal of this course is to approach the machine learning discipline in a unifying context, by presenting the major paths and approaches that have been followed over the years, without giving preference to a specific one.

This course is an introduction to the world of machine learning, a topic that is becoming more and more important, not only for IT professionals and analysts but also for all those scientists and engineers who want to exploit the enormous power of techniques such as predictive analysis, classification, clustering and natural language processing.

Learning Objectives

After completing this course, you should be able to:

  • Apply mathematical concepts regarding the most common machine learning problems, including the concept of learnability and some elements of information theory.
  • Explain the process of Machine Learning.
  • Describe the most important techniques used to preprocess a dataset, select the most informative features, and reduce the original dimensionality.
  • Describe the structure of a continuous linear model, focusing on the linear regression algorithm. Explain Ridge, Lasso, and ElasticNet regularization, and other advanced techniques.
  • Describe the concept of linear classification, focusing on logistic regression and stochastic gradient descent algorithms.
  • Demonstrate knowledge of evaluation metrics.

Who should attend

Data Analysts, Data Engineers, Data Science Enthusiasts, Business Analysts, Project Managers

Prerequisite

Foundational certificate in Big Data/Data Science

This course is meant for anyone who is comfortable developing applications in Python and now wants to enter the world of data science or build intelligent applications. Aspiring data scientists with some understanding of the Python programming language will also find this course very helpful. If you want to build efficient data science applications and bring them into an enterprise environment without changing your existing Python stack, this course is for you.

Delivery Method

A mix of instructor-led, case-study-driven, and hands-on sessions for select phases

Hardware and Software Requirements

Python, pandas, NumPy, and a system with at least 2 GB of RAM running Windows, Ubuntu, or Mac OS X

Duration

24 Hours (2 days Instructor led + 8 hours online learning)

Enroll Now
  • Course Name: Certificate Associate in Data Science – Machine Learning Basics
  • Location: Singapore
  • Duration: 2 days classroom + 8 hours online
  • Exam Time: 60 minutes
  • Course Price: Call for price
  • Minimum requirements: Foundational Certificate in Programming


Course contents

Day 1

1. Introduction to Machine Learning
Method of delivery: Instructor Led

  • Introduction – classic and adaptive machines
  • Types of learning
  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning
  • Beyond machine learning – deep learning and bio-inspired adaptive systems
  • Machine learning and big data

2. Important Elements in Machine Learning
Method of delivery: Instructor Led

  • Data formats
  • Multiclass strategies
  • One-vs-all
  • One-vs-one
  • Learnability
  • Underfitting and overfitting
  • Error measures
  • PAC learning
  • Statistical learning approaches
  • MAP learning
  • Maximum-likelihood learning

3. Feature Selection and Feature Engineering
Method of delivery: Instructor Led, Case Study, Hands-on session

  • scikit-learn toy datasets
  • Creating training and test sets
  • Managing categorical data
  • Managing missing features
  • Data scaling and normalization
  • Feature selection and filtering
  • Principal component analysis
  • Non-negative matrix factorization
  • Sparse PCA
  • Kernel PCA

4. Linear Regression
Method of delivery: Instructor Led, Case Study, Hands-on session

  • Linear models
  • A bidimensional example
  • Linear regression with scikit-learn and higher dimensionality
  • Regressor analytic expression
  • Ridge, Lasso, and ElasticNet
  • Robust regression with random sample consensus
  • Polynomial regression
  • Isotonic regression

Day 2

5. Logistic Regression
Method of delivery: Instructor Led

  • Linear classification
  • Logistic regression
  • Implementation and optimizations
  • Stochastic gradient descent algorithms
  • Finding the optimal hyperparameters through grid search
  • Classification metrics
  • ROC curve

6. Naive Bayes
Method of delivery: Instructor Led

  • Bayes’ theorem
  • Naive Bayes classifiers
  • Naive Bayes in scikit-learn
  • Bernoulli naive Bayes
  • Multinomial naive Bayes
  • Gaussian naive Bayes

7. Evaluation methods based on the ground truth
Method of delivery: Hands-on session

  • Homogeneity
  • Completeness
  • Adjusted Rand index

8. Case Study
Method of delivery: Hands-on session

9. Case Project
Method of delivery: Hands-on session

10. Assignment
Method of delivery: Online, self-paced
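As an illustration of the ground-truth evaluation methods listed under topic 7, the following is a minimal sketch of the adjusted Rand index using only the Python standard library. The function name and the sample labels are assumptions made for this example. It is not the course's own code, and in practice a library implementation would normally be used instead.

```python
# Sketch of the adjusted Rand index (ARI) for comparing a clustering
# against ground-truth labels. Names and sample data are hypothetical.
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Return the ARI: 1.0 for identical groupings, about 0 for random ones."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: nothing to disagree on

    # Contingency counts: how many samples share each (true, predicted) pair
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs   # value expected by chance
    max_index = (sum_rows + sum_cols) / 2          # best achievable value
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both put everything in one cluster
    return (sum_cells - expected) / (max_index - expected)


truth = [0, 0, 1, 1]
same_groups_renamed = [1, 1, 0, 0]   # identical grouping, different label names
crossed_groups = [0, 1, 0, 1]        # every true cluster is split in half

print(adjusted_rand_index(truth, same_groups_renamed))  # 1.0
print(adjusted_rand_index(truth, crossed_groups))
```

The index depends only on which samples are grouped together, not on the label names, which is why the renamed clustering still scores 1.0.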

Certification

  • Certificate Title: Certificate Associate in Data Science – Machine Learning Basics
  • Certificate Awarding Body: ITPACS

About ITPACS

Information Technology Professional Accreditations and Certifications Society (ITPACS) is a non-profit organization focused on improving technology skills for the future. ITPACS offers associate-level, professional-level, and leader certifications across six domains: data science, web development, mobile development, cyber security, IoT, and blockchain. Applicants have to go through an exam eligibility process demonstrating their experience.

Certification Roadmap

CADS Machine Learning Basics

Eligibility

The Associate certification caters to individuals with less than one year of working experience in the field. It is ideal for newcomers starting out in the profession or those seeking to enter it. Applicants are required to complete the application process before taking the exam.


Exam

  • Exam Format: Closed-book
  • Questions: 30 multiple-choice questions, coding exercises
  • Passing Score: 65%
  • Exam Duration: 60 minutes
  • Proctored
  • The exam must be taken within 12 months of the exam voucher issue date

ITPACS Certification Training Road Map

Data Science

Data science is not a single science as much as it is a collection of various scientific disciplines integrated for the purpose of analyzing data. These disciplines draw on various statistical and mathematical techniques and include:

  • Computer science
  • Data engineering
  • Visualization
  • Domain-specific knowledge and approaches

With the advent of cheaper storage technology, more and more data has been collected and stored, permitting processing and analysis that were previously unfeasible. With this analysis came the need for various techniques to make sense of the data. These large sets of data, when used to identify trends and patterns, became known as big data.

Analyzing big data is not simple, and the work evolved into a specialization whose practitioners became known as data scientists. Drawing upon a myriad of technologies and expertise, they are able to analyze data to solve problems that previously were either not envisioned or were too difficult to solve.

The various data science techniques that we will illustrate have been used to solve a variety of problems. Many of these techniques are motivated by economic gain, but they have also been used to solve many pressing social and environmental problems. Problem domains where these techniques have been used include finance, optimizing business processes, understanding customer needs, performing DNA analysis, foiling terrorist plots, and finding relationships between transactions to detect fraud, among many other data-intensive problems.

Data mining is a popular application area for data science. In this activity, large quantities of data are processed and analyzed to glean information about the dataset, to provide meaningful insights, and to develop meaningful conclusions and predictions. It has been used to analyze customer behavior, to detect relationships between what may appear to be unrelated events, and to make predictions about future behavior.

Machine learning is an important aspect of data science. This technique allows the computer to solve various problems without needing to be explicitly programmed. It has been used in self-driving cars, speech recognition, and web search. In data mining, the data is extracted and processed. With machine learning, computers use the data to take some sort of action.
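To make "without needing to be explicitly programmed" concrete, here is a small sketch in which the decision rule is derived from labelled examples rather than written by hand. It assumes a single numeric feature and a very simple learner (the midpoint between the two class means); the transaction amounts, labels, and function names are all hypothetical.

```python
# Sketch: a decision rule learned from labelled data instead of hand-coded.
# All values and names below are illustrative assumptions.

def learn_threshold(values, labels):
    """Learn a 1-D decision threshold as the midpoint between class means."""
    positives = [v for v, y in zip(values, labels) if y == 1]
    negatives = [v for v, y in zip(values, labels) if y == 0]
    mean_pos = sum(positives) / len(positives)
    mean_neg = sum(negatives) / len(negatives)
    return (mean_pos + mean_neg) / 2


def classify(threshold, value):
    """Flag a value as 1 if it lies above the learned threshold.

    Assumes the positive class has the larger values, as in the data below.
    """
    return 1 if value > threshold else 0


# Hypothetical transaction amounts, labelled 1 = flagged, 0 = normal
amounts = [12.0, 18.0, 25.0, 30.0, 410.0, 520.0, 640.0]
flags = [0, 0, 0, 0, 1, 1, 1]

threshold = learn_threshold(amounts, flags)  # derived from data, not hard-coded
print(threshold)
print(classify(threshold, 50.0))    # 0: treated as normal
print(classify(threshold, 500.0))   # 1: flagged for review
```

If the labelled data changes, the threshold changes with it and no rule needs to be rewritten, which is the practical difference from an explicitly programmed check.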