## Machine Learning – K Nearest Neighbours Algorithm

In this tutorial, we’ll explore an algorithm called K Nearest Neighbours. This is a widely used machine learning algorithm.

Remember some of the common tasks in ML include:

Classification

Regression

Clustering

Dimensionality Reduction

In classification, the algorithm must learn to predict outcomes which are classes based on one or more features.

Example,

Whether a movie will be a hit or flop

Deciding the category of a hurricane

Rating employees as High Performer, Average performer in performance appraisals

Whether a person will say YES to a date

Example of a classification algorithm is Naive Bayes

In regression, the algorithm must learn to predict the values of a response variable based on one or more features.

Example:

How much money a movie will make at the box office

The expected wind speed of a hurricane

Products sold by an employee

How many glasses of wine were consumed on a date

Example of regression algorithm is Simple Linear Regression

Both classification and regression are supervised learning tasks. This means we already have labeled data and our model learns from the training set based on the label and predicts outcomes for unseen data.

K-Nearest Neighbors (KNN) algorithm can be used for both classification and regression.

KNN is widely used in the real world in a variety of applications, including search and recommender systems

The K in KNN is a number. Example 3 Nearest neighbours. This number is a representation of training instances in a metric space.

Example, Singapore is most similar to which of the below countries. Brazil, United States, Hong Kong, Slovenia, Japan, Syria, Australia, Italy, Malaysia

You can start to evaluate this on various features. Population size, geographical mass, GDP, housing price etc.,

If we used just the population size and per capita income, which country would Singapore be most similar to? Maybe Hong Kong?

Well, we are data people, so we never speculate. We use data. K Nearest neighbours could help in this case.

We need a way to measure the distance