In this tutorial, we’ll explore an algorithm called K-Nearest Neighbours (KNN), a widely used machine learning algorithm.
Recall that some of the common tasks in ML include:
In classification, the algorithm must learn to predict outcomes, which are discrete classes, based on one or more features. For example:
Whether a movie will be a hit or flop
Deciding the category of a hurricane
Rating employees as High Performer or Average Performer in performance appraisals
Whether a person will say YES to a date
An example of a classification algorithm is Naive Bayes.
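To make the idea concrete, here is a minimal sketch of a Naive Bayes classifier on a toy movie dataset. The features, values, and counts below are purely illustrative (not real data), and the smoothing choice is one simple option among many:

```python
from collections import Counter, defaultdict

# Toy training data (illustrative only): (genre, big_star) -> hit or flop.
rows = [
    (("action", "yes"), "hit"),
    (("action", "yes"), "hit"),
    (("drama",  "no"),  "flop"),
    (("drama",  "yes"), "hit"),
    (("action", "no"),  "flop"),
    (("drama",  "no"),  "flop"),
]

# Count class frequencies and, per class, how often each feature value appears.
class_counts = Counter(label for _, label in rows)
feat_counts = defaultdict(Counter)  # (class, feature_index) -> value counts
for feats, label in rows:
    for i, value in enumerate(feats):
        feat_counts[(label, i)][value] += 1

def predict(feats):
    """Pick the class maximizing P(class) * prod P(feature | class),
    with add-one smoothing so unseen values don't zero out the score."""
    best, best_score = None, 0.0
    for label, count in class_counts.items():
        score = count / len(rows)  # class prior
        for i, value in enumerate(feats):
            counts = feat_counts[(label, i)]
            score *= (counts[value] + 1) / (count + len(counts) + 1)
        if score > best_score:
            best, best_score = label, score
    return best

print(predict(("action", "yes")))   # most similar training rows are hits
```

The classifier learns from labeled examples and then assigns one of the known classes to a new, unseen feature vector — exactly the classification setting described above.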
In regression, the algorithm must learn to predict the values of a continuous response variable based on one or more features. For example:
How much money a movie will make at the box office
The expected wind speed of a hurricane
How many products an employee will sell
How many glasses of wine were consumed on a date
An example of a regression algorithm is Simple Linear Regression.
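As a quick sketch of simple linear regression, the closed-form least-squares fit takes only a few lines. The numbers below are illustrative placeholders, not real box-office figures:

```python
# Minimal simple linear regression via ordinary least squares (closed form).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # feature, e.g. marketing budget (illustrative)
ys = [2.1, 4.0, 6.2, 7.9, 10.1]   # response, e.g. revenue (illustrative)

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x); intercept follows from the means.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Predict the response for an unseen feature value.
print(slope * 6.0 + intercept)
```

Unlike the classifier above, the output here is a number on a continuous scale, which is what distinguishes regression from classification.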
Both classification and regression are supervised learning tasks: we already have labeled data, our model learns from the labels in the training set, and it then predicts outcomes for unseen data.
The K-Nearest Neighbors (KNN) algorithm can be used for both classification and regression.
KNN is widely used in the real world in a variety of applications, including search and recommender systems.
The K in KNN is a number, for example 3 nearest neighbours. It specifies how many of the closest training instances in a metric space are considered when making a prediction.
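The idea can be sketched in a few lines of Python: find the K training instances closest to the query point and take a majority vote of their labels. The 2-D points below are made-up illustrative data:

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote of its k nearest training instances."""
    # Euclidean distance from the query to every training instance.
    dists = [math.dist(query, x) for x in train_X]
    # Indices of the k closest training instances.
    nearest = sorted(range(len(train_X)), key=lambda i: dists[i])[:k]
    # Majority vote among their labels.
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D data: two small clusters with illustrative coordinates.
X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(X, y, (2, 2), k=3))   # query lies near the first cluster
```

For regression, the same neighbour lookup applies, but the prediction would be the average of the neighbours' values rather than a vote.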
For example, which of the following countries is Singapore most similar to? Brazil, the United States, Hong Kong, Slovenia, Japan, Syria, Australia, Italy, or Malaysia?
You can start to evaluate this on various features: population size, geographical area, GDP, housing prices, and so on.
If we used just population size and per capita income, which country would Singapore be most similar to? Maybe Hong Kong?
Well, we are data people, so we never speculate; we use data. K-Nearest Neighbours can help in this case.
First, we need a way to measure the distance between instances.
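One common choice is Euclidean distance between feature vectors. The sketch below ranks a few of the candidate countries by distance from Singapore using two features; the numbers are rough illustrative placeholders, not official statistics, and in practice features on different scales (millions of people vs. thousands of dollars) should be normalised first:

```python
import math

# Hypothetical feature vectors: (population in millions,
# per-capita income in thousands of USD). Placeholder values for illustration.
countries = {
    "Hong Kong": (7.5, 49.0),
    "Slovenia":  (2.1, 29.0),
    "Malaysia":  (33.0, 11.0),
}
singapore = (5.6, 65.0)

# Rank candidates by Euclidean distance from Singapore's feature vector.
ranked = sorted(countries, key=lambda name: math.dist(singapore, countries[name]))
for name in ranked:
    print(name, round(math.dist(singapore, countries[name]), 1))
```

The country with the smallest distance is Singapore's nearest neighbour under these two features; with unnormalised features like these, the larger-magnitude feature tends to dominate the distance, which is why feature scaling matters.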