In the last lecture, we discussed the Bayes classifier. Now we are going to discuss the K-Nearest Neighbors classifier.

Remember that the Bayes classifier assigns X to a class based on the conditional probability of Y given X. However, for real data the conditional distribution of Y given X is not known. Therefore, we can’t actually use the Bayes classifier in practical scenarios.

One approach would be to estimate the conditional distribution of Y given X. Using this, we then classify any observation to the class with the highest estimated probability.

This is how the K-nearest neighbors (KNN) classifier works. Let’s now examine KNN more closely.

**How KNN Works**

Start by choosing a positive integer K. This is the number of training data points that will be used as neighbors. Then choose a test observation, say x_{0}.

Next, KNN identifies the K points in the training data that are closest to x_{0}. These points form a neighborhood N_{0}. KNN then estimates the conditional probability for class j as the fraction of points in N_{0} whose response value equals j, that is, the points that belong to class j.

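This conditional probability is written as:

$$\Pr(Y = j \mid X = x_{0}) = \frac{1}{K} \sum_{i \in N_{0}} I(y_{i} = j)$$

This equation reads as:

The probability that Y = j given X = x_{0} is estimated by counting the points in N_{0} whose response y_{i} equals j and dividing by K. Here I(y_{i} = j) is an indicator that equals 1 when y_{i} = j and 0 otherwise.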

Finally, KNN applies Bayes’ rule and assigns the test observation x_{0} to the class with the largest estimated probability.
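The procedure described in this section can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function names `knn_class_probabilities` and `knn_classify` are made up for this example, and Euclidean distance is an assumption, since the text does not say how "closest" is measured.

```python
import math
from collections import Counter


def knn_class_probabilities(train_points, train_labels, x0, k):
    """Estimate Pr(Y = j | X = x0) as the fraction of the k nearest
    training points whose label is j."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Rank training indices by distance to x0.
    # Euclidean distance is an assumption made for this sketch.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], x0))
    neighborhood = order[:k]  # indices of the points forming N_0
    counts = Counter(train_labels[i] for i in neighborhood)
    return {label: count / k for label, count in counts.items()}


def knn_classify(train_points, train_labels, x0, k):
    """Assign x0 to the class with the largest estimated probability."""
    probs = knn_class_probabilities(train_points, train_labels, x0, k)
    return max(probs, key=probs.get)
```

Classes with no points inside the neighborhood simply do not appear in the returned dictionary, which is equivalent to an estimated probability of zero. Ties between classes are not handled specially here; `max` returns whichever tied class it encounters first.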

**Illustrating K-Nearest Neighbors**

Let’s illustrate KNN using an example.

In Figure 1 below, we have a plot of the training data set. It’s made up of 6 blue observations and 6 orange observations. Now, we would like to classify the data point marked with a black cross (x).

We would take the following steps:

**Step 1:** Choose the value of K. Here we use K = 3.

**Step 2:** Identify the 3 observations that are nearest to the cross. These are shown enclosed in a green circle: two blue points and one orange point.

**Step 3:** Estimate the probability for each class given the data point (marked with cross) we are trying to classify.

*P(blue class | observation) = 2/3*

*P(orange class | observation) = 1/3*

**Step 4:** Draw a conclusion. Since the blue class has the highest estimated probability given the observation, we classify the black cross as belonging to the blue class.
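The four steps can also be traced in code. The coordinates below are hypothetical, since the text does not give the actual values plotted in the figure. They were chosen only so that the three nearest neighbors of the cross are two blue points and one orange point, as in the example. Euclidean distance is likewise an assumption.

```python
import math
from collections import Counter
from fractions import Fraction

# Hypothetical coordinates: 6 blue and 6 orange training observations.
blue = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.2), (2.8, 2.7), (3.3, 3.4), (1.2, 3.0)]
orange = [(3.5, 2.8), (4.5, 4.0), (5.0, 3.0), (4.2, 1.5), (5.5, 4.5), (4.8, 2.2)]
cross = (3.0, 3.0)  # the observation we want to classify

# Step 1: choose K.
K = 3

# Step 2: find the K training observations nearest to the cross.
training = [(p, "blue") for p in blue] + [(p, "orange") for p in orange]
nearest = sorted(training, key=lambda item: math.dist(item[0], cross))[:K]

# Step 3: estimate each class probability as a fraction of the K neighbors.
counts = Counter(label for _, label in nearest)
probabilities = {label: Fraction(counts[label], K) for label in ("blue", "orange")}

# Step 4: pick the class with the highest estimated probability.
prediction = max(probabilities, key=probabilities.get)

for label, p in probabilities.items():
    print(f"P({label} class | observation) = {p}")
print("Predicted class:", prediction)
```

With these made-up points, the script reports 2/3 for blue and 1/3 for orange and predicts the blue class, matching the hand calculation.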

This process is repeated until all the data points are classified. I recommend you watch the video explanation of this.

However, while K-nearest neighbors often does well in classification, misclassification can still occur. In the next lesson we’ll see how to minimize misclassification.
