Building a Naive Bayes Classifier from scratch

Mar 28, 2019

The Naive Bayes Classifier is probably one of the most widely used text classifiers, and it’s a supervised learning algorithm. It can be used to classify blog posts or news articles into categories such as sports and entertainment, and it can be used to detect spam emails. Most importantly, it’s widely used in sentiment analysis. So, first of all, what is supervised learning? It means we are given a labeled training dataset: a set of inputs, each paired with its correct output. From this training dataset, our algorithm learns to predict the output for a new input.

The basics:

Conditional Probability: It is simply the probability that one event will happen, given that another event has already happened. It’s a way to handle dependent events.

So from the multiplication rule (here A and B may be dependent events, and P(A) > 0):

P(A ∩ B) = P(A) · P(B|A)

Dividing both sides by P(A), we get the formula for conditional probability:

P(B|A) = P(A ∩ B) / P(A)
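As a quick check of the conditional probability formula, here is a minimal sketch that estimates the probabilities by counting. The eight emails and the helper name `probability` are made up for illustration: event A is "the email contains the word free" and event B is "the email is spam".

```python
from fractions import Fraction

# Toy data (made up for illustration): each pair is (contains_free, is_spam).
emails = [
    (True, True), (True, True), (True, False), (False, True),
    (False, False), (False, False), (False, False), (True, True),
]

def probability(event, data):
    """Fraction of rows in `data` for which `event(row)` is true."""
    return Fraction(sum(1 for row in data if event(row)), len(data))

p_a = probability(lambda r: r[0], emails)                 # P(A): contains "free"
p_a_and_b = probability(lambda r: r[0] and r[1], emails)  # P(A and B): both happen
p_b_given_a = p_a_and_b / p_a                             # P(B|A)

print(p_a, p_a_and_b, p_b_given_a)  # 1/2 3/8 3/4
```

Four of the eight emails contain "free", and three of those four are spam, so counting directly inside the "free" emails gives the same 3/4.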

Bayes’ theorem: It describes the probability of an event based on prior knowledge of conditions that might be related to the event:

P(A|B) = [P(B|A) · P(A)] / P(B)
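To see the theorem with numbers, here is a small sketch. The function name `bayes` and all three input probabilities are hypothetical, chosen only to make the arithmetic easy to follow: A is "the email is spam" and B is "the email contains the word free".

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_b_given_a * p_a / p_b

# Hypothetical inputs, not taken from any real dataset.
p_spam = Fraction(1, 5)             # P(A)
p_free_given_spam = Fraction(3, 5)  # P(B|A)
p_free = Fraction(1, 4)             # P(B)

p_spam_given_free = bayes(p_free_given_spam, p_spam, p_free)
print(p_spam_given_free)  # 12/25
```

Seeing the word "free" raises the estimated chance of spam from 1/5 to 12/25 under these assumed numbers.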

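So, our classifier can be written as follows.

Assume a problem instance to be classified, represented by a vector x = (x1, x2, x3, …, xn) of n attributes. Here y is our class variable. Applying Bayes’ theorem:

P(y | x1, x2, …, xn) = [P(x1, x2, …, xn | y) · P(y)] / P(x1, x2, …, xn)

The “naive” assumption is that the attributes are conditionally independent given the class, so P(x1, x2, …, xn | y) = P(x1|y) · P(x2|y) · … · P(xn|y). The classifier then picks the class with the highest score:

ŷ = argmax over y of P(y) · P(x1|y) · P(x2|y) · … · P(xn|y)

Here we have eliminated the denominator P(x1, x2, x3, …, xn) because it is the same for every class, so it doesn’t change which class scores highest.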
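The decision rule can be sketched in a few lines: estimate P(y) and each P(xi|y) by relative frequency, then pick the class y that maximises P(y) multiplied by the product of the P(xi|y). This is a minimal illustration, not the code from the repository linked below; the function names `train` and `predict` and the weather-style toy data are assumptions, and no smoothing or logarithms are applied here.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Estimate the counts needed for P(y) and P(x_i | y) by relative frequency."""
    class_counts = Counter(labels)
    # value_counts[(i, y)][v] = number of rows of class y whose attribute i equals v
    value_counts = defaultdict(Counter)
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, y)][v] += 1
    priors = {y: c / len(labels) for y, c in class_counts.items()}
    return priors, value_counts, class_counts

def predict(row, priors, value_counts, class_counts):
    """Return the class maximising P(y) * prod_i P(x_i | y), plus all scores."""
    scores = {}
    for y, prior in priors.items():
        score = prior
        for i, v in enumerate(row):
            score *= value_counts[(i, y)][v] / class_counts[y]
        scores[y] = score
    return max(scores, key=scores.get), scores

# Toy data (made up): attributes are (outlook, temperature).
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "cool"), ("rainy", "hot")]
labels = ["play", "play", "play", "stay", "stay", "stay"]

model = train(rows, labels)
label, scores = predict(("sunny", "hot"), *model)
print(label, scores)
```

For the input ("sunny", "hot") the score for "play" is 1/2 · 2/3 · 1/3 = 1/9 and for "stay" it is 1/2 · 1/3 · 1/3 = 1/18, so "play" wins. The scores are not normalised probabilities because the denominator has been dropped.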

Now, to make sure our algorithm holds up well on real datasets, we need to take the following conditions into account.

The Zero Frequency problem: Consider the case where a given attribute value and class never occur together in the training data, so the frequency-based probability estimate is zero. This single zero wipes out all the information in the other probabilities when they are multiplied together (multiplied by zero…duh…!). The simple solution is to apply Laplace estimation, which assumes a uniform prior over the values of each attribute, i.e. we simply add a pseudo count to every count so that no probability estimate is ever exactly zero.
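A minimal sketch of Laplace (add-one) estimation follows. The function name `smoothed_probability` and its parameters are hypothetical; `alpha` is the pseudo count and `num_values` is the number of distinct values the attribute can take.

```python
def smoothed_probability(count, class_total, num_values, alpha=1.0):
    """Laplace (add-alpha) estimate of P(x_i = v | y).

    count:       rows of class y in which the attribute equals v
    class_total: rows of class y
    num_values:  number of distinct values the attribute can take
    alpha:       pseudo count added to every value (1.0 is add-one smoothing)
    """
    return (count + alpha) / (class_total + alpha * num_values)

# An attribute with 3 possible values, seen 0, 1 and 2 times in a class of 3 rows.
counts = [0, 1, 2]
estimates = [smoothed_probability(c, class_total=3, num_values=3) for c in counts]
print(estimates)
```

The value that was never seen now gets (0 + 1) / (3 + 3) = 1/6 instead of 0, and the three estimates still sum to 1 because `alpha * num_values` is added to the denominator.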

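Floating Point Underflow: Multiplying many small probabilities can produce a value too small to be represented as a floating point number, so the result gets rounded to zero. To avoid this, we take the logarithm of each probability and add the logarithms instead of multiplying the probabilities, since log(a · b) = log(a) + log(b). Because the logarithm is an increasing function, the class with the highest log score is also the class with the highest probability.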
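The underflow is easy to reproduce. In this sketch the function name `log_score` and the numbers (a prior of 0.5 and one thousand likelihoods of 0.01) are made up for illustration, and every probability is assumed to be strictly positive, for example after smoothing.

```python
import math

def log_score(prior, likelihoods):
    """log P(y) + sum_i log P(x_i | y); assumes every probability is > 0."""
    return math.log(prior) + sum(math.log(p) for p in likelihoods)

likelihoods = [0.01] * 1000

# Direct multiplication: the true value is far below the smallest positive float.
product = 0.5
for p in likelihoods:
    product *= p

print(product)                      # 0.0
print(log_score(0.5, likelihoods))  # a finite negative number
```

The direct product collapses to 0.0 for every class, so the classes can no longer be compared, while the log scores stay finite and can still be ranked.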

I have implemented a Naive Bayes Classifier in Python and you can find it on GitHub. If you have any improvements to add or any suggestions, let me know in the comments.

Code on GitHub: https://github.com/5hirish/naive_bayes_classifier

Originally published on April 23rd, 2016 at shirishkadam.com

Written by Shirish Kadam
