Class Ai4r::Classifiers::NaiveBayes
In: lib/ai4r/classifiers/naive_bayes.rb
Parent: Classifier
 = Introduction

 This is an implementation of a Naive Bayesian Classifier without any
 specialisation (e.g. for text classification).
 Probabilities P(a_i | v_j) are estimated using m-estimates, hence the
 m parameter described under Parameters.
 The estimation looks like this:

(n_c + mp) / (n + m)

 The variables are:
 n = the number of training examples for which v = v_j
 n_c = number of examples for which v = v_j and a = a_i
 p = a priori estimate for P(a_i | v_j)
 m = the equivalent sample size
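
 The estimate above can be sketched in a few lines of Ruby. This is only an
 illustration: m_estimate is a hypothetical helper, not a method of
 Ai4r::Classifiers::NaiveBayes, and the counts passed to it are made up.

```ruby
# Hypothetical helper (not part of Ai4r): computes the m-estimate
# (n_c + m * p) / (n + m) for one conditional probability.
#   n_c = number of examples with v = v_j and a = a_i
#   n   = number of examples with v = v_j
#   p   = a priori estimate for P(a_i | v_j)
#   m   = equivalent sample size
def m_estimate(n_c, n, p, m)
  (n_c + m * p).to_f / (n + m)
end

# With m = 0 the estimate is the raw relative frequency n_c / n.
raw = m_estimate(3, 5, 0.5, 0)       # => 0.6

# With m = 3 the estimate is pulled toward the prior p = 0.5.
smoothed = m_estimate(3, 5, 0.5, 3)  # => 0.5625
```

 A larger m gives the prior p more weight relative to the observed counts.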

 The classifier stores the conditional probabilities in an array named @pcp, in this form:
 @pcp[attributes][values][classes]
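
 To illustrate the attribute / value / class nesting, here is a minimal
 sketch that counts co-occurrences in the same order. It is an assumption
 for illustration only: it uses nested hashes keyed by value and class
 instead of the index-based array, and the rows are hypothetical toy data.

```ruby
# Hypothetical toy data (not from Ai4r): each row is [color, type, class].
rows = [
  ["Red",    "Sports", "Yes"],
  ["Red",    "SUV",    "No"],
  ["Yellow", "SUV",    "No"],
  ["Red",    "Sports", "Yes"]
]

# counts[attribute_index][value][class] => number of matching rows.
# Nested hashes stand in for the index-based array described above.
counts = Hash.new do |by_attr, attr_index|
  by_attr[attr_index] = Hash.new { |by_value, value| by_value[value] = Hash.new(0) }
end

rows.each do |row|
  *attributes, klass = row   # the last column is the class
  attributes.each_with_index do |value, attr_index|
    counts[attr_index][value][klass] += 1
  end
end

red_yes = counts[0]["Red"]["Yes"]  # => 2
suv_no  = counts[1]["SUV"]["No"]   # => 2
```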

 The m-estimate is useful when the training data set is relatively small.
 If the data set is big enough, set m to 0, which is also the default value.

 For further details on Bayes' theorem and the Naive Bayes classifier, see:
 http://en.wikipedia.org/wiki/Naive_Bayesian_classification
 http://en.wikipedia.org/wiki/Bayes%27_theorem

 = Parameters

 * :m => Optional. Defaults to 0. It may be set to a value greater than 0 when
 the size of the dataset is relatively small.

 = How to use it

   data = DataSet.new.load_csv_with_labels "bayes_data.csv"
   b = NaiveBayes.new.
     set_parameters({:m=>3}).
     build data
   b.eval(["Red", "SUV", "Domestic"])

Methods

build   eval   get_probability_map   new  

Classes and Modules

Class Ai4r::Classifiers::NaiveBayes::DataEntry

Public Class methods

Public Instance methods

build

Counts the values of the attribute instances and calculates the class probabilities and the conditional probabilities. The parameter data has to be an instance of CsvDataSet.

eval

You can evaluate new data, predicting its category, e.g.

  b.eval(["Red", "SUV", "Domestic"])
    => 'No'

get_probability_map

Calculates the probabilities for the data entry data. data has to be an array of the same dimension as the training data minus the class column. Returns a map containing all classes as keys: {Class_1 => probability_1, Class_2 => probability_2, …}. Each probability is <= 1 and of type Float, e.g.

  b.get_probability_map(["Red", "SUV", "Domestic"])
    => {"Yes"=>0.4166666666666667, "No"=>0.5833333333333334}
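
The map can be reduced to a single prediction by taking the class with the highest probability. The sketch below uses only plain Ruby and the map copied from the example output above. That eval selects its result this way is an assumption, although it agrees with the 'No' returned for the same input.

```ruby
# Probability map copied from the example output above.
probability_map = { "Yes" => 0.4166666666666667, "No" => 0.5833333333333334 }

# Pick the [class, probability] pair with the highest probability.
best_class, best_probability = probability_map.max_by { |_klass, probability| probability }

best_class        # => "No"
best_probability  # => 0.5833333333333334
```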
