
Attribute Conditional Independence Assumption

The naive Bayes classifier adopts the "attribute conditional independence assumption": given the class label, all attributes are assumed to be conditionally independent of one another.
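Formally, for a class $c$ and an instance $\boldsymbol{x} = (x_1, \dots, x_d)$ with $d$ attributes, the assumption lets the class-conditional joint distribution factor into per-attribute terms (the notation here is the common textbook convention, not from the original):

$$
P(c \mid \boldsymbol{x}) \propto P(c)\, P(\boldsymbol{x} \mid c) = P(c) \prod_{i=1}^{d} P(x_i \mid c)
$$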

Common improvements to naive Bayes:

  1. To prevent the information carried by other attributes from being "erased" by attribute values that never co-occur with a class in the training set, probability estimates are usually "smoothed"; the "Laplacian correction" is commonly used (see the sketch after this list);
  2. The attribute conditional independence assumption is relaxed to some extent, as in semi-naive Bayes classifiers, which allow a limited degree of dependence between attributes;
  3. Dependencies between attributes are characterized with a directed acyclic graph, and the joint probability distribution over the attributes is described with conditional probability tables, as in Bayesian networks.
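As a minimal sketch of item 1, the snippet below computes Laplace-corrected estimates for a categorical attribute; the function names and the toy data are illustrative, not from the original:

```python
from collections import Counter

def laplace_class_prior(labels, n_classes):
    """P(c) with Laplace correction: (|D_c| + 1) / (|D| + N)."""
    counts = Counter(labels)
    return {c: (counts.get(c, 0) + 1) / (len(labels) + n_classes)
            for c in range(n_classes)}

def laplace_conditional(values, labels, target_class, n_values):
    """P(x_i = v | c) with Laplace correction: (|D_{c,v}| + 1) / (|D_c| + N_i)."""
    in_class = [v for v, y in zip(values, labels) if y == target_class]
    counts = Counter(in_class)
    return {v: (counts.get(v, 0) + 1) / (len(in_class) + n_values)
            for v in range(n_values)}

# Toy data: attribute value 2 never occurs with class 0, yet its smoothed
# probability stays nonzero instead of erasing the whole product.
labels = [0, 0, 0, 1, 1]
attr   = [0, 1, 1, 2, 2]
print(laplace_class_prior(labels, n_classes=2))                  # {0: 4/7, 1: 3/7}
print(laplace_conditional(attr, labels, target_class=0, n_values=3))  # value 2 -> 1/6
```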

Naive Bayes classifiers are highly scalable, requiring a number of parameters that is linear in the number of variables (features/predictors) in the learning problem. Maximum likelihood training can be done by evaluating a closed-form expression in linear time, rather than requiring the time-consuming iterative approximation used by many other types of classifiers.
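To make the closed-form, linear-time claim concrete, here is a minimal sketch for categorical attributes; the names are illustrative and smoothing is omitted for brevity. Maximum-likelihood estimation reduces to a single counting pass over the data, after which the estimates are just normalized counts:

```python
from collections import defaultdict

def fit_naive_bayes(X, y):
    """One pass over the data: count class and (attribute, value, class)
    occurrences; the ML estimates are the normalized counts."""
    class_count = defaultdict(int)
    cond_count = defaultdict(int)   # keyed by (attribute index, value, class)
    for xs, c in zip(X, y):
        class_count[c] += 1
        for i, v in enumerate(xs):
            cond_count[(i, v, c)] += 1
    n = len(y)
    prior = {c: k / n for c, k in class_count.items()}
    cond = {key: k / class_count[key[2]] for key, k in cond_count.items()}
    return prior, cond

X = [(0, 1), (0, 0), (1, 1), (1, 0)]
y = [0, 0, 1, 1]
prior, cond = fit_naive_bayes(X, y)
print(prior)            # {0: 0.5, 1: 0.5}
print(cond[(0, 0, 0)])  # P(x_0 = 0 | c = 0) = 1.0
```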

In the statistical and computer science literature, the naive Bayes model goes by various names, including simple Bayes and independence Bayes. All of these names refer to the use of Bayes' theorem in the classifier's decision rule, but naive Bayes is not (necessarily) a Bayesian method; Russell and Norvig note that naive Bayes "is sometimes called a Bayesian classifier, a somewhat careless usage that has prompted true Bayesians to call it the idiot Bayes model."