
VC Theory

VC Dimension

The VC dimension measures the capacity of a binary classifier: it is the maximum number of training samples that the classifier can shatter. The intuitive definition is: for a set of indicator functions, if there exist h samples that the functions in the set can separate according to all possible 2^h labelings, the function set is said to shatter those h samples. The VC dimension of the function set is the largest number h of samples it can shatter.
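
As an illustration of this definition (not part of the original entry), the Python sketch below brute-forces the shattering check for the class of one-dimensional threshold classifiers h_t(x) = 1 if x >= t, else 0; the helper names `threshold_classifiers` and `can_shatter` are invented for this example.

```python
def threshold_classifiers(points):
    """Hypothetical 1-D threshold classifiers h_t(x) = 1 if x >= t else 0.

    Thresholds below, between, and above the sorted sample points are
    enough to realize every labeling this class can produce on `points`.
    """
    xs = sorted(points)
    ts = [xs[0] - 1.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1.0]
    return [lambda x, t=t: 1 if x >= t else 0 for t in ts]

def can_shatter(points, hypotheses):
    """True iff the hypotheses realize all 2^h labelings of the h points."""
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

print(can_shatter([0.0], threshold_classifiers([0.0])))            # True
print(can_shatter([0.0, 1.0], threshold_classifiers([0.0, 1.0])))  # False: (1, 0) unreachable
```

A single point can be shattered but two points cannot (the labeling (1, 0) with x1 < x2 is out of reach), so the VC dimension of threshold classifiers is 1.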

If, for any number of samples h, there exist h samples that the function set can shatter, then the VC dimension of the function set is infinite. For a set of bounded real-valued functions, the VC dimension is defined by first converting each function into an indicator function through a threshold.
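
A minimal sketch of that thresholding step, assuming an arbitrary real-valued score function and a threshold beta (both invented for illustration):

```python
def to_indicator(f, beta):
    """Binarize a bounded real-valued function f at threshold beta:
    I(x) = 1 if f(x) >= beta else 0. The VC dimension of a real-valued
    class is defined through the indicator class obtained this way."""
    return lambda x: 1 if f(x) >= beta else 0

score = lambda x: 2.0 * x - 1.0       # some bounded real-valued model on [0, 1]
classify = to_indicator(score, 0.0)   # indicator obtained by thresholding at beta = 0
print(classify(0.2), classify(0.8))   # 0 1
```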

Meaning of VC dimension

For a fixed number of training samples, the larger the VC dimension, the weaker the generalization ability and the greater the confidence risk (the confidence-interval term in the generalization bound). In short, increasing the number of samples and reducing the VC dimension both reduce the confidence risk.
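
This statement is usually made precise by the standard VC generalization bound from Vapnik's statistical learning theory (the formula below is the standard form, not given in the original entry): with probability at least 1 − η, for every function f in a class of VC dimension h trained on N samples,

```latex
R(f) \le R_{\mathrm{emp}}(f)
      + \sqrt{\frac{h\left(\ln\frac{2N}{h} + 1\right) - \ln\frac{\eta}{4}}{N}}
```

The square-root term is the confidence risk: it grows with the VC dimension h and shrinks as the sample size N grows, which is exactly the trade-off described above.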

The VC dimension also reflects the expressive power of a hypothesis class H: the larger the VC dimension, the more expressive H is, because it can shatter more points, as the sketch below illustrates.
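
A brief sketch of that contrast, using two made-up 1-D hypothesis classes of different capacity (thresholds with VC dimension 1 versus intervals with VC dimension 2); the parameter grid and helper names are illustrative:

```python
from itertools import combinations

def shatters(points, hypotheses):
    """True iff the hypotheses realize all 2^h labelings of the h points."""
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

grid = [i / 10 for i in range(-20, 21)]  # crude parameter grid over [-2, 2]
thresholds = [lambda x, t=t: int(x >= t) for t in grid]        # h_t(x) = [x >= t]
intervals = [lambda x, a=a, b=b: int(a <= x <= b)
             for a, b in combinations(grid, 2)]                # h_{a,b}(x) = [a <= x <= b]

two_points = [0.0, 1.0]
print(shatters(two_points, thresholds))  # False: the labeling (1, 0) is out of reach
print(shatters(two_points, intervals))   # True: the richer class shatters both points
```

The interval class, with the larger VC dimension, realizes all four labelings of the two points, while the weaker threshold class cannot.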