HyperAI

Non-Metric Distance

A non-metric distance is a distance function that does not satisfy directness, the property more commonly known as the triangle inequality.

Directness means that for any three objects a, b, c, the distance from a to c plus the distance from c to b is greater than or equal to the distance from a directly to b.

Similarity measures are usually defined in terms of some form of distance: the greater the distance, the smaller the similarity.

Non-metric distance and distance calculation

For a distance function to be a "distance metric", it must satisfy the following basic properties:

  • Non-negativity: the distance between two points is never negative;
  • Identity: the distance between two points is zero if and only if they coincide in the sample space;
  • Symmetry: the distance from a to b is equal to the distance from b to a;
  • Directness: the distance from a to c plus the distance from c to b is greater than or equal to the distance from a directly to b.
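
As a rough illustration of the directness property, the sketch below searches a small set of points for a triple that violates it. The helper names (`euclidean`, `squared_euclidean`, `violates_directness`) and the sample points are assumptions made for this example only. Squared Euclidean distance is used as a familiar case of a non-metric distance: it is non-negative and symmetric, but it fails directness.

```python
from itertools import permutations


def squared_euclidean(x, y):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def euclidean(x, y):
    """Ordinary Euclidean distance, which is a true metric."""
    return squared_euclidean(x, y) ** 0.5


def violates_directness(dist, points, tol=1e-12):
    """Return the first triple (a, b, c) with dist(a, c) + dist(c, b) < dist(a, b).

    Returns None if no such triple exists among the given points.
    """
    for a, b, c in permutations(points, 3):
        if dist(a, c) + dist(c, b) < dist(a, b) - tol:
            return (a, b, c)
    return None


# Three illustrative points on a line (hypothetical sample data).
points = [(0.0,), (1.0,), (2.0,)]

print(violates_directness(euclidean, points))          # no violation found
print(violates_directness(squared_euclidean, points))  # a violating triple
```

With squared Euclidean distance, going from 0 to 2 through 1 costs 1 + 1 = 2, while the direct distance is 4, so the detour is "shorter" than the direct path and directness fails.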

For continuous attributes, the distance between two samples is generally calculated with the Minkowski distance.
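
A minimal sketch of the Minkowski distance follows, assuming samples are given as equal-length sequences of numbers. The function name `minkowski` and the sample values are chosen for this example. The distance is the p-th root of the sum of the p-th powers of the absolute coordinate differences; p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance.

```python
def minkowski(x, y, p=2):
    """Minkowski distance: (sum_u |x_u - y_u| ** p) ** (1 / p).

    For p >= 1 this is a metric. For 0 < p < 1 the same formula can still be
    evaluated, but it no longer satisfies directness (the triangle inequality).
    """
    if p <= 0:
        raise ValueError("p must be positive")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(abs(xi - yi) ** p for xi, yi in zip(x, y)) ** (1.0 / p)


x, y = (0.0, 0.0), (3.0, 4.0)
print(minkowski(x, y, p=1))  # Manhattan distance
print(minkowski(x, y, p=2))  # Euclidean distance
```

For the points (0, 0) and (3, 4), the Manhattan distance is 7 and the Euclidean distance is 5.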

For discrete attributes, the Minkowski distance can still be used when the values are ordered, but when the values are unordered, such as {apple, banana, peach}, VDM (Value Difference Metric) is used instead.

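VDMp(a, b) is the sum, over all clusters, of the p-th power of the absolute difference between two ratios: the fraction of samples with value a on attribute u that fall in a given cluster, and the corresponding fraction for value b. In other words, it approximates how dissimilar two attribute values are by how differently they are distributed across the clusters.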
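
The calculation can be sketched as follows for a single unordered attribute. This is an illustrative implementation under stated assumptions, not a reference one: the function name `vdm`, its parameters, and the toy data are hypothetical, and cluster labels are assumed to be already known for every sample.

```python
from collections import Counter


def vdm(values, clusters, a, b, p=2):
    """VDM_p(a, b) for one unordered attribute.

    values[i]   : attribute value of sample i (for example "apple")
    clusters[i] : cluster label of sample i
    Returns sum over clusters c of |m(a, c) / m(a) - m(b, c) / m(b)| ** p,
    where m(v) counts samples with value v and m(v, c) counts those in cluster c.
    """
    if len(values) != len(clusters):
        raise ValueError("values and clusters must have the same length")
    count_value = Counter(values)                        # m(v)
    count_value_cluster = Counter(zip(values, clusters))  # m(v, c)
    if count_value[a] == 0 or count_value[b] == 0:
        raise ValueError("both values must occur in the data")

    total = 0.0
    for c in set(clusters):
        ratio_a = count_value_cluster[(a, c)] / count_value[a]
        ratio_b = count_value_cluster[(b, c)] / count_value[b]
        total += abs(ratio_a - ratio_b) ** p
    return total


# Toy data (hypothetical): six samples, two clusters.
values = ["apple", "apple", "banana", "banana", "peach", "peach"]
clusters = [0, 0, 0, 1, 1, 1]

print(vdm(values, clusters, "apple", "banana"))
print(vdm(values, clusters, "apple", "peach"))
```

In this toy data, apple appears only in cluster 0, peach only in cluster 1, and banana is split evenly, so with p = 2 the value for (apple, peach) is 2.0, larger than the 0.5 obtained for (apple, banana).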

When working with a non-metric distance, the appropriate distance formula has to be determined from the data samples.