
Manifold Learning

Manifold learning is a basic method in pattern recognition. It looks for the essence of things behind observed phenomena and seeks the internal laws that generate the data.

Manifold learning algorithms can be divided into two types: linear and nonlinear. Nonlinear manifold learning algorithms include isometric mapping (Isomap), Laplacian eigenmaps, and locally linear embedding (LLE); linear methods include principal component analysis (PCA) and multidimensional scaling (MDS).

Isometric Mapping

The goal of Isomap is to find a low-dimensional embedding of a given high-dimensional manifold such that the neighborhood structure between data points on the manifold is preserved in the embedding. To measure the distance between data points on the manifold, Isomap uses the geodesic distance from differential geometry rather than the straight-line Euclidean distance.

Advantages:

  • The solution relies on eigenvalue and eigenvector problems from linear algebra, which ensures the robustness and global optimality of the results;
  • The residual variance can be used to estimate the intrinsic dimension of the underlying low-dimensional embedding;
  • Isomap requires only a single parameter (the neighbor count k or the neighborhood radius ε).
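
For illustration, here is a minimal Isomap sketch assuming scikit-learn (the library choice and parameter values are assumptions, not part of the original text); it recovers a 2-D embedding of a swiss-roll manifold:

```python
# Minimal Isomap sketch, assuming scikit-learn; parameters are illustrative.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# A classic nonlinear manifold: a 2-D "swiss roll" embedded in 3-D space.
X, _ = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# n_neighbors is the single neighbor parameter k mentioned above.
isomap = Isomap(n_neighbors=10, n_components=2)
X_2d = isomap.fit_transform(X)  # embedding that preserves geodesic distances
print(X_2d.shape)               # (1000, 2)
```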

Laplacian Eigenmap

Laplacian eigenmaps describe a manifold with an undirected weighted graph and then find a low-dimensional representation by embedding that graph. Of the nonlinear methods listed here it is the fastest, but the quality of its embeddings is often less satisfactory.
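
As a sketch, scikit-learn exposes Laplacian eigenmaps as SpectralEmbedding; the library and parameter values below are illustrative assumptions:

```python
# Laplacian-eigenmaps sketch, assuming scikit-learn's SpectralEmbedding.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import SpectralEmbedding

X, _ = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# Build an undirected k-nearest-neighbor graph over the samples and embed
# it using eigenvectors of the graph Laplacian.
le = SpectralEmbedding(n_components=2, affinity="nearest_neighbors",
                       n_neighbors=10, random_state=0)
X_2d = le.fit_transform(X)
print(X_2d.shape)  # (1000, 2)
```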

Locally Linear Embedding

Locally linear embedding is a milestone in nonlinear dimensionality reduction. Its algorithm can be summarized in three steps (a code sketch follows the list):

  • Find the k nearest neighbors of each sample point;
  • Compute each sample point's local reconstruction weight matrix from those neighbors;
  • Compute each sample point's low-dimensional output coordinates from its reconstruction weight matrix and its neighbors.
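
A minimal sketch of these three steps, assuming scikit-learn's LocallyLinearEmbedding (the library and parameter values are illustrative, not prescribed by the article):

```python
# LLE sketch, assuming scikit-learn; parameters are illustrative.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, _ = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# n_neighbors = k from step 1; the reconstruction weight matrix and the
# final embedding (steps 2 and 3) are computed inside fit_transform.
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, random_state=0)
X_2d = lle.fit_transform(X)
print(X_2d.shape)  # (1000, 2)
```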

Principal Component Analysis

PCA obtains new variables as linear combinations of the original variables, chosen so that their variance is as large as possible. Since the original variables often differ little from one another and describe overlapping information, representing the data with all of them is inefficient; PCA compresses this redundancy into a few high-variance components.
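
The following sketch, assuming scikit-learn and NumPy with synthetic data, shows how PCA compresses two nearly redundant variables into a single maximum-variance component:

```python
# PCA sketch, assuming scikit-learn; the data are synthetic and illustrative.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Two nearly redundant variables: the second is the first plus small noise.
x = rng.normal(size=500)
X = np.column_stack([x, x + 0.01 * rng.normal(size=500)])

pca = PCA(n_components=1)
X_1d = pca.fit_transform(X)            # maximum-variance linear combination
print(pca.explained_variance_ratio_)   # close to 1.0: one component suffices
```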

Multidimensional Scaling

Multidimensional scaling analysis represents observed data in fewer dimensions, but it uses the similarities between pairs of samples to construct a suitable low-dimensional space in which the pairwise similarities are as consistent as possible with those in the original high-dimensional space.
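A minimal MDS sketch, assuming scikit-learn (an illustrative choice, not prescribed by the article): it starts from a matrix of pairwise dissimilarities and finds low-dimensional coordinates that reproduce them as closely as possible.

```python
# MDS sketch, assuming scikit-learn; parameters are illustrative.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances

X, _ = make_swiss_roll(n_samples=300, noise=0.05, random_state=0)
D = pairwise_distances(X)  # pairwise Euclidean dissimilarities

# dissimilarity="precomputed" tells MDS to preserve the given distances.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
X_2d = mds.fit_transform(D)
print(X_2d.shape)  # (300, 2)
```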

The multidimensional scaling method involves five key elements, namely the object, subject, criterion, criterion weight, and subject weight, as follows:

  • Object: The thing being evaluated. It can be regarded as the set of categories to be classified.
  • Subject: The unit that evaluates the objects. Here it corresponds to the training data.
  • Criterion: A standard that the researcher defines according to the purpose of the study and uses to evaluate the quality of an object.
  • Criterion weight: The weight a subject assigns to each criterion after weighing the criteria's relative importance.
  • Subject weight: The weight the researcher assigns to each subject after weighing the subjects' relative importance.