HyperAI超神经

The human attention mechanism is intuitive: it lets people use limited cognitive resources to quickly pick out high-value information from a large volume of input. The attention mechanism in deep learning borrows this idea. It is widely used in natural language processing, image classification, speech recognition, and other tasks, where it has achieved remarkable results.

Encoder-Decoder Framework

Encoder-Decoder is a very common model framework in deep learning. In image captioning, the Encoder-Decoder is a CNN-RNN encoding-decoding framework; in neural machine translation models, it is often an LSTM-LSTM encoding-decoding framework.

Encoding converts the input sequence into a fixed-length vector; decoding converts that fixed-length vector into an output sequence.
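The encode-then-decode idea can be illustrated with a minimal sketch. It is not a trained model: the weights are random, the recurrent update is a plain tanh step rather than an LSTM, and the names `HIDDEN_SIZE`, `encode`, and `decode` are hypothetical choices made for this illustration. It only shows that inputs of different lengths collapse into a vector of the same fixed length, which is then unrolled into an output sequence.

```python
import math
import random

HIDDEN_SIZE = 4  # hypothetical size of the fixed-length context vector

rng = random.Random(0)  # seeded so the sketch is reproducible


def random_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Untrained, randomly initialised weights (illustration only).
W_ENC_STATE = random_matrix(HIDDEN_SIZE, HIDDEN_SIZE)
W_ENC_INPUT = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_SIZE)]
W_DEC_STATE = random_matrix(HIDDEN_SIZE, HIDDEN_SIZE)
W_DEC_OUT = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_SIZE)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def encode(inputs):
    """Fold a sequence of scalars of any length into one fixed-length vector."""
    state = [0.0] * HIDDEN_SIZE
    for x in inputs:
        recurrent = matvec(W_ENC_STATE, state)
        state = [math.tanh(r + w * x) for r, w in zip(recurrent, W_ENC_INPUT)]
    return state


def decode(context, num_steps):
    """Unroll the fixed-length vector into an output sequence of num_steps scalars."""
    state = list(context)
    outputs = []
    for _ in range(num_steps):
        state = [math.tanh(r) for r in matvec(W_DEC_STATE, state)]
        outputs.append(sum(w * s for w, s in zip(W_DEC_OUT, state)))
    return outputs


short_context = encode([0.1, 0.2, 0.3])
long_context = encode([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
print(len(short_context), len(long_context))  # both equal HIDDEN_SIZE
print(decode(short_context, num_steps=5))     # a 5-element output sequence
```

Because everything the decoder knows about the input must pass through that single vector, long inputs are easily compressed too aggressively; this bottleneck is the limitation that attention is meant to relieve.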

Attention Model

In deep learning, the Attention model is applied mainly in three areas: natural language understanding, image recognition, and speech recognition.

Natural Language Understanding

The Attention model plays a key role in natural language understanding. Google has adopted the Attention model in its machine translation system, where it is mainly used to pick out the key words in long sentences or paragraphs, as shown below:

Image Recognition

In image recognition, the Attention model is used for image classification and image generation. The following figure shows an application to image caption generation:

In this study, the weights of the Attention model are visualized by overlaying them on the original image as white areas. The image shows that the highlighted regions around the frisbee and the dog correspond to the words "frisbee" and "dog" in the generated sentence.
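The weights being visualized can be understood through a small sketch. It assumes dot-product scoring between a query vector (standing in for the decoder state while it generates one word) and one feature vector per image region. That scoring function, the two-dimensional features, and the region labels are all hypothetical and chosen for brevity; they are not taken from the study described above.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, region_features):
    """Return (weights, context) for one query over a list of region vectors."""
    # Score each region by its dot product with the query (an assumption).
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in region_features]
    weights = softmax(scores)
    # The context vector is the attention-weighted average of the regions.
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical two-dimensional features for three image regions.
regions = [
    [1.0, 0.0],  # region containing the dog
    [0.0, 1.0],  # region containing the frisbee
    [0.1, 0.1],  # background
]
dog_query = [4.0, 0.0]  # hypothetical decoder state while emitting "dog"

weights, context = attend(dog_query, regions)
print(weights)  # the largest weight falls on the first (dog) region
print(context)
```

Plotting such weights over the image grid, one map per generated word, gives the kind of white highlight described above.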

Speech Recognition

In speech recognition, the Encoder-Decoder framework based on the Attention model has achieved good results, and it also establishes the correspondence between speech and words.
