Connectionist Temporal Classification
Connectionist Temporal Classification (CTC) is a loss function and modeling method widely used in sequence-to-sequence learning tasks, especially where the input and output sequences differ in length and the alignment between them is unknown. The method was first proposed in 2006 and has since been applied in speech recognition, handwriting recognition, action recognition, and other fields.
The main goal of CTC is to train a neural network to map an input sequence to a label sequence without requiring an explicit alignment between the two. CTC introduces a blank label so that the network can emit one symbol per input frame; a frame-level output path is mapped to a label sequence by merging consecutive repeated symbols and then removing blanks. A dynamic programming algorithm sums the probabilities of all paths that map to the target label sequence, so the alignment is learned automatically and the model can be trained end to end.
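The role of the blank label and the dynamic programming step can be illustrated with a minimal sketch. It is not a production implementation: it works on plain probabilities rather than log-probabilities, and the names `collapse`, `ctc_forward`, `ctc_brute_force`, and the choice of index 0 for the blank are assumptions made for illustration. The brute-force function enumerates every frame-level path and is only there to check the recursion on tiny inputs.

```python
import itertools
import math

BLANK = 0  # assumption: index 0 is reserved for the blank label


def collapse(path, blank=BLANK):
    """Map a frame-level path to labels: merge repeats, then drop blanks."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != blank:
            out.append(s)
        prev = s
    return out


def ctc_forward(probs, target, blank=BLANK):
    """Probability of `target`, summed over all alignments (forward recursion).

    probs[t][k] is the per-frame probability of symbol k at time step t.
    """
    # Extended target: a blank before, between, and after the labels.
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S, T = len(ext), len(probs)

    # A path may start with the leading blank or with the first label.
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]              # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]     # advance by one position
            # Skipping the blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # A path may end on the last label or on the trailing blank.
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)


def ctc_brute_force(probs, target, blank=BLANK):
    """Same quantity by enumerating every path; exponential, for checking only."""
    T, K = len(probs), len(probs[0])
    total = 0.0
    for path in itertools.product(range(K), repeat=T):
        if collapse(path, blank) == list(target):
            total += math.prod(probs[t][k] for t, k in enumerate(path))
    return total


# Toy input: 4 frames, 3 symbols (blank plus two labels); each row sums to 1.
probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.5, 0.25, 0.25],
]
p = ctc_forward(probs, [1, 2])
loss = -math.log(p)  # the CTC loss is the negative log of this probability
```

In this sketch the recursion and the brute-force enumeration should agree on any small input, while the recursion needs only on the order of T times S operations instead of enumerating all K^T paths. Practical implementations typically run the same recursion in log space to avoid numerical underflow on long sequences.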