Date

3 years ago

Exploding Gradients Problem

The exploding gradients problem usually occurs in deep networks when the weights are initialized with values that are too large, and it generally becomes more pronounced as the number of layers increases.

During backpropagation, the gradient reaching an early layer is a product of per-layer terms, each involving the derivative of the activation function scaled by that layer's weights. If the magnitude of these terms is consistently greater than 1, the gradient grows exponentially with the number of layers, which is gradient explosion. If it is consistently less than 1, the gradient decays exponentially with the number of layers, which is gradient vanishing.
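The exponential effect can be illustrated with a small Python sketch. It assumes, as a simplification not made in the text above, that every layer scales the gradient by the same factor; `gradient_scale` and the sample factors are hypothetical names and values chosen only for illustration.

```python
def gradient_scale(per_layer_factor: float, num_layers: int) -> float:
    """Gradient magnitude after backpropagating through num_layers layers,
    assuming each layer multiplies the gradient by the same factor."""
    return per_layer_factor ** num_layers


if __name__ == "__main__":
    # A factor below 1 shrinks the gradient, above 1 blows it up.
    for factor in (0.5, 1.0, 1.5):
        scales = [gradient_scale(factor, depth) for depth in (10, 50)]
        print(f"factor={factor}: depth 10 -> {scales[0]:.3e}, depth 50 -> {scales[1]:.3e}")
```

In a real network the per-layer terms differ from layer to layer, so this only shows the trend: small deviations from 1 compound quickly with depth.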

Gradient explosion and gradient vanishing arise mainly when the network is very deep and the weight updates become unstable. In essence, both are caused by the multiplicative effect in gradient backpropagation. For the vanishing gradient problem, consider replacing the Sigmoid activation function with ReLU. In addition, the LSTM architecture can mitigate the vanishing gradient problem in RNNs.
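One way to see why replacing Sigmoid with ReLU can help is to compare their derivatives. The sketch below ignores the weights and looks only at the activation derivative, which is an assumption made for brevity; the helper names and the depth of 20 are illustrative.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid; its maximum value is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


depth = 20  # hypothetical number of stacked layers
# Best case for the sigmoid: every layer sits at x = 0, where the derivative peaks.
sigmoid_chain = sigmoid_grad(0.0) ** depth
# For ReLU, active units pass the gradient through unchanged.
relu_chain = relu_grad(1.0) ** depth

if __name__ == "__main__":
    print(f"sigmoid chain over {depth} layers: {sigmoid_chain:.3e}")
    print(f"ReLU chain over {depth} layers:    {relu_chain:.1f}")
```

Even in its best case the sigmoid contributes a factor of at most 0.25 per layer, so the product shrinks rapidly, while an active ReLU unit contributes a factor of 1.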

Solutions to Exploding Gradients

Pre-training plus fine-tuning
Gradient clipping and weight regularization
Using different activation functions
Using batch normalization (BatchNorm)
Using residual connections
Using LSTM networks
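Of the remedies listed above, gradient clipping is the most direct one for exploding gradients. A minimal sketch in plain Python follows, assuming gradients are stored as a list of per-layer lists of floats and that clipping is done by global L2 norm; `clip_by_global_norm` and `max_norm` are hypothetical names. Deep learning frameworks ship their own versions (for example `torch.nn.utils.clip_grad_norm_` in PyTorch), which should be preferred in practice.

```python
import math
from typing import List, Tuple


def clip_by_global_norm(
    grads: List[List[float]], max_norm: float
) -> Tuple[List[List[float]], float]:
    """Rescale all gradients so their combined L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for layer in grads for g in layer))
    if total_norm <= max_norm:
        # Already within the limit: return an unchanged copy.
        return [list(layer) for layer in grads], total_norm
    scale = max_norm / total_norm
    return [[g * scale for g in layer] for layer in grads], total_norm


if __name__ == "__main__":
    grads = [[3.0, 4.0], [0.0]]  # global norm is 5
    clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
    print(norm_before, clipped)
```

Clipping by the global norm keeps the direction of the update and only limits its size, which is why it is often chosen over clipping each value independently.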



