HyperAI

Actor-critic Algorithm

Date

7 years ago

The Actor-Critic algorithm, also called the AC algorithm, is a reinforcement learning algorithm that combines a policy network (the actor) with a value function (the critic). The actor gives the probability of taking each action in each state, and the critic uses the reward and punishment signals from the outcomes to evaluate those actions and guide how the probabilities are adjusted.
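The interaction between the two parts can be written out as a worked update rule. The following is one common one-step formulation, not necessarily the variant this entry has in mind. The symbols are assumptions introduced here: \pi_\theta is the actor's policy with parameters \theta, V_w is the critic's value estimate with parameters w, \gamma is the discount factor, and \alpha_\theta and \alpha_w are learning rates.

```latex
\begin{aligned}
\delta_t &= r_{t+1} + \gamma V_w(s_{t+1}) - V_w(s_t)
  && \text{(temporal-difference error computed by the critic)} \\
w &\leftarrow w + \alpha_w \, \delta_t \, \nabla_w V_w(s_t)
  && \text{(critic update)} \\
\theta &\leftarrow \theta + \alpha_\theta \, \delta_t \, \nabla_\theta \log \pi_\theta(a_t \mid s_t)
  && \text{(actor update)}
\end{aligned}
```

A positive \delta_t means the outcome was better than the critic expected, so the probability of the chosen action is increased. A negative \delta_t decreases it.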

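The actor-critic algorithm uses two neural networks, one for the actor and one for the critic. Their parameters are updated at every step, which also works in continuous state spaces, and because consecutive updates are computed from consecutive samples, successive updates are correlated with one another. Compared with a traditional policy network trained alone, it learns more efficiently and performs better, but the critic's estimates are prone to bias and the algorithm generally reaches only a locally optimal solution.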
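The per-step update described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation. It uses lookup tables instead of two neural networks so that it stays self-contained, and the chain environment, the function name `run_actor_critic`, and all hyperparameter values are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def run_actor_critic(n_states=5, episodes=500, gamma=0.95,
                     alpha_actor=0.1, alpha_critic=0.2, seed=0):
    """One-step actor-critic on a hypothetical chain environment.

    States are 0..n_states-1. Action 0 moves left (state 0 is a wall),
    action 1 moves right. Reaching the last state gives reward 1 and
    ends the episode.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros((n_states, 2))  # actor: action preferences per state
    v = np.zeros(n_states)           # critic: estimated state values

    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap so every episode terminates
            probs = softmax(theta[s])
            a = rng.choice(2, p=probs)  # sample from the stochastic policy

            s_next = max(s - 1, 0) if a == 0 else s + 1
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0

            # Critic: temporal-difference error for this transition.
            target = r if done else r + gamma * v[s_next]
            delta = target - v[s]
            v[s] += alpha_critic * delta

            # Actor: gradient of log softmax is one_hot(a) - probs.
            grad_log = -probs
            grad_log[a] += 1.0
            theta[s] += alpha_actor * delta * grad_log

            if done:
                break
            s = s_next
    return theta, v
```

Both tables are updated after every single transition, and each update starts from the state the previous one ended in, which is the correlation between successive updates mentioned above.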

Advantages of the AC algorithm

  • Better convergence
  • Works better in high-dimensional and continuous action spaces
  • Can learn stochastic policies
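The points about continuous action spaces and stochastic policies can be illustrated with a Gaussian policy, which is one common choice and is an assumption here rather than something this entry specifies. The sketch uses a linear mean and a fixed standard deviation. The function names and the parameters `w`, `state`, and `sigma` are hypothetical.

```python
import numpy as np

def sample_action(w, state, sigma, rng):
    """Draw a continuous action from N(w . state, sigma^2)."""
    mu = float(w @ state)
    return rng.normal(mu, sigma)

def log_prob(w, state, action, sigma):
    """Log-density of the action under the Gaussian policy."""
    mu = float(w @ state)
    return (-0.5 * ((action - mu) / sigma) ** 2
            - np.log(sigma * np.sqrt(2.0 * np.pi)))

def grad_log_prob(w, state, action, sigma):
    """Gradient of log_prob with respect to w, used in the actor update."""
    mu = float(w @ state)
    return (action - mu) / sigma ** 2 * state
```

Because the policy outputs a distribution rather than a score for every possible action, the actor never has to enumerate or maximize over the action space.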

Disadvantages of the AC algorithm

  • Usually converges to a local optimum rather than the global optimum
  • Policy evaluation is inefficient and has high bias
