HyperAI

GANs Are Thousands of Times More Complex, but Safety Is the Most Important

6 years ago
Dao Wei

By Super Neuro

Generative Adversarial Networks (GANs) are key to the next stage in the development of deep learning, and they hold great promise for application in many fields.

But before GANs can flourish, two major obstacles must be overcome: hardware and frameworks.

What Is a GAN?

For the development of GANs, one possible strategy is to first capture the image and video markets and then expand into other areas. For example, simulated datasets could serve HPC (high-performance computing) applications.

However, it is still unknown when infrastructure and software will develop in step well enough to support more applications. Even so, the role and influence of GANs are already remarkable: they are sufficient for targeted work today and lay the groundwork for the next stage of AI.

Readers unfamiliar with GANs may wonder why we should study them when many mature ML methods already exist.

In fact, GANs go beyond simple recognition and classification: they generate new outputs based on reference samples, and the results can be extraordinary.

Functionally, GANs are closely related to other convolutional neural networks: the core computation of the discriminator resembles a basic image classifier, while the generator resembles a convolutional network that produces content instead of labels.

A GAN is composed of two deep learning networks: a generator and a discriminator. Both are existing concepts in ML; what makes GANs unique is the new way the two work together.

When working with images, the generator takes a dataset and tries to turn it into an image: it synthesizes an image from the data and passes it to the discriminator, which decides whether the image is "real" or "fake."

The generator learns the discriminator's weaknesses from its feedback, and through this game both networks improve. However, this approach makes the computation required for training far more complicated and introduces new difficulties of its own.
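The alternating game described above can be sketched in a few lines. The following toy example is purely illustrative, not any production implementation: it pits an affine generator G(z) = a·z + c against a logistic-regression discriminator D(x) = sigmoid(w·x + b) on 1-D Gaussian data, with the gradients written out by hand.

```python
import numpy as np

rng = np.random.default_rng(0)
a, c = 1.0, 0.0            # generator G(z) = a*z + c   (toy model)
w, b = 0.1, 0.0            # discriminator D(x) = sigmoid(w*x + b)
lr, batch = 0.02, 128
c_hist = []                # track the generator's offset over training

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-np.clip(s, -60.0, 60.0)))

for step in range(1000):
    real = rng.normal(3.0, 1.0, batch)   # "real" data: N(3, 1)
    z = rng.normal(0.0, 1.0, batch)
    fake = a * z + c

    # Discriminator step: descend -log D(real) - log(1 - D(fake)).
    p_real, p_fake = sigmoid(w * real + b), sigmoid(w * fake + b)
    w -= lr * (np.mean((p_real - 1.0) * real) + np.mean(p_fake * fake))
    b -= lr * (np.mean(p_real - 1.0) + np.mean(p_fake))

    # Generator step: descend the non-saturating loss -log D(G(z)).
    p_fake = sigmoid(w * fake + b)
    a -= lr * np.mean((p_fake - 1.0) * w * z)
    c -= lr * np.mean((p_fake - 1.0) * w)
    c_hist.append(c)

fake_mean = float(np.mean(a * rng.normal(0.0, 1.0, 10_000) + c))
```

The generator's offset drifts toward the real data's mean as it exploits the discriminator's feedback; real GANs replace both toy models with deep convolutional networks but follow the same alternating update.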

Difficulties with GANs

GANs perform impressively, but exploiting them fully is not easy. For example, they may suffer from mode collapse, where the generator settles on a narrow range of outputs, and the training and feedback process itself can be unstable.

Another common problem is that one network overwhelms the other. For example, the discriminator may become so accurate that it confidently rejects everything the generator produces; the generator then receives almost no useful gradient signal and cannot learn effectively.
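This starving-for-feedback effect can be made concrete with a little arithmetic. When the discriminator confidently rejects a fake, the original minimax generator loss log(1 - D(G(z))) saturates, while the widely used non-saturating alternative -log D(G(z)) keeps a strong gradient (the numbers below are an illustrative sketch, differentiating each loss with respect to the discriminator's logit):

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Suppose the discriminator confidently rejects a fake: its logit for the
# sample is s = -6, i.e. D(G(z)) = sigmoid(-6), roughly 0.0025.
s = -6.0
d = sigmoid(s)

# Gradient magnitudes with respect to the logit s:
grad_saturating = d            # |d/ds log(1 - sigmoid(s))| = sigmoid(s)
grad_nonsaturating = 1.0 - d   # |d/ds -log(sigmoid(s))|   = 1 - sigmoid(s)
```

The saturating gradient is nearly zero exactly when the generator most needs a learning signal, while the non-saturating form stays close to one, which is why the latter is the common choice in practice.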

Fortunately, this adversarial imbalance can be corrected during training, but the heavy hardware requirements are not so easy to address.

Training even a simple neural network requires considerable computing power, and GANs, which train two networks against each other, put far more pressure on the system, especially on memory.

It is difficult to complete this kind of work on a CPU-only machine, and even with GPUs we face the reality of limited resources.

“GANs require more computing power, and the infrastructure is catching up,” said Bryan Catanzaro, Nvidia’s vice president of applied ML. “You need more data throughput when using GANs because these models can be very large and have many parameters, so training requires a lot of computing power and memory.”

“Many GANs we train are memory-constrained, and even training models with a batch size of one or two will fill up the entire GPU memory because the models are usually so large.”
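Catanzaro's point about memory can be made concrete with a back-of-the-envelope estimate. The parameter count and optimizer layout below are illustrative assumptions, not figures from Nvidia: fp32 weights, gradients, and two Adam moment buffers already mean four copies of every parameter before any activation is stored.

```python
BYTES_PER_FP32 = 4
STATE_COPIES = 4   # weights + gradients + two Adam moment buffers (illustrative)

def training_state_gib(num_params: int) -> float:
    """GiB consumed by parameters and optimizer state alone (no activations)."""
    return num_params * BYTES_PER_FP32 * STATE_COPIES / 2**30

# A hypothetical 500M-parameter generator needs roughly 7.5 GiB before a
# single activation is stored, which is how a batch size of 1-2 can still
# fill a 16 GiB GPU once activations are added.
big_model_gib = training_state_gib(500_000_000)
```

Activation memory scales with batch size on top of this fixed cost, so for a large enough model even the smallest batch leaves little headroom.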

Good GANs need good saddles

Catanzaro added, “Building a larger system can be helpful when training, and it is also valuable to split batches across multiple GPUs. But this requires a powerful GPU-centric interconnect, such as NVLink, which is used for video GANs on DGX-1.”
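Splitting batches across GPUs works because, for a mean loss over equal-sized shards, averaging the per-device gradients reproduces the full-batch gradient exactly; a fast interconnect like NVLink exists to make that averaging (an all-reduce) cheap. A small NumPy sketch of the principle, simulating the "devices" as array shards:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 8))   # one full batch of 64 examples
y = rng.normal(size=64)
w = rng.normal(size=8)

def grad(Xs, ys, w):
    # Gradient of the mean squared error 0.5 * mean((Xs @ w - ys)**2)
    return Xs.T @ (Xs @ w - ys) / len(ys)

full = grad(X, y, w)           # full-batch gradient on one "device"

shards = 4                     # pretend we have 4 GPUs
per_device = [grad(Xp, yp, w)
              for Xp, yp in zip(np.split(X, shards), np.split(y, shards))]
all_reduced = np.mean(per_device, axis=0)   # the all-reduce step
```

The two gradients match to machine precision, which is why data parallelism changes where the memory and compute live without changing what is computed.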

In this regard, their work on interactive video generation for games demonstrates the excellent performance of GANs, which can dynamically generate environments in almost real time.

He also mentioned DGX-2, "Once it's ready, it will accelerate our work."

For Nvidia’s work on video synthesis using GANs, the problem of running large models on GPUs is particularly prominent.

“We care about graphics problems and are interested in using them to generate video games as a better way to create content, where you can easily create virtual worlds by training on videos of the real world.”

“But this process is also complicated, especially for video GANs, because it is not just about generating the current image, but also a series of related images. This requires better memory and computing performance.”

For example, we recently talked about the potential of GANs in drug discovery. We found that in addition to the adversarial network, reinforcement learning components and discriminator feedback are also required, which increases the requirements for infrastructure.

Drug startup Insilico Medicine has had some success fitting its models onto high-performance GPU clusters, but to go further it still needs more computing power, more memory, and better memory bandwidth.

The Future of GANs

“GANs of any size could be used in academic, technical, or enterprise contexts beyond image and video generation, but both hardware and software limitations need to be addressed before widespread use cases are possible, and that’s still early days,” Catanzaro said.

“There have been attempts to use GANs in other areas, such as text and audio applications, but the results have not been as good as for images and videos.”

This just goes to show that it’s hard to prove what works before you try it.

“GANs have been very successful in vision so far, which is why they have gained the upper hand in medical imaging,” Catanzaro added.

While it is hoped that more companies will explore a wide range of applications beyond images and videos in gaming or content generation, both the hardware and the software sides of the platform need to mature first.

There seem to be new ideas and advances in GAN research every day, but without applications that can run efficiently on today's hardware, much of that work goes unrewarded.

However, as can be seen from the development of AI, continuous optimization and adjustment may bring distant technologies into our field of vision in the short term.

It’s time to go GAN

Since GPUs are the dominant training platform, Nvidia is leading some of the groundbreaking work on GANs, yet training them remains challenging even on its best DGX systems.

It is not difficult to predict that in the future of graphics and gaming, Nvidia, with its strong capabilities, may change the rules of the game.

But seeing GPUs leap from consumer gaming devices to accelerators for supercomputers, perhaps the lesson is that we should not underestimate a technology under research just because, for now, it only delivers a good gaming experience.

All in all, in the new year, in addition to video and image creation, I hope to see the application of GANs in more fields.

But before using GANs, you may first need to equip yourself with a sufficient hardware environment. So, without further ado: go GAN! Good luck~