Evolution Strategies at the Hyperscale

Abstract

We introduce Evolution Guided General Optimization via Low-rank Learning (EGGROLL), an evolution strategies (ES) algorithm designed to scale backprop-free optimization to large population sizes for modern large neural network architectures with billions of parameters. ES is a set of powerful blackbox optimisation methods that can handle non-differentiable or noisy objectives with excellent scaling potential through parallelisation. Naïve ES becomes prohibitively expensive at scale due to the computational and memory costs associated with generating matrix perturbations $E \in \mathbb{R}^{m \times n}$ and the batched matrix multiplications needed to compute per-member forward passes. EGGROLL overcomes these bottlenecks by generating random matrices $A \in \mathbb{R}^{m \times r}$, $B \in \mathbb{R}^{n \times r}$ with $r \ll \min(m, n)$ to form a low-rank matrix perturbation $A B^\top$ that is used in place of the full-rank perturbation $E$. As the overall update is an average across a population of $N$ workers, this still results in a high-rank update but with significant memory and computation savings, reducing the auxiliary storage from $mn$ to $r(m+n)$ per layer and the cost of a forward pass from $\mathcal{O}(mn)$ to $\mathcal{O}(r(m+n))$ when compared to full-rank ES. A theoretical analysis reveals that our low-rank update converges to the full-rank update at a fast $\mathcal{O}(1/r)$ rate. Our experiments show that (1) EGGROLL does not compromise the performance of ES in tabula-rasa RL settings, despite being faster, (2) it is competitive with GRPO as a technique for improving LLM reasoning, and (3) EGGROLL enables stable pre-training of nonlinear recurrent language models that operate purely in integer datatypes.
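The low-rank perturbation idea in the abstract can be illustrated with a minimal NumPy sketch for a single weight matrix. This is not the authors' implementation: the function names `lowrank_forward` and `es_step`, the `1/sqrt(r)` scaling of the perturbation, the fitness standardisation, and all hyperparameter defaults are assumptions chosen for illustration. The sketch shows two things: a perturbed forward pass that never materialises the full m-by-n perturbation, and an update that averages low-rank terms across the population and so generally has rank above r.

```python
import numpy as np


def lowrank_forward(W, A, B, x, sigma):
    """Apply (W + sigma/sqrt(r) * A @ B.T) to x without forming A @ B.T.

    W is (m, n), A is (m, r), B is (n, r), x is (n,).
    The perturbation term costs O(r(m + n)) instead of O(mn).
    """
    r = A.shape[1]
    return W @ x + (sigma / np.sqrt(r)) * (A @ (B.T @ x))


def es_step(W, fitness_fn, rng, pop_size=64, rank=1, sigma=0.1, lr=0.05):
    """One ES update with low-rank perturbations (illustrative assumptions only).

    fitness_fn receives a callable forward(x) for one population member
    and returns a scalar score (higher is better).
    """
    m, n = W.shape
    factors = []
    scores = np.empty(pop_size)
    for i in range(pop_size):
        # Each member stores r(m + n) random numbers rather than mn.
        A = rng.standard_normal((m, rank))
        B = rng.standard_normal((n, rank))
        factors.append((A, B))
        scores[i] = fitness_fn(
            lambda x, A=A, B=B: lowrank_forward(W, A, B, x, sigma)
        )

    # Standardise fitness (a common ES heuristic, assumed here).
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)

    # Fitness-weighted sum of low-rank perturbations; the sum of many
    # rank-r terms is generally of higher rank than r.
    update = np.zeros_like(W)
    for (A, B), s in zip(factors, scores):
        update += s * (A @ B.T) / np.sqrt(rank)

    return W + lr / (pop_size * sigma) * update
```

A toy usage, under the same assumptions, is to pass a fitness function such as `lambda forward: -np.sum((forward(x) - y) ** 2)` for fixed vectors `x` and `y`, then call `es_step` repeatedly with a generator from `np.random.default_rng`. The dense update is formed here only for readability; a large-scale implementation would avoid even that.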
