LRZ Adopts Nvidia Engines For €250 Million “Blue Lion” Supercomputer In 2027

The Leibniz-Rechenzentrum (LRZ), a leading supercomputing center in Garching, Germany, has announced plans to build a €250 million hybrid CPU-GPU supercomputer called "Blue Lion", due in 2027. The machine, which Hewlett Packard Enterprise (HPE) will build on a future Cray EX design, is expected to deliver 7 exaflops of computing performance, a significant leap in both high-performance computing (HPC) and artificial intelligence (AI) capability for Europe.

### Key Elements of the Blue Lion Project:

- **Budget and Funding**: The total budget for Blue Lion, including operating costs from 2027 to 2032, is €250 million. The Federal Ministry of Education and Research (BMBF) and the Bavarian State Ministry of Science and the Arts are each contributing half of the funding.
- **Computing Power**: Blue Lion is slated to have approximately 30 times the computing power of LRZ’s current flagship system, SuperMUC-NG. The quoted figure is 7 exaflops, though the numerical precision at which it is measured has not been stated.
- **Architecture and Components**: The supercomputer will use future CPUs and GPUs from Nvidia, likely the "Vera" Arm server CPU and the "Rubin" GPU accelerator. Nodes will be linked by HPE’s Slingshot 400 interconnect, which runs at 400 Gb/sec.
- **Cooling System**: To manage the high energy consumption and heat generation, Blue Lion will use 100 percent direct liquid cooling with water at 40 degrees Celsius. The captured waste heat will also be used to heat the LRZ offices and other nearby organizations, which improves the overall energy efficiency of the installation.

### Performance Estimates:

- **CPU and GPU Configuration**: Initial estimates suggest that Blue Lion will have around 2,200 nodes, each equipped with one Vera CPU and two Rubin GPUs. Assuming the Vera CPU has 144 cores and 768 GB of LPDDR5 main memory, the system will have a similar number of CPU cores to SuperMUC-NG Phase 1 while using far fewer physical nodes. The lower node count should mean fewer parts and improved efficiency.
- **Memory and Performance**: The aggregate memory capacity across the CPU nodes is expected to rise by 4.6 times, and the peak FP64 performance on the tensor cores in the Rubin GPUs is estimated to increase by 26.5 times. However, High Performance LINPACK (HPL) benchmark performance might increase by only 23.8 times because of the shift to a GPU-accelerated architecture. Performance on HPCG, a more demanding test, is projected to increase by about 30.5 times.
- **Exaflops Calculation**: The 7 exaflops figure mentioned by the Bavarian prime minister is likely not an FP64 number. Given the performance of current Nvidia GPUs and the projected improvements for Rubin, the figure could be reached at lower precision, such as FP16 with sparse matrix operations. Without more detailed information, these estimates cannot be confirmed.

### Historical Context and Strategic Shift:

- **LRZ’s Past**: LRZ has traditionally been an Intel shop, using x86 processors in its supercomputers since 2003. The lab’s SuperMUC-NG Phase 2, installed in 2023, included Intel’s "Ponte Vecchio" GPU Max 1550 accelerators, its first significant foray into GPU acceleration.
- **New Alliances**: The move to Nvidia’s Arm-based CPUs and its GPUs is a strategic shift for LRZ and signals the end of the Intel era in its flagship systems. Several factors could have influenced the change, including the performance and efficiency of the new architecture and the potential geopolitical risk of relying on Lenovo, which has both American and Chinese ownership.
- **HPE’s Role**: HPE, which has never before won a major deal with LRZ for a flagship machine, will manufacture Blue Lion. The choice might reflect the strength of HPE’s Cray EX design and its advanced interconnect, Slingshot 400.

### Implications and Future Outlook:

- **Competition and Pricing**: Growing competition in the AI and HPC markets, driven by hyperscalers and cloud builders developing their own accelerators, is likely to affect pricing and performance. Nvidia, which currently commands a premium, may need to lower its prices to remain competitive, which would give HPC centers like LRZ better price/performance.
- **Environmental Impact**: Direct liquid cooling and the reuse of waste heat for office heating align with LRZ’s commitment to environmental sustainability and could serve as a model for future supercomputing centers.
- **Research and Innovation**: Blue Lion will let LRZ tackle more complex simulations and larger AI models, potentially leading to breakthroughs in a range of scientific and technological fields. The hybrid CPU-GPU architecture is designed to support both traditional HPC workloads and emerging AI applications.

### Conclusion:

The Blue Lion project is a significant milestone in the evolution of European supercomputing. With its hybrid architecture, direct liquid cooling, and substantial funding, Blue Lion is expected to be roughly 30 times as powerful as LRZ’s current flagship. The shift away from Intel and towards Nvidia’s Arm-based technology marks a new era in LRZ’s supercomputing strategy, driven by the need for performance and efficiency and possibly by geopolitical considerations. As details of the project emerge, the impact of Blue Lion on the HPC and AI landscape will become clearer.
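
The configuration estimates quoted in the performance section can be cross-checked with simple arithmetic. The following is a minimal Python sketch that uses only the article's figures: the node count, the per-node CPU and GPU counts, the 144 cores and 768 GB per Vera CPU, and the 7 exaflops total. All of these are estimates or assumptions rather than confirmed specifications, and the function and key names are illustrative only.

```python
def blue_lion_estimates(
    nodes=2_200,           # article's estimated node count (not confirmed)
    cpus_per_node=1,       # one Vera CPU per node (estimate)
    gpus_per_node=2,       # two Rubin GPUs per node (estimate)
    cores_per_cpu=144,     # assumed Vera core count
    mem_per_cpu_gb=768,    # assumed LPDDR5 capacity per Vera CPU
    quoted_exaflops=7.0,   # headline figure, precision unspecified
):
    """Derive system-wide totals from the per-node estimates."""
    total_cpus = nodes * cpus_per_node
    total_gpus = nodes * gpus_per_node
    return {
        "total_gpus": total_gpus,
        "total_cores": total_cpus * cores_per_cpu,
        # Decimal petabytes: 1 PB = 1,000,000 GB.
        "cpu_memory_pb": total_cpus * mem_per_cpu_gb / 1e6,
        # 1 exaflops = 1,000 petaflops, spread evenly over the GPUs.
        "pflops_per_gpu": quoted_exaflops * 1_000 / total_gpus,
    }


if __name__ == "__main__":
    est = blue_lion_estimates()
    print(f"GPUs:            {est['total_gpus']:,}")
    print(f"CPU cores:       {est['total_cores']:,}")
    print(f"CPU memory:      {est['cpu_memory_pb']:.2f} PB")
    print(f"Implied per GPU: {est['pflops_per_gpu']:.2f} petaflops")
```

Under these assumptions the system would have 4,400 GPUs, 316,800 CPU cores, and about 1.69 PB of CPU memory. If the whole 7 exaflops is attributed to the GPUs, each one would have to deliver roughly 1.59 petaflops. That is far above the tens of teraflops of FP64 that current Nvidia GPUs provide, which is consistent with the article's reading that the headline number refers to a lower precision.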
