NVIDIA Unveils 18 BMC Vulnerabilities, Highlighting Critical Security Risks in Data Centers
Modern data centers rely heavily on Baseboard Management Controllers (BMCs) for remote management. BMCs let administrators monitor hardware, reconfigure system firmware, and recover machines even when the host is powered down. These same capabilities make BMCs a significant security risk: a compromised BMC can give an attacker persistent, stealthy access to an entire fleet of devices, often undetected by traditional security monitoring.
The NVIDIA Offensive Security Research (OSR) team recently analyzed BMC firmware commonly used in data center environments. The team identified 18 vulnerabilities and developed nine working exploits. The vulnerabilities range from credential-handling flaws to memory corruption bugs, and they expose BMCs to attacks that can lead to full system control and data theft.
One of the primary issues is a hash-leak vulnerability in the Intelligent Platform Management Interface (IPMI) authentication process, similar to one identified in 2013 (CVE-2013-4786). Valid usernames can be revealed through a timing side channel, because the BMC's use of memcmp during authentication leaks how many leading characters of a guess are correct. Once a valid username is known, an attacker can brute-force the password offline using the leaked hash and standard word lists.
The firmware image also revealed that the user database is managed by Redis, with passwords encrypted but stored alongside their decryption keys. The team found multiple APIs that let them query and manipulate this database, decrypt the passwords, and obtain the complete user database. Another API provided read/write access to the virtual memory of the IPMI server process, which was easier to exploit because the process lacked Address Space Layout Randomization (ASLR).
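The timing side channel described above comes from comparison routines that stop at the first mismatching byte. The sketch below is an illustration only, not the BMC firmware's code: early_exit_compare is a hypothetical stand-in for a memcmp-style check, and it counts the bytes it examined so the leak is visible without measuring time. Python's standard hmac.compare_digest is shown as a comparison designed to avoid that content-based short circuit.

```python
import hmac
from typing import Tuple


def early_exit_compare(secret: bytes, guess: bytes) -> Tuple[bool, int]:
    """Hypothetical memcmp-style check that stops at the first mismatch.

    Returns (match, bytes_examined). The second value stands in for elapsed
    time: the more leading bytes of the guess are right, the longer it runs.
    """
    examined = 0
    for s, g in zip(secret, guess):
        examined += 1
        if s != g:
            return False, examined
    return len(secret) == len(guess), examined


def constant_time_compare(secret: bytes, guess: bytes) -> bool:
    """Comparison designed not to reveal where the first mismatch occurs."""
    return hmac.compare_digest(secret, guess)


if __name__ == "__main__":
    secret = b"admin"  # illustrative value, not taken from the research
    for guess in (b"xxxxx", b"adxxx", b"admix", b"admin"):
        match, work = early_exit_compare(secret, guess)
        print(guess, match, work, constant_time_compare(secret, guess))
```

In the early-exit version, the amount of work grows with the length of the correct prefix, which is what lets an attacker recover a value one character at a time. Note that hmac.compare_digest may still reveal information about input length when the two inputs differ in size.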
Read/write access to the IPMI server process memory allowed the team to locate and toggle a hidden configuration flag, which enabled file download features and exposed sensitive files such as /etc/shadow.
With full control over the BMC, the team explored its interactions with the host system. They used the BMC's KVM functionality to modify bootloader parameters and gain shell access to the host operating system without user credentials. Secure Boot, which is designed to prevent unauthorized bootloaders from running, was not enabled in this instance. Even if it had been, the team found that UEFI settings could be changed through the BMC, potentially disabling Secure Boot. Similarly, an exposed API allowed the BMC to read and write the host's SPI flash directly, modifying Non-Volatile RAM (NVRAM) entries and further compromising the host.
The team also discovered a pre-authentication stack-based buffer overflow in the BMC's shared telemetry library. The code used strcpy to copy unvalidated input into a fixed-size buffer and then called a function pointer, which provided a direct path to code execution. The absence of modern memory protections, such as stack canaries and non-executable stacks, made exploitation straightforward. The team developed an exploit that fully hijacked control flow and executed injected shellcode during login attempts.
To address these vulnerabilities, the NVIDIA OSR team worked closely with American Megatrends Inc. (AMI), the vendor responsible for the affected BMC firmware. The team provided detailed technical reports, which facilitated rapid patching and coordination of fixes across the vendor's customer base. NVIDIA also issued its own Common Vulnerabilities and Exposures (CVE) entries to expedite awareness and remediation, so that affected NVIDIA customers could act immediately.
Security teams must prioritize BMCs in their security models, treating them as privileged systems with deep control over hardware.
Key steps include:
Implementing Strong Authentication: Ensure that all BMCs use strong, unique passwords, and enable multi-factor authentication (MFA).
Regular Patching and Updates: Keep BMC firmware up to date with the latest security patches.
Enabling Security Features: Activate features such as ASLR, stack canaries, and Secure Boot to block common attack vectors.
Monitoring BMC Traffic: Regularly monitor and audit BMC network traffic for suspicious activity.
Securing Management Interfaces: Use encrypted connections for BMC management interfaces and limit access to trusted networks.
Industry experts praise NVIDIA's proactive approach to security research and its collaboration with vendors. By identifying and disclosing critical vulnerabilities, NVIDIA helps improve the overall security posture of data center infrastructure. The company's commitment to deep technical security assessments and continuous improvement sets a high standard for the technology sector.
Understanding and securing BMCs is crucial for maintaining the integrity and resilience of modern data centers. The full paper, titled "Breaking BMC: The Forgotten Key to the Kingdom," is available for readers who want to delve deeper into this research. For additional insights into securing other layers of the stack, consider the latest NVIDIA GTC conference sessions on advanced security topics.
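The recommendation above to limit BMC management interfaces to trusted networks can be checked against an inventory. The following is a minimal sketch using Python's standard ipaddress module. The function name, the inventory addresses, and the management subnet are all hypothetical and stand in for whatever a given environment records.

```python
import ipaddress
from typing import Iterable, List


def bmcs_outside_trusted_networks(
    bmc_addresses: Iterable[str],
    trusted_networks: Iterable[str],
) -> List[str]:
    """Return the BMC addresses that fall outside every trusted network."""
    networks = [ipaddress.ip_network(n) for n in trusted_networks]
    exposed = []
    for addr in bmc_addresses:
        ip = ipaddress.ip_address(addr)
        # Flag the address if no trusted network contains it.
        if not any(ip in net for net in networks):
            exposed.append(addr)
    return exposed


if __name__ == "__main__":
    # Hypothetical inventory; 10.20.0.0/24 plays the role of the management VLAN.
    inventory = ["10.20.0.11", "10.20.0.12", "192.0.2.40"]
    print(bmcs_outside_trusted_networks(inventory, ["10.20.0.0/24"]))
```

A check like this only confirms where BMC addresses sit. It does not verify that firewall rules actually block access from other networks.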