
AI Transparency Declines in 2025, with IBM Leading Amid Industry-Wide Drop

A new report from Stanford University and collaborating institutions, the 2025 Foundation Model Transparency Index (FMTI), reveals a concerning trend: despite ongoing improvements in model performance, the overall transparency of foundation models has declined significantly. IBM leads the rankings with the highest transparency score of 95, while xAI and Midjourney sit at the bottom with just 14 points. Now in its third annual edition, the report serves as a comprehensive “health check” for major AI developers, evaluating their openness in areas such as data sourcing, model training, and post-deployment impact.

The average transparency score across all evaluated models dropped from 58 in 2024 to 40 in 2025, nearly matching the 2023 baseline. This marks a clear regression despite earlier signs of progress. The report highlights that companies are increasingly withholding information on key aspects: training data composition, computational resources used, and real-world model behavior after deployment. While some areas, such as model capabilities and risk assessments, have seen more disclosure, critical dimensions such as methodological transparency, third-party validation, reproducibility, and training-test data overlap remain poorly addressed.

The 2025 assessment covers 13 companies, among them new entrants like DeepSeek and Alibaba as well as established players such as OpenAI, IBM, and xAI. This year’s index also introduces new metrics focused on data access, data usage practices, and ongoing monitoring of model impact, offering a more holistic view of transparency.

Notably, certain company characteristics correlate with higher scores. Firms that are open-source focused, operate in B2B markets, proactively publish transparency reports, or have signed the EU’s General-Purpose AI Code of Practice tend to perform better. In contrast, companies that offer closed, API-only access with minimal public documentation, like xAI and Midjourney, scored the lowest.
But does higher transparency mean a better model? According to Professor Han Qiu of Tsinghua University, transparency in this context is not equivalent to open-source status; it is a quantified evaluation against specific criteria. A score of zero in a category like “data usage” or “post-deployment monitoring” does not mean no information is shared; it means the required data or methodology was not disclosed according to the index’s defined standards.

This distinction is crucial: transparency is not a direct proxy for model performance, security, or safety. For instance, IBM’s Granite 3.3, which scored highest in transparency, is not among the top performers in real-world benchmarks. This raises a practical question: if you need to deploy a model for a specific task, would you choose a high-transparency model like Granite 3.3, or a more powerful but less transparent one like Qwen3 or Claude? The answer depends on the use case and on whether safety, auditability, or raw performance is prioritized.

Qiu emphasizes that transparency is foundational to safety and accountability. Without it, risks such as copyright violations, privacy breaches, and embedded biases become harder to detect and verify. Worse, when serious incidents occur, they may trigger overly broad regulatory responses that stifle innovation.

The report concludes that while full disclosure may not always be feasible, the ultimate goal should be a system in which models are not required to be fully open, but their behavior is measurable, their claims are verifiable, and their safety is ensured. As AI continues to integrate into critical sectors, building trust requires not just technical rigor but clear accountability frameworks. Ultimately, the decline in transparency signals a growing need for stronger incentives, both market-driven and policy-supported, to encourage responsible disclosure.
The FMTI serves as both a warning and a roadmap: transparency is not just a moral imperative, but a practical necessity for sustainable and trustworthy AI development.