AI Paper Weekly Report | Generalist Agent Development / Object Detection / Open-Source Physics Reasoning Models... Get a Glimpse Into the Latest AI Developments in One Article

In recent years, the development of Large Language Models (LLMs) has propelled the research frontier from puzzle-solving tasks to scientific reasoning—that is, the ability to handle complex problems where answers must be tested against natural laws, not just scoring criteria. Physics is the most rigorous measure of this shift because it fundamentally connects symbolic systems to the real world and is the cornerstone of most modern technologies.
Against this backdrop, a research team from the Shanghai Artificial Intelligence Laboratory has developed large language models with outstanding physics reasoning capabilities that excel at Olympiad-level problems. The researchers propose P1, a series of open-source physics reasoning models trained entirely through reinforcement learning (RL). Among them, P1-235B-A22B is the first open-source model to achieve gold-medal-level performance at the 2025 International Physics Olympiad (IPhO 2025), and it won 12 gold medals across 13 international and regional physics competitions held in 2024 and 2025.
Paper link: https://go.hyper.ai/NxT8f
Latest AI Papers: https://go.hyper.ai/hzChC
To help more users keep up with the latest academic developments in artificial intelligence, HyperAI's official website (hyper.ai) has launched a "Latest Papers" section, which is updated daily with cutting-edge AI research papers. Here are 5 popular AI papers we recommend. Let’s take a quick look at this week’s cutting-edge AI achievements ⬇️
This Week's Paper Recommendations
1. Lumine: An Open Recipe for Building Generalist Agents in 3D Open World
This paper proposes Lumine, the first open recipe for building generalist agents that can carry out complex, hours-long tasks in real time in challenging 3D open-world environments. Lumine adopts a human-like interaction paradigm, unifying perception, reasoning, and action end to end through a vision-language model. It processes raw pixel input at 5 Hz, generates precise keyboard and mouse actions at 30 Hz, and invokes its reasoning module only when necessary.
Paper link: https://go.hyper.ai/wfGhN
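To make the two rates concrete: at 5 Hz perception and 30 Hz control, each observation has to cover 30 / 5 = 6 action slots. The sketch below only works through that arithmetic. The function names are hypothetical, and the assumption that each observation drives one fixed chunk of consecutive actions is ours for illustration, not a description of Lumine's actual implementation.

```python
from fractions import Fraction

PERCEPTION_HZ = 5   # pixel-input rate quoted in the summary
ACTION_HZ = 30      # keyboard/mouse action rate quoted in the summary


def actions_per_observation(perception_hz: int, action_hz: int) -> int:
    """Number of action slots that fall between two consecutive observations."""
    if perception_hz <= 0 or action_hz % perception_hz != 0:
        raise ValueError("action rate must be a positive multiple of the perception rate")
    return action_hz // perception_hz


def action_schedule(seconds: int,
                    perception_hz: int = PERCEPTION_HZ,
                    action_hz: int = ACTION_HZ):
    """Return (time_in_seconds, observation_index) for every action slot.

    Assumption (ours): each observation drives one fixed-size chunk of actions.
    """
    chunk = actions_per_observation(perception_hz, action_hz)
    return [(Fraction(i, action_hz), i // chunk) for i in range(seconds * action_hz)]


if __name__ == "__main__":
    print(actions_per_observation(PERCEPTION_HZ, ACTION_HZ))
    for t, obs in action_schedule(1)[:7]:
        print(f"t={float(t):.3f}s uses observation {obs}")
```

Under this assumption, the seventh action slot (at 0.2 s) is the first one driven by the second observation.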

2. YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception
This paper proposes YOLOv13, an accurate and lightweight object detector. The researchers also propose a hypergraph-based adaptive correlation enhancement mechanism (HyperACE). Building on hypergraph computation, it adaptively mines latent higher-order correlations, overcoming the limitation of previous methods, which model only pairwise correlations. The mechanism achieves efficient global cross-location and cross-scale feature fusion and enhancement.
Paper link: https://go.hyper.ai/cKMGI
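The difference between pairwise and higher-order correlation can be illustrated with a toy example: an ordinary graph edge links exactly two nodes, while a hyperedge can link any number of nodes at once. The sketch below is a generic two-step hypergraph aggregation (nodes to hyperedges, then hyperedges back to nodes) over scalar features. It is not HyperACE and not code from the paper; the function name, the mean aggregation, and the fixed hyperedges are assumptions chosen for illustration.

```python
def hypergraph_aggregate(features, hyperedges):
    """Generic hypergraph message passing over scalar node features.

    features:   list of floats, one per node
    hyperedges: list of non-empty lists of node indices; each hyperedge
                may join more than two nodes (a higher-order relation)
    """
    # Step 1: each hyperedge takes the mean of its member nodes.
    edge_feats = [sum(features[v] for v in edge) / len(edge) for edge in hyperedges]

    # Step 2: each node takes the mean of the hyperedges it belongs to.
    # Nodes outside every hyperedge keep their original feature.
    out = []
    for v, original in enumerate(features):
        incident = [edge_feats[i] for i, edge in enumerate(hyperedges) if v in edge]
        out.append(sum(incident) / len(incident) if incident else original)
    return out


if __name__ == "__main__":
    # One hyperedge joins nodes 0, 1 and 2 together; node 3 is isolated.
    print(hypergraph_aggregate([1.0, 2.0, 3.0, 10.0], [[0, 1, 2]]))
```

Here a single hyperedge lets three nodes exchange information in one step, which a pairwise graph would need three separate edges to express.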

3. Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions
This paper presents FIBO, the first open-source text-to-image model trained on long structured captions, in which every training sample is annotated with the same set of fine-grained attributes. This design significantly expands expressive power and enables disentangled control over visual factors. To handle long captions efficiently, the researchers propose DimFusion, a fusion mechanism that integrates intermediate tokens from a lightweight large language model (LLM) without increasing token length.
Paper link: https://go.hyper.ai/zyUcE

4. Depth Anything 3: Recovering the Visual Space from Any Views
This paper proposes Depth Anything 3 (DA3), a model that predicts spatially consistent geometry from any number of visual inputs, with or without known camera poses. The researchers constructed a new visual geometry benchmark covering camera pose estimation, any-view geometry reconstruction, and visual rendering. On this benchmark, DA3 sets a new state of the art across all tasks, improving on the previous state-of-the-art method, VGGT, by an average of 44.3% in camera pose estimation accuracy and 25.1% in geometry reconstruction accuracy.
Paper link: https://go.hyper.ai/WvSU4

5. P1: Mastering Physics Olympiads with Reinforcement Learning
This paper advances physics research by developing large language models with superior physics reasoning capabilities that excel at Olympiad-level problems. The researchers propose P1, a series of open-source physics reasoning models trained entirely through reinforcement learning (RL).
Paper link: https://go.hyper.ai/NxT8f

That wraps up this week’s paper recommendations. For more cutting-edge AI research papers, please visit the “Latest Papers” section of hyper.ai’s official website.
We also welcome research teams to submit high-quality results and papers to us. If you are interested, add NeuroStar on WeChat (WeChat ID: Hyperai01).
See you next week!