HyperAI

### Recent Research Highlights from the Institute for AI Industry Research (AIR) at Tsinghua University

Since its establishment, the Institute for AI Industry Research (AIR) at Tsinghua University has focused on three core research areas: Smart Transportation, Smart Healthcare, and Smart Internet of Things (IoT). The institute has published 88 papers in leading international journals and conferences such as PNAS, CVPR, NeurIPS, ICLR, and MobiSys. These publications have earned several awards, including the MobiSys 2021 Best Paper Award, a CVPR 2021 Best Student Paper Nomination, and the AAAI-IAAI 2022 Innovative Application of Artificial Intelligence Award.

#### Smart Transportation

1. **DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection**
   - **Authors:** Haibao Yu, Yizhen Luo, Mao Shu, Yiyi Huo, Zebang Yang, Yifeng Shi, Zhenglong Guo, Hanyu Li, Xing Hu, Jirui Yuan, Zaiqing Nie
   - **Affiliations:** AIR; Baidu; Department of Computer Science, Tsinghua University; University of Chinese Academy of Sciences
   - **Conference:** CVPR 2022
   - **Summary:** Autonomous driving systems face significant safety challenges due to blind spots and unstable long-range perception. To address these issues, the team released DAIR-V2X, a large-scale, multi-view, multi-modal dataset of 71,254 frames with 3D annotations. Collected from real-world scenarios, the dataset is designed to promote data-driven vehicle-infrastructure cooperative autonomous driving. It includes both vehicle and infrastructure views along with fused annotations, and defines a new 3D detection task, VIC3D Object Detection, which aims to improve 3D detection accuracy while minimizing communication bandwidth. The dataset is available for download, and the team plans to release the benchmark implementation code soon. This work was supported by the Beijing Advanced Autonomous Driving Demonstration Zone, Baidu Apollo, and the Beijing Academy of Artificial Intelligence.

2. **Constraints Penalized Q-Learning for Safe Offline Reinforcement Learning**
   - **Authors:** Haoran Xu, Xianyuan Zhan (Corresponding Author), Xiangyu Zhu
   - **Affiliations:** JD Technology; AIR; Xidian University
   - **Conference:** AAAI 2022
   - **Summary:** Offline reinforcement learning (RL) learns policies directly from historical data without interacting with the real environment, which makes it attractive for real-world applications. However, ensuring safety while maximizing reward remains a significant challenge. The team introduces CPQ, a Q-learning algorithm that penalizes actions violating safety constraints. CPQ adds an extra loss term to the risk Q-function to raise the estimated risk of out-of-distribution actions, and uses an indicator function to reduce the value Q-function of unsafe actions. Theoretical analysis proves CPQ's convergence and its robustness to changes in safety constraints. Empirical results show that CPQ outperforms existing baselines in reward maximization and training stability.

3. **Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing**
   - **Authors:** Xiaoxue Chen, Tianyu Liu, Hao Zhao, Guyue Zhou, Ya-Qin Zhang
   - **Affiliations:** AIR; Hong Kong University of Science and Technology; Peking University; Intel Labs
   - **Conference:** CVPR 2022
   - **Summary:** Multi-task indoor scene understanding is an important problem in computer vision. The team proposes Cerberus Transformer, an attention-based architecture that jointly performs semantic, affordance, and attribute parsing. The model captures long-range dependencies, learns from weakly aligned data, and balances the sub-tasks during training. Cerberus achieves state-of-the-art performance on all three tasks and remains robust with only 0.1%-1% of the annotated data. The code and models are available on GitHub.

4. **PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds**
   - **Authors:** Xiaoxue Chen, Hao Zhao, Guyue Zhou, Ya-Qin Zhang
   - **Affiliations:** AIR; Peking University; Intel Labs
   - **Conference:** RA-L+ICRA 2022
   - **Summary:** Point cloud-based 3D scene understanding is essential for many robotic applications. The team introduces PQ-Transformer, the first Transformer network to simultaneously predict 3D objects and room layouts from point cloud inputs. The model parameterizes room layouts as quadrilaterals and adds a physical constraint loss that discourages intersections between objects and layouts. Experiments on the ScanNet dataset show significant improvements in F1-score and overall accuracy. The code and models are available on GitHub.

#### Smart Healthcare

1. **Deep Learning Guided Optimization of Human Antibody Against SARS-CoV-2 Variants with Broad Neutralization**
   - **Authors:** Sisi Shan, Shutong Luo, Ziqing Yang, Junxian Hong, Yufeng Su, Fan Ding, Lili Fu, Chenyu Li, Peng Chen, Jianzhu Ma, Xuanling Shi, Qi Zhang, Bonnie Berger, Linqi Zhang, Jian Peng
   - **Affiliations:** School of Medicine, Tsinghua University; HuaShenZhiYao Biotechnology (Beijing) Co., Ltd.; University of Illinois at Urbana-Champaign; Massachusetts Institute of Technology; AIR
   - **Journal:** PNAS
   - **Summary:** Viral mutations can escape immune system attacks, which makes developing broadly neutralizing antibodies difficult. The team uses a geometric deep learning algorithm to optimize the human antibody P36-5D2, which neutralizes the Alpha, Beta, and Gamma variants but not Delta. The optimized antibody's affinity to multiple variants, including Delta, increases 10- to 600-fold. The algorithm can identify changes to the antibody's complementarity-determining regions (CDRs) that mitigate the impact of viral mutations. The optimized antibodies have potential for use in therapeutics against current SARS-CoV-2 variants.

2. **Contribution-Aware Federated Learning for Smart Healthcare**
   - **Authors:** Zelei Liu, Yuanyuan Chen, Yansong Zhao, Han Yu, Yang Liu, Renyi Bao, Jinping Jiang, Zaiqing Nie, Qian Xu, Qiang Yang
   - **Affiliations:** Nanyang Technological University; AIR; Yidu Cloud; WeBank
   - **Conference:** AAAI-IAAI 2022
   - **Award:** AAAI-IAAI 2022 Innovative Application of Artificial Intelligence Award
   - **Summary:** The team proposes a contribution-aware federated learning framework for smart healthcare and validates it in real-world scenarios with Yidu Cloud. The framework fairly evaluates each participant's contribution to model performance without exposing private data, and improves the training protocol by assigning the best intermediate models to participants. The method speeds up the analysis by 2.84 times and improves accuracy by 2.62%.

3. **Equivariant Graph Mechanics Networks with Constraints**
   - **Authors:** Wenbing Huang, Jiaqi Han, Yu Rong, Tingyang Xu, Fuchun Sun, Junzhou Huang
   - **Affiliations:** AIR; Department of Computer Science, Tsinghua University; Tencent AI Lab; University of Texas at Arlington
   - **Conference:** ICLR 2022
   - **Summary:** Modeling multi-body interactions and their dynamics is crucial in many scientific fields, from molecular dynamics to robotics. The team introduces Graph Mechanics Networks (GMN), a graph neural network that is equivariant to translation, rotation, and reflection and that satisfies rigid-body constraints. GMN outperforms other methods in predicting the evolution of simulated physical systems and is effective in real-world applications such as molecular dynamics and human skeleton trajectory prediction. The code is available on GitHub.

4. **Uncertainty Calibration for Ensemble-Based Debiasing Methods**
   - **Authors:** Ruibin Xiong, Yimeng Chen, Liang Pang, Xueqi Cheng, Zhiming Ma, Yanyan Lan
   - **Affiliations:** Institute of Computing Technology, Chinese Academy of Sciences; Baidu; Academy of Mathematics and Systems Science, Chinese Academy of Sciences; AIR
   - **Conference:** NeurIPS 2021
   - **Summary:** Dataset bias can harm a model's generalization to out-of-distribution data. Ensemble-based debiasing (EBD) methods can mitigate this issue, but they often rely on inaccurate uncertainty estimates from bias models. The team proposes MoCaD, a three-stage debiasing framework that calibrates bias models, reducing catastrophic forgetting and improving performance on natural language inference and fact verification tasks. The method is robust to changes in dataset bias and outperforms existing EBD methods.

#### Smart IoT

1. **nn-Meter: Towards Accurate Latency Prediction of Deep-Learning Model Inference on Diverse Edge Devices**
   - **Authors:** Li Zhang, Shihao Han, Jianyu Wei, Ningxin Zheng, Ting Cao, Yuqing Yang, Yunxin Liu
   - **Affiliations:** Microsoft Research Asia; Rose-Hulman Institute of Technology; University of Science and Technology of China; AIR
   - **Conference:** MobiSys 2021
   - **Award:** Best Paper Award; Artifact Evaluation (all three highest badges)
   - **Summary:** Inference latency is a critical metric for running deep neural network (DNN) models on edge devices. nn-Meter predicts DNN inference latency by dividing the model into kernels and making kernel-level predictions. The system is built on two key techniques: kernel detection and adaptive sampling. nn-Meter outperforms existing methods on three edge hardware platforms (mobile CPU, mobile GPU, and Intel VPU) and is open-sourced on GitHub.

2. **Rethinking the Representational Continuity: Towards Unsupervised Continual Learning**
   - **Authors:** Divyam Madaan, Jaehong Yoon, Yuanchun Li, Yunxin Liu
   - **Affiliations:** KAIST; AIR
   - **Conference:** ICLR 2022 (oral)
   - **Summary:** Continual learning aims to learn a sequence of tasks without forgetting previously acquired knowledge. The team focuses on unsupervised continual learning, where the tasks are unlabelled. They introduce LUMP, a method that interpolates between samples from the current task and samples from previous tasks to reduce catastrophic forgetting. LUMP learns more robust and consistent representations and outperforms existing methods on the CIFAR-10 and CIFAR-100 datasets. It also performs better in few-shot learning scenarios. The code is available on GitHub.

3. **Romou: Rapidly Generate High-Performance Tensor Kernels for Mobile GPUs**
   - **Authors:** Rendong Liang, Ting Cao, Jicheng Wen, Manyi Wang, Yang Wang, Jianhua Zou, Yunxin Liu
   - **Affiliations:** Microsoft Research Asia; University of California, Irvine; Xi'an Jiaotong University; AIR
   - **Conference:** MobiCom 2022
   - **Summary:** Mobile GPUs are essential for accelerating DNN inference on edge devices, and their frequent upgrades and diversity call for efficient kernel generation. The team proposes Romou, a kernel compiler that exploits hardware-specific capabilities and prunes inefficient kernels. Romou achieves an average 14.7x speedup on convolution operations and reduces the search space by 99% compared to the best current methods. It outperforms even the best manually optimized kernels by 1.2x. The code is open-sourced on GitHub.

4. **Brick Yourself within 3 Minutes**
   - **Authors:** Guyue Zhou, Liyi Luo, Hao Xu, Xinliang Zhang, Heluo Guo, Hao Zhao
   - **Affiliations:** AIR; McGill University; Qianzhi Technology; Peking University; Intel Labs
   - **Conference:** ICRA 2022
   - **Summary:** The team presents a smart manufacturing system that automatically converts a portrait into a LEGO brick model. The system formulates the conversion as a constrained integer programming problem and generates assembly instructions for the user. Deployed on an integrated machine with a camera, printer, laptop, and LEGO manipulation unit, it can produce a 150-brick model in 3 minutes and operates like a smart vending machine. User evaluations confirm the system's effectiveness.

These highlights reflect AIR's commitment to advancing AI research and its practical applications, from autonomous driving and healthcare to edge computing and manufacturing. The institute's interdisciplinary approach and its collaboration with industry partners are central to this work.
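As an illustration of the interpolation idea mentioned in the LUMP entry under Smart IoT, the following is a minimal sketch, not the authors' implementation. It assumes a mixup-style linear interpolation between a current-task sample and a sample drawn from a buffer of earlier tasks, with the mixing weight drawn from a Beta distribution. The function names (`interpolate`, `mix_with_replay`), the `alpha` default, and the plain-list sample format are all hypothetical choices made for this example.

```python
import random
from typing import List, Optional, Sequence


def interpolate(current: Sequence[float], previous: Sequence[float], lam: float) -> List[float]:
    """Return lam * current + (1 - lam) * previous, element-wise."""
    if len(current) != len(previous):
        raise ValueError("samples must have the same length")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * c + (1.0 - lam) * p for c, p in zip(current, previous)]


def mix_with_replay(
    batch: Sequence[Sequence[float]],
    replay_buffer: Sequence[Sequence[float]],
    alpha: float = 0.4,  # assumed value, chosen only for illustration
    rng: Optional[random.Random] = None,
) -> List[List[float]]:
    """Blend every current-task sample with a randomly chosen past sample."""
    rng = rng or random.Random()
    mixed = []
    for sample in batch:
        past = rng.choice(replay_buffer)     # sample stored from an earlier task
        lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
        mixed.append(interpolate(sample, past, lam))
    return mixed


# Equal weighting of two samples gives their midpoint.
print(interpolate([1.0, 3.0], [3.0, 1.0], 0.5))  # [2.0, 2.0]
```

In a training loop of this kind, the mixed samples rather than the raw current-task samples would be passed to the representation learner, so that every update also carries signal from earlier tasks.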
【Included Complete Paper】Interpretation of Recent Highlight Papers from AIR - Tsinghua University Institute for Artificial Intelligence Industry Research