AI Compiler Technology Sharing Session: Shanghai Jiao Tong University / Institute of Computing Technology, Chinese Academy of Sciences / Microsoft Research Asia / BAAI, They Are Here!

In 2023, the AI compiler community quietly made its presence felt: 4 meetups, 3 cities, 19 guests, 1,000+ industry practitioners, and 1 million+ total exposures. We reached the most dedicated developers and engineers in this highly specialized field, built small communities from 0 to 1, created platforms for exchange, fostered collaboration, and connected the upstream and downstream of the ecosystem.
2024 is already halfway through, and large models continue to top the tech community's trending lists. On July 6 (Saturday), we will hold the 5th offline Meet AI Compiler Technology Salon at the Institute of Computing Technology, Chinese Academy of Sciences.
This meetup consists of two parts: technology sharing and a roundtable discussion. The guests come from Shanghai Jiao Tong University, the Institute of Computing Technology of the Chinese Academy of Sciences, Microsoft Research Asia, and the Beijing Academy of Artificial Intelligence (BAAI). We hope this new gathering brings everyone new technical insights and new friends in the field~
Event Details
⏰ Time: July 6 (Saturday) 13:30-18:00
Location: Lecture Hall, 1st Floor, Institute of Computing Technology, Chinese Academy of Sciences, No. 6, Kexueyuan South Road, Haidian District, Beijing
Capacity: 200 (on-site seats are limited, please register as early as possible)
Registration: Scan the QR code below to register

Scan the QR code and remark "AI Compiler" to join the event group:


Guests and Agenda
Session 1: Technology Sharing

Topic: MLCEngine: A Universal LLM Deployment Engine
Abstract: This talk introduces MLCEngine, an LLM engine that can be deployed universally across platforms. MLCEngine delivers high-throughput, low-latency LLM serving on the server while also supporting seamless deployment of today's high-quality large language models in a wide range of local environments. (A brief usage sketch follows the list below.)
From this talk, you will learn:
1. Design concept and usage of MLCEngine
2. The significance of Universal Deployment
3. Thoughts on the development of LLM inference engines
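For readers new to MLCEngine, here is a minimal usage sketch based on the MLC LLM project's published OpenAI-style Python API; the model ID is only an example, and the exact interface may vary between releases.

from mlc_llm import MLCEngine

# Create the engine from a pre-compiled model (the model ID below is illustrative).
model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
engine = MLCEngine(model)

# MLCEngine exposes an OpenAI-style chat completion interface.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is machine learning compilation?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)

engine.terminate()

The same engine interface is meant to carry over to server, browser, and on-device deployments, which is the "universal deployment" idea the talk focuses on.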

Topic: ElasticRoom: Multi-Tenant DNN Inference Engine via Co-design with Resource-constrained Compilation and Strong Priority Scheduling
Abstract: GPU resource partitioning mechanisms in runtime software are widely used in job schedulers and multi-tenant computing systems to improve resource utilization and throughput. However, when facing batches of heterogeneous DNN inference requests, existing partitioning mechanisms cannot simultaneously improve GPU utilization and guarantee low latency for real-time requests. We propose ElasticRoom, a multi-tenant DNN inference engine that combines resource-constrained compilation built on TVM with strong priority scheduling to achieve both high GPU utilization and low latency for real-time requests. (A conceptual sketch of resource-constrained compilation follows the list below.)
From this talk, you will learn:
1. GPU resource management and task scheduling
2. Resource-constrained compilation based on TVM
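ElasticRoom's own interfaces are not covered here, so the sketch below is only a conceptual illustration of resource-constrained compilation using TVM's tensor expression scheduling: the block and thread extents of a generated CUDA kernel are fixed at compile time, bounding the GPU resources the kernel may occupy. All numbers are illustrative, and the TE schedule API shown depends on the TVM version installed.

import tvm
from tvm import te

# A simple elementwise workload: B[i] = A[i] * 2.
n = 1 << 20
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] * 2.0, name="B")

s = te.create_schedule(B.op)

# Cap the kernel at a fixed number of blocks and threads per block;
# each GPU thread then loops serially over its share of the data,
# so the kernel's footprint on the GPU is bounded at compile time.
MAX_BLOCKS, MAX_THREADS = 32, 128
bx, rest = s[B].split(B.op.axis[0], nparts=MAX_BLOCKS)
tx, inner = s[B].split(rest, nparts=MAX_THREADS)
s[B].bind(bx, te.thread_axis("blockIdx.x"))
s[B].bind(tx, te.thread_axis("threadIdx.x"))

mod = tvm.build(s, [A, B], target="cuda")

A runtime scheduler could then co-locate kernels compiled under different resource caps and give latency-critical requests priority, which is the co-design idea in the talk title.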

Topic: An efficient deep learning compilation system based on tile abstraction
Abstract: With the rapid development of deep learning algorithms and hardware, the industry demands ever more efficient and faster model deployment. Deep learning compilers have become a new bridge between a model's computational expression and its execution on the underlying hardware. However, efficiently supporting fast-evolving deep learning applications on diverse hardware still poses many challenges. This talk introduces our series of exploratory work on deep learning compilation built around a unified tile abstraction. (An illustrative tiling sketch follows the list below.)
From this talk, you will learn:
1. Deep Learning Compilation Stack Based on Tile Abstraction
2. How tile abstraction can be used to optimize global memory access efficiency in deep learning workloads
3. How tile abstraction can support low-precision deep learning computation
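As rough intuition for the tile idea (this is not the speakers' compiler stack), the sketch below shows blocked matrix multiplication in plain NumPy: the computation proceeds tile by tile so that each loaded block is reused many times before it leaves fast memory, which is the locality effect a tile-based compiler exploits when optimizing global memory access. The tile size here is arbitrary.

import numpy as np

def tiled_matmul(a: np.ndarray, b: np.ndarray, tile: int = 64) -> np.ndarray:
    """Blocked matrix multiply; each (tile x tile) block of A and B is
    loaded once and reused for a whole block of C."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2
    c = np.zeros((m, n), dtype=a.dtype)
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                c[i0:i0 + tile, j0:j0 + tile] += (
                    a[i0:i0 + tile, k0:k0 + tile] @ b[k0:k0 + tile, j0:j0 + tile]
                )
    return c

# Sanity check against NumPy's own matmul.
x = np.random.rand(256, 256).astype(np.float32)
y = np.random.rand(256, 256).astype(np.float32)
assert np.allclose(tiled_matmul(x, y), x @ y, atol=1e-3)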

Topic: FlagGems: innovative practice in a Triton-based operator library for large models
Abstract: Based on OpenAI's Triton language, we have developed FlagGems, a high-performance general-purpose operator library that accelerates inference and training of large models under the PyTorch framework. Targeting Triton's programming characteristics, we applied two technical innovations, runtime optimization and automatic code generation, which expand the expressiveness of operators and improve their performance. (A minimal Triton kernel sketch follows the list below.)
From this talk, you will learn:
1. The Triton programming language and its open-source ecosystem
2. The FlagGems operator library and its development progress
3. The runtime optimization and automatic code generation techniques used in FlagGems
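To give a flavor of the programming model FlagGems builds on, here is a minimal Triton vector-add kernel wrapped for PyTorch, closely following the standard Triton tutorial; it is illustrative only and not FlagGems' actual implementation.

import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide tile of the tensors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.rand(1 << 20, device="cuda")
y = torch.rand(1 << 20, device="cuda")
assert torch.allclose(add(x, y), x + y)

An operator library wraps kernels like this so that PyTorch calls them in place of the default implementations, which is broadly how a Triton-based library can accelerate existing models under the PyTorch framework.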
Session 2: Roundtable Session
Roundtable topic: Compilation Optimization Across Heterogeneous Chips in the Transformer Era

Organizers and partners

HyperAI is a leading artificial intelligence and high-performance computing community in China. Its goal is to help developers and enthusiasts in China's data science and AI industry learn, understand, and practice by providing infrastructure such as accelerated dataset downloads, online tutorial demos, in-depth paper interpretations, and an integrated calendar of top conferences, and to build the future of AI together with the community. The HyperAI website currently hosts thousands of classic, high-quality public datasets and tutorials and runs the most active AI compiler community in China. HyperAI is also the sole organizer of this event series.
Visit the official website:https://hyper.ai/

OpenBayes is a leading high-performance computing service provider in China. By grafting classic software ecosystems and machine learning models onto a new generation of heterogeneous chips, it provides industrial enterprises and university research with faster, easier-to-use data science computing products. Its products have been adopted in dozens of large-scale industrial scenarios and by leading research institutes.
Visit the official website:https://openbayes.com/

The MLC.AI community was established in June 2022. Tianqi Chen, the principal inventor of Apache TVM and a well-known young scholar in machine learning, led the team in launching the MLC online course, which systematically introduces the key elements and core concepts of machine learning compilation.
In November 2022, thanks to the joint efforts of MLC.AI community volunteers, the first complete TVM Chinese documentation went online and was hosted on the HyperAI website, giving domestic developers interested in machine learning compilation the documentation they need to access and learn this new technology.
MLC Online Courses:https://mlc.ai/
TVM Chinese Documentation:https://tvm.hyper.ai/

The Institute of Computing Technology of the Chinese Academy of Sciences (ICT), founded in 1956, is the first academic institution in China dedicated to comprehensive research in computer science and technology. ICT developed China's first general-purpose digital electronic computer and grew into the R&D base for China's high-performance computers; China's first general-purpose CPU chip was also born here.
ICT is the cradle of China's computer industry. Over the course of its development it has trained hundreds of China's earliest computing professionals, and more than 20 academicians have worked or studied here. As disciplines and technologies evolved, several research institutes, including the Xi'an Microelectronics Institute, the Computing Center, the Institute of Software, the Network Center, the Institute of Microelectronics, and the Institute of Information Engineering, were spun off from ICT, and high-tech companies such as Lenovo, Sugon, Loongson, and Cambricon were incubated here.

The Technical Committee on High Performance Computing of the China Computer Federation (CCF TCHPC) was established in 2005 with the approval of the China Computer Federation. As a professional committee under the CCF, it is an authoritative body for academic research on high-performance computing, organizing academic conferences in the field and providing industry-academia application services.
Guided by the mission of "building an academic platform, promoting industrial exchange, advancing application deployment, balancing the software and hardware ecosystem, serving industry development, and connecting industry, academia, research, and application", it is committed to advancing research and development in China's high-performance computing field and building a platform for academic and industrial cooperation and exchange. It plays an irreplaceable role in supporting scientific and technological development and innovation, promoting social progress, and enhancing China's comprehensive national strength and international competitiveness.

In June 2011, the Chinese Academy of Sciences officially established the Youth Innovation Promotion Association, an innovative initiative for the comprehensive training of young scientific and technological talents under the age of 35 within the Academy. Through effective organization and support, it aims to unite young researchers across the Academy, broaden their academic horizons, promote exchange and interdisciplinary collaboration, strengthen their ability to organize research activities, and cultivate a new generation of academic and technical leaders.
Event Support

Huodongxing: Scan the QR code to go to the event registration page

Scan the QR code and remark "2024 AI Compiler" to join the event group

Given the venue's capacity, only 200 seats are available for this event. We recommend registering as early as possible to secure a seat.
July 6th 13:30-17:40, looking forward to meeting new and old friends!