Triton Compiler Tutorial


Introduction

Triton is a language and compiler for parallel programming. It aims to provide a Python-based environment for productively writing custom DNN compute kernels that can run at maximal throughput on modern GPU hardware.

This project is a Triton tutorial series that progresses from basic to advanced topics: vector addition, fused softmax, matrix multiplication, layer normalization, low-memory dropout, fused attention, libdevice external functions, grouped GEMM, and FP8 and block-scaled matrix multiplication.

Table of contents

1. Basic Operation Tutorial

1.1 Vector Addition

  • 01-vector-add.cn.ipynb – A first tutorial on vector addition that introduces the basic Triton programming model.
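
As a rough illustration of that programming model, the sketch below emulates blocked vector addition on the CPU in plain Python. It is not Triton code and is not taken from the notebook: `BLOCK_SIZE`, `add_kernel`, and `vector_add` are hypothetical names chosen for this example. In a real Triton kernel the same steps would typically use `tl.program_id` to pick the block, `tl.arange` to build the offsets, and masked `tl.load` / `tl.store` calls to guard the tail of the vector.

```python
# CPU-only emulation of the blocked, masked pattern behind vector addition.
# Plain Python, not Triton; every name here is illustrative (an assumption).

BLOCK_SIZE = 4  # elements handled by one "program" (assumed value)


def add_kernel(x, y, out, n_elements, pid):
    """Work done by a single program instance, identified by pid."""
    block_start = pid * BLOCK_SIZE
    offsets = [block_start + i for i in range(BLOCK_SIZE)]
    for off in offsets:
        # The mask protects the last block when n_elements is not a
        # multiple of BLOCK_SIZE.
        if off < n_elements:
            out[off] = x[off] + y[off]


def vector_add(x, y):
    """Launch one program per block and collect the result."""
    assert len(x) == len(y), "inputs must have the same length"
    n = len(x)
    out = [0.0] * n
    grid = (n + BLOCK_SIZE - 1) // BLOCK_SIZE  # ceiling division
    for pid in range(grid):  # on a GPU these programs would run in parallel
        add_kernel(x, y, out, n, pid)
    return out


if __name__ == "__main__":
    print(vector_add([1.0, 2.0, 3.0, 4.0, 5.0],
                     [10.0, 20.0, 30.0, 40.0, 50.0]))
```

Here five elements with a block size of four need two programs; the second program handles only one valid element, and the bounds check skips the other three offsets.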

2. Core Operator Tutorial

2.1 Fused Softmax

2.2 Matrix Multiplication

2.3 Layer Normalization

3. Advanced Features Tutorial

3.1 Low-Memory Dropout

3.2 Fused Attention

3.3 Libdevice External Functions

3.4 Grouped GEMM

3.5 Persistent FP8 Matrix Multiplication

3.6 Block-Scaled Matrix Multiplication

Reference Resources
