Deploying Seed-OSS-36B-Instruct with vLLM + Open WebUI
1. Tutorial Introduction

Seed-OSS-36B-Instruct is an open-source large language model released by the ByteDance Seed team in August 2025. Seed-OSS was trained on 12 trillion (12T) tokens and performs strongly on several mainstream open-source benchmarks. The Seed-OSS-36B architecture combines several common design choices: causal language modeling, grouped query attention, the SwiGLU activation function, RMSNorm, and RoPE positional encoding. Its most distinctive feature is native long-context support, with a maximum context length of 512k tokens, which lets it handle extremely long documents and reasoning chains. At the time of release, this was twice the context length of OpenAI's then-latest GPT-5 model series and corresponds to approximately 1,600 pages of text.
This tutorial runs on two NVIDIA RTX A6000 GPUs.
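As a rough sanity check on why two cards are used, the sketch below estimates the memory needed for the weights and for the KV cache. The 36B parameter count comes from the model name, and 48 GB per RTX A6000 is the card's published capacity (treated here as roughly 48 GiB). The layer count, KV-head count, and head dimension are assumptions that should be verified against the model's config.json, and the helper names are hypothetical.

```python
GIB = 1024 ** 3


def weight_bytes(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone (2 bytes per parameter for bf16/fp16)."""
    return n_params * bytes_per_param


def kv_cache_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """KV-cache bytes per token: one key and one value vector per layer and KV head."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


N_PARAMS = 36e9                 # nominal size taken from the model name
N_GPUS = 2                      # dual-card setup used in this tutorial
GPU_MEM_GIB = 48                # RTX A6000 capacity, approximated as 48 GiB
# ASSUMED architecture values; check them against the model's config.json.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 64, 8, 128
CONTEXT_TOKENS = 512 * 1024     # "512k" interpreted as 512 * 1024 tokens

weights_gib = weight_bytes(N_PARAMS) / GIB
total_gib = N_GPUS * GPU_MEM_GIB
per_token = kv_cache_bytes_per_token(N_LAYERS, N_KV_HEADS, HEAD_DIM)
full_context_gib = per_token * CONTEXT_TOKENS / GIB

print(f"weights: {weights_gib:.1f} GiB of {total_gib} GiB total")
print(f"KV cache per token: {per_token // 1024} KiB")
print(f"KV cache at full context: {full_context_gib:.0f} GiB")
```

Under these assumptions the 16-bit weights (about 67 GiB) do not fit on a single 48 GB card but do fit across two, and a full 512k-token KV cache would exceed the memory left over. A deployment on this hardware would therefore likely serve a shorter maximum context than the model's native limit.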
2. Demo

3. Steps
1. Start the container
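The tutorial does not show how the model server inside the container is launched. For readers reproducing the setup by hand, the sketch below assembles one plausible `vllm serve` command that shards the model across both cards. It assumes a vLLM version that supports Seed-OSS, the Hugging Face model ID `ByteDance-Seed/Seed-OSS-36B-Instruct`, vLLM's default port 8000, and an illustrative context cap of 32768 tokens; `build_vllm_command` is a hypothetical helper, not part of vLLM.

```python
import shlex


def build_vllm_command(model: str, tensor_parallel_size: int,
                       max_model_len: int, port: int = 8000) -> list[str]:
    """Build an argv list for `vllm serve` (hypothetical helper for illustration)."""
    return [
        "vllm", "serve", model,
        # Shard the model across this many GPUs (2 for the dual-card setup).
        "--tensor-parallel-size", str(tensor_parallel_size),
        # Cap the served context so the KV cache fits in the remaining memory.
        "--max-model-len", str(max_model_len),
        "--port", str(port),
    ]


cmd = build_vllm_command(
    "ByteDance-Seed/Seed-OSS-36B-Instruct",  # assumed Hugging Face model ID
    tensor_parallel_size=2,
    max_model_len=32768,                     # illustrative value, not from the tutorial
)
print(shlex.join(cmd))
# prints: vllm serve ByteDance-Seed/Seed-OSS-36B-Instruct --tensor-parallel-size 2 --max-model-len 32768 --port 8000
```

Building the command as a list keeps it safe to pass to `subprocess.run(cmd)` without shell quoting concerns.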

2. Usage steps
If "Model" is not displayed, the model is still initializing. Because the model is large, wait about 4-5 minutes and then refresh the page.
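Instead of refreshing the page by hand, the wait can be scripted. The sketch below polls the `/v1/models` route of vLLM's OpenAI-compatible server until at least one model is listed. The base URL `http://localhost:8000` is an assumption (vLLM's default port) and may differ in the tutorial's container; `list_models` and `wait_for_model` are hypothetical helper names.

```python
import json
import time
import urllib.error
import urllib.request


def list_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Return the model IDs reported by an OpenAI-compatible /v1/models route."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return [m["id"] for m in payload.get("data", [])]


def wait_for_model(base_url: str, max_wait_s: float = 600.0,
                   interval_s: float = 10.0,
                   fetch=list_models, sleep=time.sleep) -> list[str]:
    """Poll until the server lists at least one model or max_wait_s elapses."""
    deadline = time.monotonic() + max_wait_s
    while True:
        try:
            models = fetch(base_url)
            if models:
                return models
        except (urllib.error.URLError, ConnectionError, TimeoutError):
            pass  # server not accepting requests yet; keep waiting
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no model listed at {base_url} after {max_wait_s} s")
        sleep(interval_s)


# Example (assumed address; adjust to your container):
# print(wait_for_model("http://localhost:8000"))
```

The default timeout of 600 seconds leaves some margin over the 4-5 minutes of initialization mentioned above.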


Citation Information
The citation information for this project is as follows:
@misc{seed2025seed-oss,
author={ByteDance Seed Team},
title={Seed-OSS Open-Source Models},
year={2025},
howpublished={\url{https://github.com/ByteDance-Seed/seed-oss}}
}