Docling: Document Parsing Tool
1. Tutorial Introduction

Docling is an open-source, multi-purpose document conversion tool released by IBM in 2024 that aims to simplify and automate document format conversion. It converts common file formats (such as PDF, Word, PPTX, and Markdown) into several output formats: plain text (Text), Markdown, Doctags, JSON, and YAML.
Docling uses a modular design for document conversion and processing, so different conversion modes can be swapped in as needed to meet different requirements.
Key features:
- Supports conversion of multiple document formats to Text, Markdown, Doctags, JSON, and YAML.
- Supports multiple input formats, including PDF, DOCX, PPTX, MD, ASCIIDOC, etc.
- Provides a clear and concise interface for easy integration with other applications.
- Supports building a visual interface through Gradio, allowing users to perform interactive file upload and conversion operations.
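As a rough illustration of the Gradio feature above, the following sketch wires a file upload and an output-format selector to a conversion callback. It is not the tutorial's actual interface code: `handle_upload`, `build_demo`, and the `converter` callback are hypothetical names, and the component arguments assume Gradio 4 or later.

```python
from pathlib import Path
from typing import Callable, Optional

# Output formats named in the feature list above.
OUTPUT_FORMATS = ["Text", "Markdown", "Doctags", "JSON", "YAML"]


def handle_upload(
    file_path: Optional[str],
    output_format: str,
    converter: Callable[[str, str], str],
) -> str:
    """Validate the upload, then delegate to the supplied converter."""
    if not file_path:
        return "Please upload a file first."
    if output_format not in OUTPUT_FORMATS:
        return f"Unsupported output format: {output_format}"
    return converter(str(Path(file_path)), output_format)


def build_demo(converter: Callable[[str, str], str]):
    """Build the upload form (assumes Gradio 4+ is installed)."""
    import gradio as gr  # imported lazily so handle_upload works without Gradio

    def run(file_path, output_format):
        return handle_upload(file_path, output_format, converter)

    return gr.Interface(
        fn=run,
        inputs=[
            gr.File(label="Document", type="filepath"),
            gr.Dropdown(choices=OUTPUT_FORMATS, value="Markdown", label="Output format"),
        ],
        outputs=gr.Textbox(label="Converted output", lines=20),
        title="Docling converter",
    )
```

Calling `build_demo(my_converter).launch()` would start the form locally, where `my_converter` is any function that takes a file path and a format name and returns the converted text. Keeping the converter as a parameter means the form logic can be checked without Docling or Gradio installed.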
Supported file formats:
- PDF: Can be converted to Text, Markdown, Doctags, JSON and YAML formats.
- DOCX: Can be converted to Text, Markdown, Doctags, JSON and YAML formats.
- PPTX: Can be converted to Text, Markdown, Doctags, JSON and YAML formats.
- Markdown: Can be converted to Text, Markdown, Doctags, JSON and YAML formats.
- ASCIIDOC: Can be converted to JSON and YAML formats.
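The support list above can be written down as a small lookup table, which makes it easy to reject an unsupported combination before any conversion starts. This is a minimal sketch, not the tutorial's implementation: `SUPPORTED`, `is_supported`, and `convert` are hypothetical names, the lowercase format keys are chosen for illustration, and the Docling calls assume the Docling v2 Python API (method names may differ in other versions). YAML output additionally assumes PyYAML is installed.

```python
import json

# Support matrix transcribed from the list above (input format -> allowed outputs).
ALL_OUTPUTS = {"text", "markdown", "doctags", "json", "yaml"}
SUPPORTED = {
    "pdf": ALL_OUTPUTS,
    "docx": ALL_OUTPUTS,
    "pptx": ALL_OUTPUTS,
    "md": ALL_OUTPUTS,
    "asciidoc": {"json", "yaml"},
}


def is_supported(input_format: str, output_format: str) -> bool:
    """Return True if the list above allows this input/output pair."""
    return output_format.lower() in SUPPORTED.get(input_format.lower(), set())


def convert(source: str, input_format: str, output_format: str) -> str:
    """Convert a document with Docling and return the result as a string."""
    if not is_supported(input_format, output_format):
        raise ValueError(f"{input_format} -> {output_format} is not supported")

    # Imported lazily so the matrix helpers work without Docling installed.
    from docling.document_converter import DocumentConverter

    doc = DocumentConverter().convert(source).document
    fmt = output_format.lower()
    if fmt == "markdown":
        return doc.export_to_markdown()
    if fmt == "text":
        return doc.export_to_text()
    if fmt == "doctags":
        return doc.export_to_doctags()

    data = doc.export_to_dict()
    if fmt == "json":
        return json.dumps(data, ensure_ascii=False, indent=2)

    import yaml  # PyYAML, only needed for YAML output

    return yaml.safe_dump(data, allow_unicode=True)
```

For example, `convert("report.pdf", "pdf", "markdown")` would return the Markdown text of a local file named `report.pdf` (a placeholder path), while `convert("notes.adoc", "asciidoc", "text")` raises `ValueError` without loading Docling.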
2. Operation Steps
1. Start the container
Open the web interface through the API address.

2. File conversion
After entering the web interface, follow these steps:

3. Exchange and Discussion
🖌️ If you find a high-quality project, please leave us a message to recommend it! We have also set up a tutorial discussion group. You are welcome to scan the QR code to join (note [SD Tutorial] in your request), discuss technical issues with other members, and share your results!
