COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation
Tianyi Zhou Dongrui Liu Leitao Yuan Jing Shao Xia Hu
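Abstract
LLM agents are increasingly expected not merely to complete isolated tasks but to carry bounded representations of human expertise, judgment, and interaction style. Building such person-grounded agents remains difficult because the actionable knowledge associated with a person or role is usually embedded in heterogeneous traces rather than written down as explicit instructions. Existing memory and persona systems capture fragments of this evidence, and skill frameworks provide only a portable packaging format; no end-to-end workflow exists for distilling these traces into skills that are inspectable, correctable, and usable by agents. This paper presents an automated trace-to-skill distillation system that generates person-grounded AI skills through expert knowledge distillation. Given materials about a target person or role, COLLEAGUE.SKILL produces a versioned skill package with two coordinated tracks: a capability track covering practices, mental models, and decision heuristics, and a bounded behavior track covering communication style, interaction rules, and correction history. The package can be inspected and invoked, updated through natural-language feedback, rolled back, and installed across multiple agent hosts. It can also be prepared for controlled distribution when needed. The paper describes the artifact contract, generation workflow, correction lifecycle, deployment surfaces, and domain presets implemented in the open-source system. At the time of writing, the public repository has about 18.5k stars on GitHub, and the gallery lists 215 skills from 165 contributors. The cumulative stars across the listed skill cards exceed 100,000. The system shows how person-grounded skills can be represented as portable, correctable packages rather than as opaque prompts or hidden memories.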
One-sentence Summary
The authors propose COLLEAGUE.SKILL, an automated expert knowledge distillation system that converts heterogeneous traces from a target person into a versioned skill package containing a capability track for decision heuristics and a bounded behavior track for interaction rules, enabling the package to be inspected, updated through natural-language feedback, and installed across agent hosts.
Key Contributions
- The paper introduces COLLEAGUE.SKILL, an automated trace-to-skill distillation system that converts heterogeneous human interaction traces into portable, person-grounded AI skills.
- The method generates a versioned skill package with two coordinated tracks that explicitly separate operational capabilities (practices, mental models, and decision heuristics) from bounded behavioral constraints (communication style, interaction rules, and correction history).
- The system provides a transparent workflow that supports package inspection, natural-language feedback, state rollback, and installation across agent hosts to ensure distilled skills remain editable, portable, and accountable across deployment environments.
Introduction
As LLM agents evolve from executing isolated instructions to carrying reusable expertise, the field is rapidly adopting modular skill architectures that package domain knowledge and interaction patterns for on-demand deployment. However, transforming unstructured human traces, such as chat logs, documents, and public records, into structured agent capabilities remains a significant hurdle. Prior memory and persona systems typically fragment this evidence or rely on opaque prompts that lack provenance, correction mechanisms, and clear usage boundaries. COLLEAGUE.SKILL bridges this gap with an automated distillation pipeline that converts heterogeneous human traces into versioned, portable skill packages. By explicitly separating operational capabilities from bounded behavioral constraints, the framework lets users inspect, correct, roll back, and deploy person-grounded skills across multiple agent hosts while keeping source material and distribution limits transparent.
Dataset
- Dataset Composition and Sources: The authors construct a person-grounded skill ecosystem centered on public figures, drawing primarily from first-person works, long-form interviews, documented decisions, and clearly marked inferences. The corpus is further expanded through community contributions to the COLLEAGUE.SKILL platform.
- Subset Details: While exact subset sizes are not specified, the data is organized around a dedicated celebrity preset and modular community skills. Each entry is filtered to prioritize substantive, long-form material while explicitly excluding short summaries and low-quality content aggregators.
- Data Usage and Training: The authors leverage this curated dataset for public-source expert distillation. During inference, the system tracks evidence availability and automatically downgrades confidence scores when source material is sparse, preventing the model from filling gaps with generic persona text. The pipeline supports iterative creation, versioned corrections, and transparent public distribution.
- Processing and Metadata Construction: The preprocessing workflow automates subtitle extraction, audio transcription, and text cleanup before merging research notes. A dedicated quality checker evaluates each artifact for mental-model coverage, stylistic patterns, internal contradictions, grounding URLs, and copyright compliance. All evidence constraints and quality metrics are packaged as explicit metadata that accompanies the final distributed outputs.
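The evidence-aware confidence downgrade and quality check described in the list above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Claim` type, the threshold constants, the confidence labels, and the report fields are all hypothetical, since the summary does not give concrete values or schemas.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical thresholds; the source does not publish concrete values.
MIN_SOURCES_FOR_HIGH = 5
MIN_SOURCES_FOR_MEDIUM = 2


@dataclass
class Claim:
    text: str
    source_urls: list[str]      # grounding URLs backing this claim
    is_inference: bool = False  # clearly marked inference vs. direct evidence


def confidence_for(claim: Claim) -> str:
    """Map evidence availability to a coarse confidence label."""
    distinct_sources = len(set(claim.source_urls))
    if distinct_sources == 0:
        return "unsupported"
    if distinct_sources >= MIN_SOURCES_FOR_HIGH:
        level = "high"
    elif distinct_sources >= MIN_SOURCES_FOR_MEDIUM:
        level = "medium"
    else:
        level = "low"
    # Assumption of this sketch: a marked inference is capped at medium.
    if claim.is_inference and level == "high":
        level = "medium"
    return level


def quality_report(claims: list[Claim]) -> dict:
    """Summarize grounding so it can travel with the package as metadata."""
    labels = [confidence_for(claim) for claim in claims]
    return {
        "total_claims": len(claims),
        "ungrounded": labels.count("unsupported"),
        "by_confidence": {
            label: labels.count(label) for label in ("high", "medium", "low")
        },
    }
```

Counting distinct URLs rather than raw citations keeps a single source cited repeatedly from inflating confidence, which is one plausible way to avoid padding sparse evidence with generic persona text.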
Method
COLLEAGUE.SKILL distills person-grounded skills through a structured pipeline that transforms heterogeneous human traces into a portable, inspectable, and governable skill artifact. The pipeline runs through five stages: trace intake, preset routing, dual distillation, artifact writing, and productization. As shown in the figure below, the process begins with trace intake, where raw materials such as work documents, chat logs, review comments, or public interviews are collected and stored locally. A preset router then selects the appropriate domain-specific configuration (colleague, celebrity, or relationship) based on the source type and governance assumptions. The selected preset defines the evidence scope, access controls, and invocation semantics.
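The preset routing step can be pictured with a small sketch. Only the three preset names come from the text; the source-type labels, the `PresetConfig` fields, the configuration values, and the routing rule itself (the most restrictive governance assumption wins) are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Preset(Enum):
    COLLEAGUE = "colleague"
    CELEBRITY = "celebrity"
    RELATIONSHIP = "relationship"


@dataclass(frozen=True)
class PresetConfig:
    evidence_scope: str        # which material the distiller may read
    public_distribution: bool  # whether gallery sharing is allowed
    invocation: str            # how the host is expected to call the skill


# Hypothetical configuration values, not taken from the paper.
PRESETS = {
    Preset.COLLEAGUE: PresetConfig("work documents and chat logs", False, "work"),
    Preset.CELEBRITY: PresetConfig("public first-person material", True, "perspective"),
    Preset.RELATIONSHIP: PresetConfig("private personal messages", False, "private"),
}

# Hypothetical source-type labels used only by this sketch.
PUBLIC_SOURCES = {"public_interview", "published_work"}
PERSONAL_SOURCES = {"personal_chat"}


def route(source_types: set[str]) -> Preset:
    """Pick a preset from the kinds of traces collected at intake."""
    if not source_types:
        raise ValueError("no sources to route")
    if source_types & PERSONAL_SOURCES:
        return Preset.RELATIONSHIP  # any private material forces the strictest preset
    if source_types <= PUBLIC_SOURCES:
        return Preset.CELEBRITY     # every source is public
    return Preset.COLLEAGUE         # default for workplace material
```

For example, `route({"work_document", "chat_log"})` yields `Preset.COLLEAGUE`, and `PRESETS[...]` then supplies the evidence scope and invocation settings for the later stages.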
Following routing, the dual distillation stage separates the skill into two distinct tracks: a capability track and a persona track (the bounded behavior track described in the summary above). The capability track captures durable mental models, procedural judgment, and technical standards derived from the source material, while the persona track encodes bounded behavior constraints, interaction rules, and expression preferences. This separation keeps expert judgment distinct from surface behavior and allows full, capability-only, or persona-only entrypoints to be invoked separately. The dual representation is central to the system's design: it prevents factual knowledge from being conflated with interaction style and supports composability and correctness.
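A minimal sketch of the two-track representation and its three entrypoints follows. The class names, field names, section titles, and the `render` function are hypothetical; the sketch only shows that keeping the tracks as separate data makes capability-only and persona-only views straightforward to produce.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CapabilityTrack:
    mental_models: list[str] = field(default_factory=list)
    heuristics: list[str] = field(default_factory=list)


@dataclass
class PersonaTrack:
    interaction_rules: list[str] = field(default_factory=list)
    expression_preferences: list[str] = field(default_factory=list)


MODES = ("full", "capability", "persona")


def _section(title: str, items: list[str]) -> str:
    """Render one titled bullet list in Markdown."""
    return "\n".join([f"## {title}", *[f"- {item}" for item in items]])


def render(capability: CapabilityTrack, persona: PersonaTrack, mode: str = "full") -> str:
    """Render the full skill or a single track, depending on the entrypoint."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    sections = []
    if mode != "persona":
        sections.append(_section("Mental models", capability.mental_models))
        sections.append(_section("Decision heuristics", capability.heuristics))
    if mode != "capability":
        sections.append(_section("Interaction rules", persona.interaction_rules))
        sections.append(_section("Expression preferences", persona.expression_preferences))
    return "\n\n".join(sections)
```

Because the full view is just the two single-track views concatenated, a host that only needs the expert's judgment can load the capability view without inheriting any style rules.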
The distilled tracks are then processed by the artifact writer, which normalizes metadata into a versioned schema and renders the final skill package. This package includes a primary SKILL.md file that combines both tracks, alongside separate work.md and persona.md source documents, independently invokable sub-skills, and metadata files such as manifest.json and meta.json. The writer ensures alignment with the Agent Skills standard, where SKILL.md serves as the required entrypoint and optional files provide scripts or references. The resulting artifact is a self-contained, versioned package that can be installed into supported agent hosts, shared via an optional gallery, or modified through correction records. 
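The package layout described above can be sketched as a small writer. The file names come from the text, and the `name` and `description` frontmatter fields follow the Agent Skills convention for SKILL.md as commonly documented. Everything else is an assumption: the `write_package` function, its parameters, and the contents of manifest.json and meta.json are hypothetical, and the independently invokable sub-skills are omitted.

```python
from __future__ import annotations

import json
from pathlib import Path


def write_package(
    root: Path,
    name: str,
    description: str,
    work_md: str,
    persona_md: str,
    version: int = 1,
    preset: str = "colleague",
) -> list[str]:
    """Write a skill package to `root` and return the file names written."""
    root.mkdir(parents=True, exist_ok=True)

    # SKILL.md is the entrypoint: frontmatter first, then both tracks.
    # Note: this sketch does not escape YAML, so keep name/description simple.
    skill_md = (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n\n"
        f"{work_md.strip()}\n\n{persona_md.strip()}\n"
    )

    # Hypothetical metadata schema for illustration only.
    manifest = {
        "name": name,
        "version": version,
        "entrypoint": "SKILL.md",
        "files": ["SKILL.md", "work.md", "persona.md"],
    }
    meta = {"preset": preset, "schema_version": 1}

    files = {
        "SKILL.md": skill_md,
        "work.md": work_md,
        "persona.md": persona_md,
        "manifest.json": json.dumps(manifest, indent=2),
        "meta.json": json.dumps(meta, indent=2),
    }
    for filename, content in files.items():
        (root / filename).write_text(content, encoding="utf-8")
    return sorted(files)
```

Keeping work.md and persona.md as separate source documents next to the combined SKILL.md means a correction to one track can regenerate the entrypoint without touching the other.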
The system supports a comprehensive lifecycle for skill management, including correction, rollback, and optional distribution. Corrections are processed through a handler that interprets natural-language feedback and generates either a Markdown patch for capability updates or a structured correction record for behavior adjustments. These changes trigger a regeneration of the artifact, incrementing the lifecycle version and preserving the prior state. The version manager enables listing, backing up, rolling back, and cleaning archives, ensuring that the artifact remains auditable and correctable over time. This lifecycle is supported by a governance rail that enforces local-first storage, provenance tracking, and user-owned correction logs, making the entire process transparent and accountable.
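The correction and rollback lifecycle can be illustrated with an in-memory sketch. It assumes the natural-language feedback has already been interpreted into either a Markdown patch or a structured correction record; that interpretation step is not shown. The `SkillState` and `VersionManager` names, the record fields, and the choice to record a rollback as a new version are all assumptions of this sketch.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass, field


@dataclass
class SkillState:
    work_md: str                                           # capability track source
    corrections: list[dict] = field(default_factory=list)  # behavior correction records


@dataclass
class VersionManager:
    current: SkillState
    version: int = 1
    archive: dict[int, SkillState] = field(default_factory=dict)

    def _snapshot(self) -> None:
        """Preserve the prior state before any change."""
        self.archive[self.version] = copy.deepcopy(self.current)

    def apply_capability_patch(self, appended_markdown: str) -> int:
        """Append a Markdown patch to the capability track."""
        self._snapshot()
        self.current.work_md = (
            self.current.work_md.rstrip() + "\n" + appended_markdown.strip() + "\n"
        )
        self.version += 1
        return self.version

    def apply_behavior_correction(self, rule: str, reason: str) -> int:
        """Record a structured correction for the behavior track."""
        self._snapshot()
        self.current.corrections.append({"rule": rule, "reason": reason})
        self.version += 1
        return self.version

    def list_versions(self) -> list[int]:
        return sorted(self.archive) + [self.version]

    def rollback(self, target: int) -> int:
        """Restore an archived state as a new version, keeping history intact."""
        if target not in self.archive:
            raise KeyError(f"no archived version {target}")
        self._snapshot()
        self.current = copy.deepcopy(self.archive[target])
        self.version += 1
        return self.version
```

Recording a rollback as a new version rather than deleting later versions is one way to keep the correction log auditable: the state that was rolled back from stays in the archive and can itself be restored.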