QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Discovery
Abstract
Financial markets are noisy and non-stationary, which makes alpha generation especially sensitive to noise in backtesting results and to sudden shifts in market regime. Although recent agent-based frameworks improve the automation of alpha generation, they often lack controllable multi-step search processes and reliable reuse of validated experience. To address these challenges, we propose QuantaAlpha, an evolutionary framework for alpha generation that treats each end-to-end mining run as a trajectory and improves factors through trajectory-level mutation and crossover operations. QuantaAlpha selectively identifies suboptimal steps within each trajectory for revision and combines complementary, high-reward segments to reuse effective patterns, enabling structured exploration and refinement across multiple mining iterations. During factor generation, QuantaAlpha enforces semantic consistency among the hypothesis, the factor expression, and the executable code, while bounding the complexity and redundancy of the generated factors to mitigate the "crowding" phenomenon. Comprehensive experiments on the China Securities Index 300 (CSI 300) show consistent improvements over strong baseline models and prior agent-based systems. Using GPT-5.2, QuantaAlpha achieves an Information Coefficient (IC) of 0.1501, an annualized rate of return (ARR) of 27.75%, and a maximum drawdown (MDD) of 7.98%. Moreover, the factors QuantaAlpha generates on the CSI 300 transfer effectively to the China Securities Index 500 (CSI 500) and the Standard & Poor's 500 Index (S&P 500), yielding cumulative excess returns of 160% and 137%, respectively, over four years, indicating strong robustness of QuantaAlpha to shifts in market distribution.
One-sentence Summary
Researchers from SUFE, QuantaAlpha, Stanford, PKU, SYSU, and SEU propose QuantaAlpha, an evolutionary framework that refines financial alpha factors via trajectory-level mutation and crossover, ensuring semantic consistency and reducing redundancy, achieving strong out-of-sample performance on CSI 300, CSI 500, and S&P 500.
Key Contributions
- QuantaAlpha introduces an evolutionary framework for alpha mining that treats each mining run as a trajectory, enabling targeted refinement via mutation and crossover to overcome noise sensitivity and improve controllability in non-stationary markets.
- The system enforces semantic consistency and complexity constraints during factor generation, while reusing high-reward trajectory segments to mitigate crowding and support reliable, auditable knowledge transfer across iterations.
- Evaluated on CSI 300, QuantaAlpha achieves an IC of 0.1501 and 27.75% ARR with 7.98% MDD, and demonstrates strong out-of-distribution robustness by delivering 160% and 137% cumulative excess returns on CSI 500 and S&P 500 over four years.
Introduction
The authors leverage large language models to automate alpha factor discovery in financial markets, where noise and non-stationarity make traditional methods brittle and prone to overfitting. Prior agentic frameworks improve automation but suffer from fragile controllability due to noisy feedback, limited reuse of validated insights, and narrow exploration that leads to factor crowding. QuantaAlpha addresses this by treating each mining run as an evolvable trajectory, applying mutation to fix suboptimal steps and crossover to recombine high-performing segments—enabling structured, traceable refinement. It also enforces semantic consistency and complexity constraints during generation to prevent drift and redundancy. Evaluated on CSI 300, it outperforms baselines with strong transferability to CSI 500 and S&P 500, demonstrating robustness under market shifts.
Dataset

- The authors use the CSI 300 dataset, covering 300 large-cap A-share stocks in China, with a chronological split: training (2016–2020), validation (2021), and testing (2022–2025).
- Backtesting extends to CSI 500 and S&P 500 indices using the Qlib framework, with data splits detailed in Table 5.
- Factor construction relies on six basic price and volume features (open, high, low, close, volume, vwap) to predict next-day returns, calculated as y_t = P_{t+2}^close / P_{t+1}^close - 1.
- Preprocessing includes forward-filling missing values, replacing infinities, dropping samples with missing labels, and applying cross-sectional rank normalization (CSRankNorm) to features and labels.
- Model evaluation uses two sets of metrics: factor predictive power (IC, ICIR, Rank IC, Rank ICIR) and strategy performance (ARR, IR, MDD, CR).
- Baselines include traditional ML, deep learning time-series models, classical factor libraries, and LLM-based agents like RD-Agent and AlphaAgent.
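The label construction and preprocessing steps above can be sketched as follows, assuming daily data in pandas DataFrames of shape dates × tickers; `make_label` and `cs_rank_norm` are illustrative names, and the rank normalization approximates Qlib's CSRankNorm processor rather than reproducing it exactly.

```python
import numpy as np
import pandas as pd

def make_label(close: pd.DataFrame) -> pd.DataFrame:
    """Next-day return label y_t = P^close_{t+2} / P^close_{t+1} - 1.

    `close` is dates x tickers; the label at day t uses the two following
    closes, so the last two rows become NaN and are dropped downstream.
    """
    return close.shift(-2) / close.shift(-1) - 1.0

def cs_rank_norm(df: pd.DataFrame) -> pd.DataFrame:
    """Cross-sectional rank normalization (CSRankNorm-style): rank each
    day's values across tickers, then center and rescale."""
    ranks = df.rank(axis=1, pct=True)       # percentile rank per date
    return (ranks - 0.5) * np.sqrt(12.0)    # ~zero-mean, unit-variance if ranks are uniform

dates = pd.date_range("2022-01-03", periods=6, freq="B")
close = pd.DataFrame(
    100.0 + np.arange(6)[:, None] + np.array([0.0, 1.0, 2.0]),
    index=dates, columns=["AAA", "BBB", "CCC"],
)
label = make_label(close).replace([np.inf, -np.inf], np.nan).dropna(how="all")
features = cs_rank_norm(close)
print(label.shape, features.shape)
```

Forward-filling and infinity replacement would be applied to the raw features the same way before normalization.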
Method
The authors leverage a multi-agent, hypothesis-driven framework called QuantaAlpha to systematically construct and evolve alpha factors for quantitative trading. Rather than treating alpha mining as a static, one-shot model fitting task, they frame it as an iterative, agentic research workflow that generates and refines mining trajectories—ordered sequences of states and actions—from initial context to final evaluated factor. The core architecture is structured around four components: diversified planning initialization, factor realization with constraint gating, self-evolution via mutation and crossover, and a final factor pool that consolidates validated outputs.
Refer to the framework diagram, which contrasts QuantaAlpha with traditional machine learning and agent-based baselines. The system begins with a seed factor pool, from which an initialization agent generates a diversified set of market hypotheses. These hypotheses are then instantiated into executable factors through a symbolic intermediate representation, ensuring semantic fidelity and structural control. Each factor undergoes backtesting and is evaluated for predictive performance and regularization penalties. The resulting trajectories are then subjected to evolutionary operators—mutation and crossover—that iteratively refine the search space by revising suboptimal decisions or recombining high-performing segments from parent trajectories.

The factor realization module is central to maintaining controllability and interpretability. Given a hypothesis h, the factor agent maps it to a structured semantic description d, which formalizes the intended mechanism using a standardized operator library O. This description is then assembled into a symbolic expression f, parsed into an Abstract Syntax Tree (AST) T(f), and compiled into executable code c. Leaf nodes in the AST bind to raw features (e.g., high, volume), while internal nodes correspond to operators such as TS_MIN, SMA, or RANK, making the computational graph transparent. To ensure fidelity, an LLM-based verifier checks alignment between the hypothesis, semantic description, and symbolic expression, as well as between the symbolic form and generated code. If inconsistencies are detected, the system regenerates or repairs the offending component.
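A minimal sketch of the AST representation this module relies on, assuming a hypothetical `Node` type and a small operator subset; the real system's operator library and LLM-based verifier are far richer.

```python
from dataclasses import dataclass

RAW_FEATURES = {"open", "high", "low", "close", "volume", "vwap"}
OPERATORS = {"TS_MIN", "TS_MAX", "SMA", "RANK", "SUB", "DIV"}  # illustrative subset

@dataclass(frozen=True)
class Node:
    """One node of the factor AST: operators are internal nodes, raw
    features and numeric constants are leaves."""
    name: str
    children: tuple = ()

def leaves(node: Node) -> set:
    """Collect all leaf names (raw features and constants) in the tree."""
    if not node.children:
        return {node.name}
    return set().union(*(leaves(c) for c in node.children))

def validate(node: Node) -> bool:
    """Structural check: internal nodes must be known operators, leaves
    must be raw features or numeric constants."""
    if node.children:
        return node.name in OPERATORS and all(validate(c) for c in node.children)
    return node.name in RAW_FEATURES or node.name.replace(".", "").isdigit()

# RANK(SMA(DIV(close, vwap), 20)) as a nested AST
f = Node("RANK", (Node("SMA", (Node("DIV", (Node("close"), Node("vwap"))), Node("20"))),))
print(validate(f), sorted(leaves(f)))  # constants count as leaves here
```

Semantic verification (hypothesis ↔ description ↔ expression ↔ code) sits on top of this purely structural check.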
To promote parsimony and novelty, the authors impose explicit structural constraints. Complexity is quantified as C(f) = α₁·SL(f) + α₂·PC(f) + α₃·log(1 + |F_f|), where SL(f) is the symbolic length, PC(f) counts free parameters, and F_f is the set of raw features used. Redundancy is measured via AST isomorphism: for a candidate factor f and an existing alpha zoo Z, the maximum structural similarity is computed as S(f) = max_{φ∈Z} s(f, φ), where s(f, φ) is the size of the largest common isomorphic subtree. Factors violating complexity or redundancy thresholds are rejected and rewritten.
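The two constraints can be illustrated with a simplified implementation, assuming ASTs encoded as nested tuples; note the redundancy check here matches whole subtrees verbatim, a weaker stand-in for the paper's largest-common-isomorphic-subtree computation, and the α weights are made up.

```python
import math

def collect(node, acc):
    """Return (canonical form, size) of `node` and record every subtree in
    `acc`. ASTs are nested tuples (op, child, ...) with string leaves."""
    if isinstance(node, str):
        canon, size = node, 1
    else:
        parts = [collect(c, acc) for c in node[1:]]
        canon = node[0] + "(" + ",".join(p[0] for p in parts) + ")"
        size = 1 + sum(p[1] for p in parts)
    acc[canon] = size
    return canon, size

def complexity(node, n_params, alphas=(1.0, 0.5, 1.0)):
    """C(f) = a1*SL(f) + a2*PC(f) + a3*log(1 + |F_f|), with SL the AST
    size and F_f the raw features among the leaves (constants excluded)."""
    acc = {}
    _, sl = collect(node, acc)
    feats = {k for k, v in acc.items() if v == 1 and k.isalpha()}
    a1, a2, a3 = alphas
    return a1 * sl + a2 * n_params + a3 * math.log(1 + len(feats))

def max_shared_subtree(f, zoo):
    """S(f): size of the largest subtree of f appearing verbatim in some
    zoo factor -- a simplification of common-subtree isomorphism."""
    mine = {}
    collect(f, mine)
    best = 0
    for z in zoo:
        theirs = {}
        collect(z, theirs)
        shared = set(mine) & set(theirs)
        if shared:
            best = max(best, max(mine[c] for c in shared))
    return best

f = ("RANK", ("SMA", ("DIV", "close", "vwap"), "20"))
zoo = [("RANK", ("TS_MIN", ("DIV", "close", "vwap"), "10"))]
print(round(complexity(f, n_params=1), 3), max_shared_subtree(f, zoo))
```

A candidate would be rejected and rewritten whenever either score exceeds its threshold.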

The self-evolution phase drives iterative improvement. Mutation targets a suboptimal decision node k in a trajectory τ and rewrites only the localized action a_k, preserving the prefix up to state s_k and regenerating subsequent steps to maintain coherence. This allows for mechanism-level refinements such as altering time scales or adding regime conditions. Crossover synthesizes a new child trajectory by combining high-performing segments from multiple parent trajectories, explicitly inheriting validated decisions. For example, one parent may contribute a hypothesis template for retail-driven momentum, while another contributes a structural pattern for institutional validation; the crossover operator merges these into a unified, regime-aware dual-source factor.
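The two operators reduce to simple list surgery once a trajectory is viewed as an ordered sequence of steps; in this sketch `rewrite` and `regenerate` are placeholders for the LLM calls that produce the revised action and a coherent suffix.

```python
def regenerate(prefix, n):
    """Placeholder: the real system re-plans the remaining n steps so the
    suffix stays coherent with the rewritten prefix."""
    return [f"regen_{i}" for i in range(n)]

def mutate(trajectory, k, rewrite):
    """Trajectory-level mutation: keep the prefix before step k, rewrite
    the action at k, regenerate the suffix from the new state."""
    prefix = trajectory[:k]
    new_step = rewrite(trajectory[k])
    return prefix + [new_step] + regenerate(prefix + [new_step], len(trajectory) - k - 1)

def crossover(parent_a, parent_b, cut_a, cut_b):
    """Trajectory-level crossover: inherit the validated prefix of one
    parent and a high-reward suffix segment of the other."""
    return parent_a[:cut_a] + parent_b[cut_b:]

pa = ["hypo_retail_momentum", "expr_A", "code_A", "eval_A"]
pb = ["hypo_inst_momentum", "expr_B", "code_B", "eval_B"]
child = crossover(pa, pb, cut_a=1, cut_b=1)
mutant = mutate(pa, k=1, rewrite=lambda s: s + "_longer_window")
print(child)
print(mutant)
```

In the actual framework each step carries a state/action pair and backtest feedback, and the cut points are chosen by the evaluation agent rather than fixed indices.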

The evolutionary process is demonstrated in a case study where a factor named Institutional_Momentum_Score_20D emerges from a crossover operation combining insights from two parent trajectories: one focused on fragile retail momentum and the other on sustainable institutional momentum. The synthesized hypothesis introduces dynamic weighting by market volatility, amplifying institutional signals in stable regimes and retail reversal signals in turbulent ones. The resulting factor expression, IMS_20D = RANK(ρ_20(ΔP/P, ΔV/V) × ((C − O)/C)^5), captures institutional-driven momentum through price-volume correlation and intraday return patterns, with cross-sectional ranking ensuring comparability.
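Under our reading of the (typeset-damaged) expression as RANK(corr_20(ΔP/P, ΔV/V) × ((C − O)/C)^5), the factor can be computed with pandas as follows; the exact functional form in the paper may differ, so treat this as an interpretive sketch.

```python
import numpy as np
import pandas as pd

def ims_20d(close, open_, volume):
    """Sketch of an Institutional_Momentum_Score_20D-style factor:
    20-day rolling price/volume co-movement, scaled by a fifth-power
    intraday-return term, then cross-sectionally ranked per date.
    Inputs are dates x tickers DataFrames."""
    ret = close.pct_change()            # dP / P
    dvol = volume.pct_change()          # dV / V
    corr = ret.rolling(20).corr(dvol)   # per-ticker rolling correlation
    intraday = ((close - open_) / close) ** 5
    raw = corr * intraday
    return raw.rank(axis=1, pct=True)   # cross-sectional rank per date

rng = np.random.default_rng(0)
idx = pd.date_range("2022-01-03", periods=60, freq="B")
cols = ["AAA", "BBB", "CCC"]
close = pd.DataFrame(100 * np.exp(np.cumsum(rng.normal(0, 0.01, (60, 3)), axis=0)), idx, cols)
open_ = close * (1 + rng.normal(0, 0.003, (60, 3)))
volume = pd.DataFrame(rng.lognormal(10, 0.3, (60, 3)), idx, cols)
factor = ims_20d(close, open_, volume)
print(factor.dropna().shape)
```

The first 20 rows are NaN from the rolling window; downstream the factor values would feed the same IC/Rank IC evaluation pipeline as any other candidate.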

The lineage of this factor is traceable: it originates from Parent 1, which identified unsustainable retail momentum, and Parent 2, which validated institutional structural trends. The crossover operation explicitly recombines these validated segments, producing an offspring with improved Rank IC (0.0311) over both parents (0.0216 and 0.0246). This demonstrates how the framework enables not just performance improvement but also conceptual synthesis, preserving the core market hypotheses while enhancing predictive power through structured evolution.

Experiment
- QuantaAlpha outperforms all baselines in predictive power and strategy performance on CSI 300, demonstrating robustness across market regimes and real-world viability under standard risk controls.
- Evolutionary components—diversified initialization, mutation, and crossover—collectively enhance exploration, repair, and reuse of high-performing factor trajectories, with mutation being critical for escaping local optima.
- Semantic consistency, complexity control, and redundancy filtering during factor generation are essential for stable, generalizable factor discovery; removing any degrades performance, especially at the strategy level.
- QuantaAlpha exhibits strong out-of-distribution generalization, sustaining performance on CSI 500 and S&P 500 without retraining, unlike baselines that fail under market regime shifts.
- During the 2023 market transition to small-cap and thematic stocks, QuantaAlpha maintains predictive power by discovering structural factors tied to overnight gaps, volatility clustering, and trend quality—aligning with evolving microstructure.
- Factor diversity through semantic mutation allows QuantaAlpha to adapt to regime changes, avoiding concentration on outdated market hypotheses and mitigating alpha decay.
- Iterative evolution improves factor quality efficiently, with performance stabilizing around 11–12 iterations; beyond this, diminishing returns and redundancy degrade risk-adjusted performance.
- Crossover operations enhance predictive accuracy but may increase drawdown, indicating a trade-off that requires regime-adaptive weighting for optimal risk-return balance.
The authors use an ablation study to isolate the contributions of planning, mutation, and crossover in their evolutionary factor mining framework. Results show that removing mutation causes the largest drop in predictive power and strategy returns, while removing planning primarily degrades risk-adjusted performance, and removing crossover leads to moderate but consistent declines. This confirms that all three components are essential, with mutation driving exploration, planning stabilizing search, and crossover enabling efficient reuse of successful patterns.

The authors use a factor evaluation agent to assess predictive power and strategy performance, revealing that QuantaAlpha maintains higher coverage and a greater proportion of factors with positive and statistically meaningful Rank IC compared to AlphaAgent. Results show QuantaAlpha’s factors exhibit stronger overall predictive consistency and a heavier right tail in performance distribution, indicating more robust and diverse signal generation under market shifts. This suggests the system’s evolutionary design and semantic controls help sustain factor quality and generalizability beyond specific market regimes.

The authors use a structured evaluation to compare factor performance across different semantic categories, revealing that QuantaAlpha’s factors excel in capturing overnight market dynamics, trend quality, and liquidity signals, while underperforming factors often rely on rigid or noise-sensitive mechanisms. Results show that strong performers align with persistent microstructure effects like volatility clustering and auction-driven price discovery, whereas weak ones degrade under regime shifts due to overfitting or lack of adaptive conditioning. This pattern confirms that robust factor design requires semantic alignment with market structure and diversity across information channels, not just statistical fit.

The authors use a crossover operation to combine factor trajectories, resulting in an offspring factor that improves predictive power and annualized excess return over the baseline. However, this gain comes with increased maximum drawdown, indicating higher risk exposure during volatile market conditions. The results suggest that while combining signals enhances returns, it requires additional regime-adaptive controls to maintain risk-adjusted performance.

The authors use QuantaAlpha to generate and evolve trading factors through a trajectory-based evolutionary framework, achieving superior predictive power and strategy performance across multiple large language models. Results show that QuantaAlpha consistently outperforms both traditional machine learning models and prior LLM-based agents, particularly in maintaining high returns with controlled drawdowns under real-world trading constraints. The system’s gains stem from structured factor generation, semantic consistency controls, and evolutionary mechanisms that enhance exploration and reuse of successful patterns.
