Contents

[Project] 4. Multi-Agent AI Investment Research — 16 Agents with Bull-vs-Bear Debate

One-Line Summary

16 specialized AI agents analyze a stock from every angle — macro, sector, fundamentals, technicals, sentiment, news, filings — then two agents debate bullish vs bearish cases while a judge moderates. Final buy/hold/sell recommendation backed by recomputable numbers. Supports Chinese A-shares, Hong Kong, and US equities.

System Overview

diagram

Core Design: Why 16 Agents?

Ask a single LLM “should I buy this stock?” and you get:

  • Hallucination: Made-up financial data
  • Tunnel vision: Only technical analysis, or only fundamentals
  • Unverifiable: No numeric evidence behind the claim

So we split into 16 specialized agents, each doing one thing, using real data instead of LLM-generated data:

AgentData SourceUses LLM?
Market Datayfinance APINo
Macro EnvironmentCSI 300 / Hang Seng indexNo
Sector Analysisakshare sector rankingsNo
News Collectoryfinance + DuckDuckGoNo
Announcementsakshare Caixin/EastmoneyNo
Social SentimentEastmoney forum scoresNo
Sentiment AnalysisNews from aboveYes
Fundamental AnalysisFinancial data from aboveYes
Momentum AnalysisMulti-horizon returnsNo
Quant SignalsMA/RSI/MACD/Bollinger/ATR/OBVNo (pure math)
Grid StrategyVolatility + fee modelNo
Bull DebaterAll data from aboveYes
Bear DebaterAll data from aboveYes
Debate JudgeDebate contentYes
Risk AssessmentAll data from aboveYes
AdvisoryAll data from aboveYes

Only 6 of 16 agents use LLM calls. The rest are deterministic. The LLM’s role is “analyst”, not “data source”.

Highlight 1: Bull-vs-Bear Debate Engine

Instead of one LLM saying “bullish” or “bearish”, two LLMs argue against each other:

diagram

Key constraint: Both sides must cite real data from the Quant agent (RSI, MACD, valuations) — no abstract arguments allowed. The judge checks argument quality and demands additional rounds if insufficient.

Highlight 2: Pure-Math Quant Referee

The Quant agent makes zero LLM calls — classical technical analysis computed deterministically:

1
2
3
4
Composite = MA_trend(25%) + RSI(15%) + MACD(20%) + Bollinger(10%)
            + ATR_volatility(10%) + Stochastic(10%) + OBV_volume(10%)

Output: score from -100 to +100

This score is immune to LLM hallucination. The Advisory agent applies numeric override when LLM judgment conflicts with the math:

1
2
if quant_score > 60 and llm_says == "sell":
    advisory = "Strong bullish math signal, but LLM suggests sell. Note the divergence."

Highlight 3: Grid Trading Strategy

Automatically calculates feasibility and expected returns for 4 grid variants:

StrategyGrid SpacingScenario
Short-termATR x 0.3Intraday oscillation
Medium-termATR x 0.8Weekly swing trading
Long-termATR x 1.5Monthly positioning
AccumulationATR x 2.0Bottom accumulation

Calculations include real A-share fees (stamp tax 0.05% + broker commission 0.025%), 100-share lot sizing (regulatory minimum), and monthly return estimates based on volatility-driven cycle frequency.

Highlight 4: Explainability

Every agent’s reasoning is captured in reasoning_chain. Users see the full decision path:

diagram

Every number (PEG, DCF, quant score, momentum) is recomputable from raw data — not LLM-generated, but calculated from real market data.

Data Sources: All Free, China-Accessible

SourceCoverageUsage
yfinanceGlobal (US/HK/A-shares)OHLCV, financials, news
akshareChinese A-sharesFilings, forum sentiment, sector rankings
DuckDuckGoGlobalNews search
DeepSeekLLMAnalysis reasoning (Anthropic-compatible API)

No Bloomberg, Wind, or paid terminals required.

Security Architecture

diagram
  • Input: Length limits, control character stripping, prompt injection detection
  • Output: PII redaction, system prompt leak filtering, suspicious URL blocking
  • Audit: Every security event logged with AuditKind (INPUT_BLOCKED / OUTPUT_FILTERED)
  • Token usage tracking per request for cost visibility

Deployment

PlatformMethodOne command
WindowsDouble-click run.batAuto-installs uv + Python + deps
Linuxbash deploy/install_linux.shsystemd service + daily timer
Email reportsQQ SMTPScheduled analysis to watchlist

Tech Stack

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
Language:        Python 3.11+
Agent framework: LangGraph (StateGraph, conditional edges, self-loops)
LLM:             DeepSeek (deepseek-chat) via OpenAI-compatible API
Market data:     yfinance + akshare
UI:              Streamlit
Email:           QQ SMTP_SSL (port 465)
Validation:      Pydantic v2
Package manager: uv (Rust-based)
Testing:         pytest (real API calls, no mocks)
Deployment:      systemd / Windows Task Scheduler

Project Structure

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
backend/
  agents/           # 16 agent sub-packages
    orchestrator/    # intent classification + security
    market_data/     # market data (yfinance)
    quant/           # quant signals (pure math)
    debate/          # bull vs bear debate
    debate_judge/    # debate quality control
    advisory/        # final recommendation + numeric override
    ...
  security/          # input sanitizer / PII / output filter
  observability/     # token tracker / audit trail
  graph.py           # LangGraph StateGraph builder
  state.py           # ResearchState + Pydantic models
  llm_client.py      # DeepSeek wrapper
frontend/
  app.py             # Streamlit chat UI
deploy/
  install_linux.sh   # one-click Linux deployment
scripts/
  run.bat            # zero-dependency Windows launcher
  scheduled_analysis.py  # scheduled task entry

One-Paragraph Summary

A 16-agent investment research system. Six parallel collectors fetch real market data (no LLM-generated data), a pure-math Quant agent provides hallucination-immune signals, a Bull-vs-Bear debate engine forces two LLMs to argue with evidence while a judge controls quality, and the Advisory agent synthesizes all dimensions into a recommendation — with automatic numeric override when LLM judgment conflicts with math signals. The entire chain is explainable: every number is recomputable from raw data, and every decision step is traceable.