Methodology

How portfolios are built, tracked, and evaluated

Overview

LLM Portfolio Lab tracks hypothetical equity portfolios constructed from the publicly disclosed stock picks of major large language models (LLMs). Each portfolio is evaluated using standard quantitative finance methods, including daily net asset value (NAV) tracking and Fama-French five-factor (FF5) regression.

Portfolio Construction

Each portfolio represents a set of ETF and stock allocations attributed to a specific LLM (e.g. ChatGPT, Claude, Gemini).

Daily NAV is computed as:

NAV(t) = 100 × Σ_i [ w_i × ( P_i(t) / P_i(t₀) ) ]

w_i—Portfolio weight of asset i (weights sum to 1)

P_i(t)—Adjusted closing price of asset i on day t

P_i(t₀)—Adjusted closing price of asset i on the inception date t₀

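All portfolios start at a base value of $100. The first-day return is set to zero so that the NAV begins exactly at the base. Each subsequent NAV reflects the cumulative price change of each holding since inception, weighted by its inception weight, which is equivalent to buying and holding the initial allocation without rebalancing.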
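As a minimal sketch of the NAV calculation described above, the following scales the weighted sum of price relatives by the $100 base so that the first value is exactly 100. The function names, tickers, and prices are hypothetical and only illustrate the arithmetic; this is not the site's actual code.

```python
from typing import Dict, List

BASE_VALUE = 100.0  # base NAV stated in the text


def nav_series(weights: Dict[str, float],
               prices: Dict[str, List[float]],
               base: float = BASE_VALUE) -> List[float]:
    """Return the NAV for each day, with NAV on day 0 equal to `base`.

    prices[ticker][0] is assumed to be the adjusted close on the
    inception date t0, and all series are assumed to be date-aligned.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_days = len(next(iter(prices.values())))
    if any(len(series) != n_days for series in prices.values()):
        raise ValueError("all price series must have the same length")

    nav = []
    for t in range(n_days):
        # Weighted sum of price relatives P_i(t) / P_i(t0)
        growth = sum(w * prices[ticker][t] / prices[ticker][0]
                     for ticker, w in weights.items())
        nav.append(base * growth)
    return nav


def daily_returns(nav: List[float]) -> List[float]:
    """Daily simple returns of the NAV; the first-day return is zero."""
    return [0.0] + [nav[t] / nav[t - 1] - 1.0 for t in range(1, len(nav))]


# Hypothetical two-asset portfolio with made-up prices
weights = {"AAA": 0.6, "BBB": 0.4}
prices = {"AAA": [50.0, 55.0, 52.5], "BBB": [200.0, 190.0, 210.0]}
nav = nav_series(weights, prices)  # approximately [100.0, 104.0, 105.0]
```

On the second day, asset AAA is up 10% and BBB is down 5%, so the NAV is 100 × (0.6 × 1.10 + 0.4 × 0.95) = 104.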

Data Sources & Update Frequency

Source	Data	Refresh
Yahoo Finance (yfinance)	Daily OHLCV price data	Daily on request
Ken French Data Library	FF5 daily factor returns	On /api/ff5/sync
SQLite (local)	NAV series, FF5 factors, regression results	Persistent cache

Note: The Ken French Data Library updates monthly. Factor data is downloaded from the official zip file at mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library

Fama-French 5-Factor Regression

Each portfolio's excess return is regressed against the five Fama-French factors to decompose performance into systematic risk exposures and idiosyncratic alpha.

R_p(t) − RF(t) = α + β₁·MKT(t) + β₂·SMB(t) + β₃·HML(t)
+ β₄·RMW(t) + β₅·CMA(t) + ε(t)

R_p(t)—Portfolio daily return

RF(t)—Risk-free rate (1-month T-bill, from Ken French data)

MKT—Market excess return (Mkt-RF)

SMB—Small Minus Big — size factor

HML—High Minus Low — value factor

RMW—Robust Minus Weak — profitability factor

CMA—Conservative Minus Aggressive — investment factor

α—Excess return not explained by factors (Jensen's alpha)

Implementation notes:

Regression is run using statsmodels OLS with a constant term
Minimum 30 trading days of overlapping data required
Results stored in the ff5_regressions table in SQLite
Cross-validated against numpy.linalg.lstsq; max coefficient difference < 1×10⁻¹⁰
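
The regression above can be illustrated with a dependency-free sketch. Note that this is not the statsmodels call the notes refer to: it solves the ordinary least squares normal equations directly in pure Python, which is enough to show what the estimated coefficients mean. The function names and the synthetic data are hypothetical, and the inputs are assumed to be date-aligned and expressed in the same units (for example, all decimal returns).

```python
import random
from typing import List, Sequence

MIN_OBS = 30  # minimum overlapping trading days, as stated in the notes


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ff5_ols(excess_returns: Sequence[float],
            factors: Sequence[Sequence[float]]) -> List[float]:
    """Return [alpha, b_mkt, b_smb, b_hml, b_rmw, b_cma].

    excess_returns[t] is R_p(t) - RF(t); factors[t] is the row
    [MKT, SMB, HML, RMW, CMA] for the same day.
    """
    if len(excess_returns) != len(factors):
        raise ValueError("returns and factors must be date-aligned")
    if len(excess_returns) < MIN_OBS:
        raise ValueError(f"need at least {MIN_OBS} overlapping days")
    x = [[1.0] + list(row) for row in factors]  # prepend the constant term
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, excess_returns)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic, noise-free data with known coefficients (illustration only)
true_coefs = [0.0002, 1.1, -0.2, 0.3, 0.1, -0.05]
rng = random.Random(0)
factors = [[rng.gauss(0.0, 0.01) for _ in range(5)] for _ in range(60)]
excess = [true_coefs[0] + sum(b * f for b, f in zip(true_coefs[1:], row))
          for row in factors]
estimated = ff5_ols(excess, factors)
```

Because the synthetic returns contain no noise, the estimated coefficients match the true ones up to floating-point error. With real data the residual ε(t) is nonzero and a library such as statsmodels additionally reports standard errors and t-statistics.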

Radar Chart Score Methodology

Raw factor loadings are normalized to a 0–100 scale for visualization in the radar chart on the Outlook page. Each factor's expected range is mapped linearly to [0, 100] then clamped.

Axis	Source	Raw Range	Formula
Market Beta	β_mkt	0.5 – 1.5	(β − 0.5) / 1.0 × 100
Size Tilt (SMB)	β_smb	−0.5 – 0.5	(β + 0.5) / 1.0 × 100
Value Tilt (HML)	β_hml	−0.5 – 0.5	(β + 0.5) / 1.0 × 100
Profitability (RMW)	β_rmw	−0.5 – 0.5	(β + 0.5) / 1.0 × 100
Investment (CMA)	β_cma	−0.5 – 0.5	(β + 0.5) / 1.0 × 100
Alpha (annualized)	α × 252	−5% – +5%	(α_ann + 0.05) / 0.10 × 100

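Note: All scores are clamped to [0, 100]. A score of 50 corresponds to the midpoint of the raw range: a zero loading for the SMB, HML, RMW, and CMA axes, zero annualized alpha, and a market beta of 1.0. Scores above 50 indicate a positive loading on those factors or, for Market Beta, a beta above 1.0.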
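
The mapping in the table can be sketched as a single linear rescale followed by a clamp. The function and dictionary names below are hypothetical, the input loadings are made up, and alpha is assumed to be a daily value in decimal form that is annualized by multiplying by 252, as the table indicates.

```python
from typing import Dict, Tuple

TRADING_DAYS = 252  # annualization factor used in the table

# Raw ranges taken from the table: axis -> (low, high)
RANGES: Dict[str, Tuple[float, float]] = {
    "Market Beta": (0.5, 1.5),
    "Size Tilt (SMB)": (-0.5, 0.5),
    "Value Tilt (HML)": (-0.5, 0.5),
    "Profitability (RMW)": (-0.5, 0.5),
    "Investment (CMA)": (-0.5, 0.5),
    "Alpha (annualized)": (-0.05, 0.05),
}


def radar_score(value: float, low: float, high: float) -> float:
    """Map value linearly from [low, high] to [0, 100], then clamp."""
    score = (value - low) / (high - low) * 100.0
    return max(0.0, min(100.0, score))


def radar_scores(alpha_daily: float, betas: Dict[str, float]) -> Dict[str, float]:
    """Score every radar axis from a daily alpha and the five factor betas."""
    raw = dict(betas)
    raw["Alpha (annualized)"] = alpha_daily * TRADING_DAYS
    return {axis: radar_score(raw[axis], *RANGES[axis]) for axis in RANGES}


# Made-up loadings, chosen to show both in-range and clamped values
example_betas = {
    "Market Beta": 1.2,
    "Size Tilt (SMB)": -0.1,
    "Value Tilt (HML)": 0.7,      # above the range, clamps to 100
    "Profitability (RMW)": 0.0,   # midpoint of the range, scores 50
    "Investment (CMA)": -0.6,     # below the range, clamps to 0
}
scores = radar_scores(alpha_daily=0.0001, betas=example_betas)
```

In this example the market beta of 1.2 scores about 70, and the daily alpha of 0.0001 annualizes to 2.52%, which scores about 75.2.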