Data Methodology
Complete disclosure of data sources, formulas, cleaning processes, and AI agent architecture
All financial data comes from public, freely available sources. No proprietary or paywalled data feeds are used. Every data point can be traced back to its original source filing or API.
Primary Data Sources
| Source | Data Type | Update Frequency | Method |
|---|---|---|---|
| SEC EDGAR | XBRL Financial Facts (10-K, 10-Q) | Hourly round-robin (200 companies/batch) | REST API |
| SEC EDGAR | SEC Filings (all form types) | Hourly round-robin (500 companies/batch) | REST API |
| SEC EDGAR | Insider Transactions (Form 4) | Hourly round-robin (300 companies/batch) | REST API |
| SEC EDGAR | Institutional Holdings (Form 13F) | Quarterly + parallel per-filer dispatch | REST API + XML |
| US Treasury | Daily Yield Curve (13 maturities) | Every 4 hours | CSV download |
| Finnhub | Real-time Stock Prices | Every 15 minutes (~400 tickers/cycle) | REST API (free tier) |
| RSS Feeds | Financial News (15 feeds, including CNBC, MarketWatch, Reuters) | Every 15 minutes | feedparser |
Batch Processing & Scheduling
Data syncs use round-robin batch processing to cover the entire universe of 8,000+ companies without exceeding the SEC rate limit (10 requests/second). Each hourly cycle processes a different batch, so full coverage completes within 24 to 72 hours depending on data type. Stock price updates prioritize companies with active agent portfolio positions, then rotate through the top 1,500 companies.
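The rotation described above can be sketched roughly as follows. This is a minimal illustration, not the production scheduler: the function names (`batch_for_cycle`, `cycles_for_full_coverage`, `price_batch`) and the idea of indexing batches by an hourly cycle counter are assumptions made for the example.

```python
from typing import List, Sequence


def batch_for_cycle(universe: Sequence[str], batch_size: int, cycle: int) -> List[str]:
    """Return the slice of `universe` to process on a given cycle (hypothetical helper).

    Batches are consecutive slices; the cycle counter wraps around so that
    every item is visited once per full rotation.
    """
    if not universe or batch_size <= 0:
        return []
    n_batches = -(-len(universe) // batch_size)  # ceiling division
    start = (cycle % n_batches) * batch_size
    return list(universe[start:start + batch_size])


def cycles_for_full_coverage(universe_size: int, batch_size: int) -> int:
    """Number of cycles needed to visit every item once."""
    return -(-universe_size // batch_size)


def price_batch(held: Sequence[str], ranked: Sequence[str],
                capacity: int, cycle: int) -> List[str]:
    """Held tickers first, then rotate through the remaining ranked tickers.

    `held` stands in for tickers with active portfolio positions and `ranked`
    for the top-company list; both are assumed inputs for this sketch.
    """
    held_unique = list(dict.fromkeys(held))[:capacity]  # de-duplicate, keep order
    held_set = set(held_unique)
    rest = [t for t in ranked if t not in held_set]
    remaining = capacity - len(held_unique)
    return held_unique + batch_for_cycle(rest, remaining, cycle)
```

With 8,000 companies and a batch size of 200, `cycles_for_full_coverage(8000, 200)` gives 40, meaning 40 hourly cycles for one full pass of that data type.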
SEC EDGAR Rate Compliance
All SEC EDGAR requests include a User-Agent header per SEC guidelines. Request rate is limited to 2.0 requests/second per worker × 4 concurrent workers = 8 requests/second in total, which stays under the SEC's limit of 10 requests/second.
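One way to enforce a per-worker request rate is a simple interval-based limiter, sketched below using only the standard library. The class and function names (`RateLimiter`, `build_request`) and the `USER_AGENT` string are hypothetical placeholders, and the actual implementation may differ.

```python
import threading
import time
import urllib.request

PER_WORKER_RATE = 2.0  # requests/second per worker, as stated above
WORKERS = 4            # concurrent workers, as stated above
USER_AGENT = "ExampleApp admin@example.com"  # placeholder; use a real contact string


class RateLimiter:
    """Spaces successive calls at least 1/rate seconds apart (illustrative sketch)."""

    def __init__(self, rate_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_sec
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def acquire(self):
        """Block until the next request slot is available; return seconds waited."""
        with self._lock:
            now = self._clock()
            wait = max(0.0, self._next_allowed - now)
            # Reserve the next slot relative to whichever is later.
            self._next_allowed = max(now, self._next_allowed) + self.min_interval
        if wait > 0:
            self._sleep(wait)
        return wait


def build_request(url):
    """Attach the User-Agent header to an outgoing request."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

In this sketch each worker would hold its own `RateLimiter(PER_WORKER_RATE)` and call `acquire()` before every request, so the aggregate ceiling is `PER_WORKER_RATE * WORKERS`, or 8 requests/second.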