The pipeline behind every report
When you type a company name, we fetch multiple data sources in parallel and run an AI synthesis pass that produces the final report. Roughly 30–60 seconds later, you have a structured report with inline citations. Here's exactly what runs.
Architecture overview
Sentinellis is built on Temporal workflows — a durable execution engine. Every report is a workflow with automatic retry policies, timeouts, and failure recovery. If a data source is briefly down, Temporal retries with exponential backoff. If it's permanently unavailable, the workflow completes with a partial result and a confidence score reflecting what we couldn't fetch.
The orchestrator (AnalyzeCompanyWorkflow) launches the data fetches in parallel, waits for all of them (with bounded timeouts), then hands the combined output to the AI synthesis step.
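The fan-out pattern can be sketched in plain Python. This is a minimal illustration, not the real Temporal workflow: the fetcher names and the 8-second timeout are assumptions, and Temporal handles retries and durability that this sketch omits.

```python
import asyncio

# Hypothetical fetchers standing in for the real data-source activities.
async def fetch_financials(company):
    return {"source": "financials", "company": company}

async def fetch_news(company):
    return {"source": "news", "company": company}

async def fetch_executives(company):
    return {"source": "executives", "company": company}

async def gather_sources(company, timeout_s=8.0):
    """Fan out to every source in parallel; a slow or failing source
    yields None instead of sinking the whole report (fail-soft)."""
    async def bounded(coro):
        try:
            return await asyncio.wait_for(coro, timeout=timeout_s)
        except Exception:
            return None  # the gap is reflected in the confidence score
    return await asyncio.gather(
        bounded(fetch_financials(company)),
        bounded(fetch_news(company)),
        bounded(fetch_executives(company)),
    )

results = asyncio.run(gather_sources("ACME"))
```

`asyncio.gather` preserves argument order, so the synthesis step can rely on positional results even when some slots are `None`.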
Pipeline stages
Profile & financials
Fundamentals + 100+ derived metrics
Ticker resolution, market cap, revenue, P/E, balance sheet, income statement, cash flow. We compute 100+ derived metrics (ROIC, WACC, Piotroski, Altman Z, margins, growth CAGRs) from the underlying statements.
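As one concrete example of a derived metric, here is the classic 1968 Altman Z-score computed from raw statement lines. This is the textbook formula for public manufacturers, shown for illustration; the production implementation may use a variant.

```python
def altman_z(working_capital, retained_earnings, ebit, market_cap,
             sales, total_assets, total_liabilities):
    """Classic Altman Z-score (1968 coefficients).
    Z > 2.99 is roughly the safe zone; Z < 1.81 signals distress."""
    ta = total_assets
    return (1.2 * working_capital / ta
            + 1.4 * retained_earnings / ta
            + 3.3 * ebit / ta
            + 0.6 * market_cap / total_liabilities
            + 1.0 * sales / ta)

z = altman_z(working_capital=200, retained_earnings=300, ebit=150,
             market_cap=1000, sales=900,
             total_assets=1000, total_liabilities=400)
```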
Recent news
Multi-source, relevance-ranked
We pull from multiple licensed news sources in parallel and run a multi-stage relevance pipeline: HTML stripped, recency validated, each article scored, near-duplicates merged, then a per-article plain-English summary generated. Higher-tier publishers (Reuters, Bloomberg, FT) get a small ranking boost.
Leadership & compensation
CEO, CFO, exec team + total comp
Officer roster, roles, and total compensation. Fail-soft when individual fields are missing: we return what we have rather than blocking the report.
AI synthesis
Plain-English report assembly
Claude Sonnet 4.5 receives a curated subset of the financial output (16 key metrics: margins, FCF, ROIC, health score, Altman Z, Piotroski) plus the per-article summaries. It returns strict JSON with a summary, what they do, a financial-health narrative, 4 risks, 4 opportunities, a confidence score (1–10), and inline [Source] citations on every claim.
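The shape of that strict output can be sketched with a small validator. The field names here are illustrative, not the exact production schema; the point is that malformed outputs are rejected before they reach the reader.

```python
# Illustrative report shape; real field names may differ.
REPORT_SCHEMA = {
    "summary": str,
    "what_they_do": str,
    "financial_health": str,
    "risks": list,          # exactly 4 entries
    "opportunities": list,  # exactly 4 entries
    "confidence": int,      # 1-10
}

def validate_report(report):
    """Reject outputs that don't match the strict shape; callers re-prompt."""
    for key, typ in REPORT_SCHEMA.items():
        if not isinstance(report.get(key), typ):
            return False
    return (len(report["risks"]) == 4
            and len(report["opportunities"]) == 4
            and 1 <= report["confidence"] <= 10)
```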
News relevance pipeline
Most news APIs return noise — irrelevant articles, duplicates, snippets that are HTML soup. Sentinellis runs every article through several stages before it reaches the AI synthesis step.
Fetch (parallel)
Two queries in parallel — company name + ticker — to NewsData.io with 8s timeout each. Dedup by URL.
Recency validation
Drop articles outside the 7-day window. NewsData.io filtering isn't always strict, so we re-validate post-fetch.
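The post-fetch re-validation amounts to a simple timestamp check. A minimal sketch, assuming timezone-aware datetimes:

```python
from datetime import datetime, timedelta, timezone

def within_window(published_at, now=None, days=7):
    """Post-fetch recency re-validation: upstream date filters aren't
    always strict, so drop anything outside the window ourselves."""
    now = now or datetime.now(timezone.utc)
    return now - published_at <= timedelta(days=days)
```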
Relevance scoring
Each surviving article is scored 1–10 by Claude Haiku in parallel; articles scoring below 6 are dropped.
Story deduplication
Group near-duplicate stories. Prefer the higher-tier source (T1 Reuters/Bloomberg/AP over T2 CNBC/Forbes over T3 blog aggregators).
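Picking the canonical article from a duplicate group reduces to a minimum over source tiers. The tier table below is illustrative, not the full production list:

```python
# Illustrative tier table; lower number = higher-trust publisher.
# Unknown outlets default to tier 3 (blogs and aggregators).
TIER = {"reuters": 1, "bloomberg": 1, "ap": 1, "cnbc": 2, "forbes": 2}

def pick_canonical(group):
    """From a group of near-duplicate stories, keep the article
    published by the highest-tier source."""
    return min(group, key=lambda article: TIER.get(article["source"], 3))
```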
Volume cap
Top 8 articles by relevance score, then by recency. Free-tier NewsData.io caps at 10 results per query, so we usually have headroom.
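The cap is a two-key sort followed by a slice. A sketch, assuming each article carries a relevance score and a publication timestamp:

```python
def cap_articles(articles, limit=8):
    """Keep the top `limit` articles: relevance score first (descending),
    recency second (newest first) as the tie-breaker."""
    ranked = sorted(articles, key=lambda a: (-a["score"], -a["published_ts"]))
    return ranked[:limit]
```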
Per-article summaries
Each article summarized to one paragraph in plain English via Claude Haiku in parallel. The AI synthesis step uses these summaries — never the raw snippet.
Synthesis & inline citations
Once the data fetches complete, the AI synthesis step receives their output and produces the final report.
Model
claude-sonnet-4-5-20250929 with max_tokens=2048 and a strict JSON schema that enforces inline [Source] tags on every factual claim.
Citations enforced at the schema level
The schema requires every risk, opportunity, financial claim, and historical statement to carry an inline citation pointing back to either Yahoo Finance, an article ID, or the company overview. Outputs missing citations are rejected and re-prompted.
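The citation check itself can be a single regex pass over each field. The tag formats below ([Yahoo Finance], [article:id], [Company Overview]) are assumptions for illustration; the production schema may encode sources differently.

```python
import re

# Assumed citation tag formats, for illustration only.
CITATION = re.compile(r"\[(Yahoo Finance|article:[\w-]+|Company Overview)\]")

def uncited_fields(fields):
    """Return the names of fields missing an inline citation;
    a non-empty result triggers a rejection and re-prompt."""
    return [name for name, text in fields.items()
            if not CITATION.search(text)]
```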
What synthesis can't do
The system prompt explicitly blocks: analyst price targets, buy/sell/hold ratings, personalized investment advice, and speculation beyond what the source data supports. If a metric is missing, the report flags it rather than guessing.
Cache, retries, fail-soft
Redis cache (24h)
If the same company was analyzed in the last 24 hours, the cached report returns instantly and doesn't count against your quota. Fail-open: cache outage just means a fresh run.
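Fail-open caching means wrapping every cache call in a try/except that falls through to a fresh run. A sketch, assuming a redis-like client with `get` and `set(..., ex=ttl)`:

```python
import json

def get_report(company, cache, run_pipeline, ttl_s=86400):
    """Fail-open cache wrapper: any cache error falls through
    to a fresh pipeline run instead of failing the request."""
    key = f"report:{company.lower()}"
    try:
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)  # cache hit: no quota charge
    except Exception:
        pass  # cache outage just means a fresh run
    report = run_pipeline(company)
    try:
        cache.set(key, json.dumps(report), ex=ttl_s)  # 24h TTL
    except Exception:
        pass  # failing to cache must not fail the report
    return report
```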
Bounded retries
Each data fetch retries up to 3 times on transient errors (429, 5xx, timeouts). Permanent errors (400, 401, 403) are non-retryable — we fail fast and report the gap.
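The retry/fail-fast split comes down to classifying the status code before deciding whether to back off. A minimal sketch with exponential backoff (the exact delays and error types are assumptions):

```python
import time

RETRYABLE = {429, 500, 502, 503, 504}      # transient: retry with backoff
NON_RETRYABLE = {400, 401, 403}            # permanent: fail fast

class FetchError(Exception):
    def __init__(self, status):
        super().__init__(f"fetch failed with status {status}")
        self.status = status

def fetch_with_retries(fetch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry transient errors with exponential backoff;
    fail fast on permanent errors and report the gap."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except FetchError as e:
            if e.status in NON_RETRYABLE or attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```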
Fail-soft synthesis
If Claude returns a hard error (auth, malformed JSON, content policy), the report still ships with financials, news, and executives — just with an "analysis unavailable" banner instead of a missing section.
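The fail-soft assembly step is just conditional composition: ship the structured sections regardless, and swap in a banner when the analysis is missing. A sketch with illustrative field names:

```python
def assemble_report(financials, news, executives, synthesis):
    """Ship what we have: if synthesis failed, attach a banner
    instead of silently dropping the analysis section."""
    report = {"financials": financials, "news": news,
              "executives": executives, "analysis": synthesis}
    if synthesis is None:
        report["banner"] = "analysis unavailable"
    return report
```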
Performance
~32s typical end-to-end report time (cold cache)
<200ms cached report response
80s hard cap on total workflow time
24h report cache TTL
See it in action
Run your first report free. Three included on signup, no credit card required.
Get started