The pipeline behind every report
When you type a company name, we fetch multiple data sources in parallel and run an AI synthesis pass that produces the final report. Roughly 30–60 seconds later, you have a structured report with inline citations. Here's exactly what runs.
Architecture overview
Sentinellis is built on Temporal workflows — a durable execution engine. Every report is a workflow with automatic retry policies, timeouts, and failure recovery. If a data source is briefly down, Temporal retries with exponential backoff. If it's permanently unavailable, the workflow completes with a partial result and a confidence score reflecting what we couldn't fetch.
The orchestrator (AnalyzeCompanyWorkflow) launches the data fetches in parallel, waits for all of them (with bounded timeouts), then hands the combined output to the AI synthesis step.
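The fan-out pattern can be sketched in plain Python. This is a minimal illustration, not the real Temporal workflow: the fetcher names and the 8-second timeout are assumptions, and Temporal handles retries and durability that this sketch omits.

```python
import asyncio

# Hypothetical fetchers standing in for the real data-source activities.
async def fetch_financials(company):
    return {"source": "financials", "company": company}

async def fetch_news(company):
    return {"source": "news", "company": company}

async def fetch_executives(company):
    return {"source": "executives", "company": company}

async def gather_sources(company, timeout_s=8.0):
    """Fan out to every source in parallel; a slow or failing source
    yields None instead of sinking the whole report (fail-soft)."""
    async def bounded(coro):
        try:
            return await asyncio.wait_for(coro, timeout=timeout_s)
        except Exception:
            return None  # the gap is reflected in the confidence score
    return await asyncio.gather(
        bounded(fetch_financials(company)),
        bounded(fetch_news(company)),
        bounded(fetch_executives(company)),
    )

results = asyncio.run(gather_sources("ACME"))
```

`asyncio.gather` preserves argument order, so the synthesis step can rely on positional results even when some slots are `None`.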
Pipeline stages
Profile & financials
Fundamentals + 100+ derived metrics
Ticker resolution, market cap, revenue, P/E, balance sheet, income statement, cash flow. We compute 100+ derived metrics (ROIC, WACC, Piotroski, Altman Z, margins, growth CAGRs) from the underlying statements.
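As one concrete example of a derived metric, here is the classic 1968 Altman Z-score computed from raw statement lines. This is the textbook formula for public manufacturers, shown for illustration; the production implementation may use a variant.

```python
def altman_z(working_capital, retained_earnings, ebit, market_cap,
             sales, total_assets, total_liabilities):
    """Classic Altman Z-score (1968 coefficients).
    Z > 2.99 is roughly the safe zone; Z < 1.81 signals distress."""
    ta = total_assets
    return (1.2 * working_capital / ta
            + 1.4 * retained_earnings / ta
            + 3.3 * ebit / ta
            + 0.6 * market_cap / total_liabilities
            + 1.0 * sales / ta)

z = altman_z(working_capital=200, retained_earnings=300, ebit=150,
             market_cap=1000, sales=900,
             total_assets=1000, total_liabilities=400)
```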
Recent news
Multi-source, relevance-ranked
We pull from multiple licensed news sources in parallel and run a multi-stage relevance pipeline: HTML stripped, recency validated, each article scored, near-duplicates merged, then a per-article plain-English summary generated. Higher-tier publishers (Reuters, Bloomberg, FT) get a small ranking boost.
Leadership & compensation
CEO, CFO, exec team + total comp
Officer roster, roles, and total compensation. Fail-soft when individual fields are missing: we return what we have rather than blocking the report.
AI synthesis
Plain-English report assembly
Claude Sonnet 4.5 receives a curated subset of the financial output (16 key metrics: margins, FCF, ROIC, health score, Altman Z, Piotroski) plus the per-article summaries. It returns strict JSON with a summary, what they do, a financial-health narrative, 4 risks, 4 opportunities, a confidence score (1–10), and inline [Source] citations on every claim.
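The shape of that strict output can be sketched with a small validator. The field names here are illustrative, not the exact production schema; the point is that malformed outputs are rejected before they reach the reader.

```python
# Illustrative report shape; real field names may differ.
REPORT_SCHEMA = {
    "summary": str,
    "what_they_do": str,
    "financial_health": str,
    "risks": list,          # exactly 4 entries
    "opportunities": list,  # exactly 4 entries
    "confidence": int,      # 1-10
}

def validate_report(report):
    """Reject outputs that don't match the strict shape; callers re-prompt."""
    for key, typ in REPORT_SCHEMA.items():
        if not isinstance(report.get(key), typ):
            return False
    return (len(report["risks"]) == 4
            and len(report["opportunities"]) == 4
            and 1 <= report["confidence"] <= 10)
```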
News relevance pipeline
Most news APIs return noise — irrelevant articles, duplicates, snippets that are HTML soup. Sentinellis runs every article through several stages before it reaches the AI synthesis step.
Fetch (parallel)
Two queries in parallel — company name + ticker — to NewsData.io with 8s timeout each. Dedup by URL.
Recency validation
Drop articles outside the 7-day window. NewsData.io filtering isn't always strict, so we re-validate post-fetch.
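The post-fetch re-validation amounts to a simple timestamp check. A minimal sketch, assuming timezone-aware datetimes:

```python
from datetime import datetime, timedelta, timezone

def within_window(published_at, now=None, days=7):
    """Post-fetch recency re-validation: upstream date filters aren't
    always strict, so drop anything outside the window ourselves."""
    now = now or datetime.now(timezone.utc)
    return now - published_at <= timedelta(days=days)
```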
Relevance scoring
Each surviving article is scored 1–10 by Claude Haiku in parallel; articles scoring below 6 are dropped.
Story deduplication
Group near-duplicate stories. Prefer the higher-tier source (T1 Reuters/Bloomberg/AP over T2 CNBC/Forbes over T3 blog aggregators).
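Picking the canonical article from a duplicate group reduces to a minimum over source tiers. The tier table below is illustrative, not the full production list:

```python
# Illustrative tier table; lower number = higher-trust publisher.
# Unknown outlets default to tier 3 (blogs and aggregators).
TIER = {"reuters": 1, "bloomberg": 1, "ap": 1, "cnbc": 2, "forbes": 2}

def pick_canonical(group):
    """From a group of near-duplicate stories, keep the article
    published by the highest-tier source."""
    return min(group, key=lambda article: TIER.get(article["source"], 3))
```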
Volume cap
Top 8 articles by relevance score, then by recency. Free-tier NewsData.io caps at 10 results per query, so we usually have headroom.
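The cap is a two-key sort followed by a slice. A sketch, assuming each article carries a relevance score and a publication timestamp:

```python
def cap_articles(articles, limit=8):
    """Keep the top `limit` articles: relevance score first (descending),
    recency second (newest first) as the tie-breaker."""
    ranked = sorted(articles, key=lambda a: (-a["score"], -a["published_ts"]))
    return ranked[:limit]
```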
Per-article summaries
Each article summarized to one paragraph in plain English via Claude Haiku in parallel. The AI synthesis step uses these summaries — never the raw snippet.
Synthesis & inline citations
Once the data fetches complete, the AI synthesis step receives their output and produces the final report.
Model
claude-sonnet-4-5-20250929 with max_tokens=2048 and a strict JSON schema that enforces inline [Source] tags on every factual claim.
Citations enforced at the schema level
The schema requires every risk, opportunity, financial claim, and historical statement to carry an inline citation pointing back to either Yahoo Finance, an article ID, or the company overview. Outputs missing citations are rejected and re-prompted.
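The citation check itself can be a single regex pass over each field. The tag formats below ([Yahoo Finance], [article:id], [Company Overview]) are assumptions for illustration; the production schema may encode sources differently.

```python
import re

# Assumed citation tag formats, for illustration only.
CITATION = re.compile(r"\[(Yahoo Finance|article:[\w-]+|Company Overview)\]")

def uncited_fields(fields):
    """Return the names of fields missing an inline citation;
    a non-empty result triggers a rejection and re-prompt."""
    return [name for name, text in fields.items()
            if not CITATION.search(text)]
```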
What synthesis can't do
The system prompt explicitly blocks: analyst price targets, buy/sell/hold ratings, personalized investment advice, and speculation beyond what the source data supports. If a metric is missing, the report flags it rather than guessing.
Cache, retries, fail-soft
Redis cache (24h)
If the same company was analyzed in the last 24 hours, the cached report returns instantly and doesn't count against your quota. Fail-open: cache outage just means a fresh run.
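Fail-open caching means wrapping every cache call in a try/except that falls through to a fresh run. A sketch, assuming a redis-like client with `get` and `set(..., ex=ttl)`:

```python
import json

def get_report(company, cache, run_pipeline, ttl_s=86400):
    """Fail-open cache wrapper: any cache error falls through
    to a fresh pipeline run instead of failing the request."""
    key = f"report:{company.lower()}"
    try:
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)  # cache hit: no quota charge
    except Exception:
        pass  # cache outage just means a fresh run
    report = run_pipeline(company)
    try:
        cache.set(key, json.dumps(report), ex=ttl_s)  # 24h TTL
    except Exception:
        pass  # failing to cache must not fail the report
    return report
```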
Bounded retries
Each data fetch retries up to 3 times on transient errors (429, 5xx, timeouts). Permanent errors (400, 401, 403) are non-retryable — we fail fast and report the gap.
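The retry/fail-fast split comes down to classifying the status code before deciding whether to back off. A minimal sketch with exponential backoff (the exact delays and error types are assumptions):

```python
import time

RETRYABLE = {429, 500, 502, 503, 504}      # transient: retry with backoff
NON_RETRYABLE = {400, 401, 403}            # permanent: fail fast

class FetchError(Exception):
    def __init__(self, status):
        super().__init__(f"fetch failed with status {status}")
        self.status = status

def fetch_with_retries(fetch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry transient errors with exponential backoff;
    fail fast on permanent errors and report the gap."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except FetchError as e:
            if e.status in NON_RETRYABLE or attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```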
Fail-soft synthesis
If Claude returns a hard error (auth, malformed JSON, content policy), the report still ships with financials, news, and executives — just with an "analysis unavailable" banner instead of a missing section.
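The fail-soft assembly step is just conditional composition: ship the structured sections regardless, and swap in a banner when the analysis is missing. A sketch with illustrative field names:

```python
def assemble_report(financials, news, executives, synthesis):
    """Ship what we have: if synthesis failed, attach a banner
    instead of silently dropping the analysis section."""
    report = {"financials": financials, "news": news,
              "executives": executives, "analysis": synthesis}
    if synthesis is None:
        report["banner"] = "analysis unavailable"
    return report
```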
Performance
~32s typical end-to-end report time (cold cache)
<200ms cached report response
80s hard cap on total workflow time
24h report cache TTL
See it in action
Run your first report free. Three included on signup, no credit card required.
Get started