Documentation

How it works

The pipeline behind every report.

When you type a company name, multiple data sources are fetched in parallel and an AI synthesis pass produces the final report. Roughly 30–60 seconds later, you have a structured report with inline citations. Here's exactly what runs.

Architecture overview

Sentinellis is built on a durable workflow engine. Every report is a workflow with automatic retry policies, timeouts, and failure recovery. If a data source is briefly down, the engine retries with exponential backoff. If it's permanently unavailable, the workflow completes with a partial result and a confidence score reflecting what we couldn't fetch.

The orchestrator launches the data fetches in parallel, waits for all of them (with bounded timeouts), then hands the combined output to the AI synthesis step.
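The fan-out described above can be sketched in a few lines of Python. This is a minimal illustration, not the production orchestrator: the function names, the 20-second default timeout, and the shape of the returned dictionaries are all assumptions made for the example.

```python
import asyncio


async def fetch_with_timeout(name, fetcher, timeout_s):
    """Run one fetcher and return (name, result, error) instead of raising."""
    try:
        result = await asyncio.wait_for(fetcher(), timeout=timeout_s)
        return name, result, None
    except asyncio.TimeoutError:
        return name, None, "timeout"
    except Exception as exc:  # any other failure is recorded as a gap
        return name, None, repr(exc)


async def run_fetches(fetchers, timeout_s=20.0):
    """Launch every fetcher concurrently and wait for all of them.

    `fetchers` maps a hypothetical stage name to a zero-argument coroutine
    function. Returns (results, gaps): what arrived and what did not.
    """
    tasks = [fetch_with_timeout(n, f, timeout_s) for n, f in fetchers.items()]
    outcomes = await asyncio.gather(*tasks)
    results = {n: r for n, r, err in outcomes if err is None}
    gaps = {n: err for n, r, err in outcomes if err is not None}
    return results, gaps
```

Because each fetch carries its own timeout, one slow source cannot hold up the others, and the `gaps` dictionary gives the synthesis step a record of what was not fetched.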

Pipeline stages

Profile & financials

Fundamentals + 100+ derived metrics

Ticker resolution, market cap, revenue, P/E, balance sheet, income statement, cash flow. We compute 100+ derived metrics (ROIC, WACC, Piotroski, Altman Z, margins, growth CAGRs) from the underlying statements.
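To make "derived metrics" concrete, here is a hedged sketch of two of the metrics named above. The CAGR formula is the standard one, and the Altman Z function uses the original 1968 coefficients for public manufacturing firms. Which Z-score variant and which input fields Sentinellis actually uses are not stated here, so the function signatures are assumptions.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def altman_z(working_capital, retained_earnings, ebit,
             market_cap, total_liabilities, sales, total_assets):
    """Original Altman Z-score (1968 coefficients, public manufacturers)."""
    if total_assets <= 0 or total_liabilities <= 0:
        raise ValueError("total_assets and total_liabilities must be positive")
    return (
        1.2 * working_capital / total_assets
        + 1.4 * retained_earnings / total_assets
        + 3.3 * ebit / total_assets
        + 0.6 * market_cap / total_liabilities
        + 1.0 * sales / total_assets
    )
```

For example, revenue growing from 100 to 121 over two years gives `cagr(100, 121, 2)` of roughly 0.10, or 10% per year.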

Recent news

Multi-source, relevance-ranked

We pull from multiple licensed news sources in parallel and run a multi-stage relevance pipeline: HTML is stripped, recency is validated, each article is scored, near-duplicates are merged, and a plain-English summary is generated for each article. Higher-tier publishers (Reuters, Bloomberg, FT) get a small ranking boost.

Leadership & compensation

CEO, CFO, exec team + total comp

Officer roster, roles, and total compensation. This stage is fail-soft: when individual fields are missing, we return what we have rather than blocking the report.

AI synthesis

Plain-English report assembly

Our AI synthesis model receives a curated subset of the financial output (16 key metrics, including margins, FCF, ROIC, health score, Altman Z, and Piotroski) plus the per-article summaries. It returns JSON that conforms to a strict schema: summary, what they do, financial health narrative, 4 risks, 4 opportunities, a confidence score (1–10), and inline [Source] citations on every claim.
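The shape constraints listed above can be checked with a small validator. This is a sketch only: the key names (`summary`, `what_they_do`, `financial_health`, `risks`, `opportunities`, `confidence`) are hypothetical stand-ins for whatever the real schema calls these fields.

```python
# Hypothetical key names; the real schema's field names are not documented here.
REQUIRED_TEXT_FIELDS = ("summary", "what_they_do", "financial_health")


def shape_errors(report):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for key in REQUIRED_TEXT_FIELDS:
        value = report.get(key)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{key}: expected a non-empty string")
    for key in ("risks", "opportunities"):
        items = report.get(key)
        if not isinstance(items, list) or len(items) != 4:
            errors.append(f"{key}: expected a list of exactly 4 items")
    conf = report.get("confidence")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(conf, bool) or not isinstance(conf, int) or not 1 <= conf <= 10:
        errors.append("confidence: expected an integer from 1 to 10")
    return errors
```

Returning a list of problems rather than raising on the first one makes it easier to feed all failures back to the model in a single re-prompt.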

News relevance pipeline

Most news APIs return noise — irrelevant articles, duplicates, snippets that are HTML soup. Sentinellis runs every article through several stages before it reaches the AI synthesis step.

Fetch (parallel)

Multiple queries — company name plus ticker — run in parallel across our news providers, then are deduplicated by URL.

Recency validation

Queries request only a recent window, and every article's publication date is re-validated after fetch, since upstream filtering isn't always strict. Articles outside the window are dropped.

Relevance scoring

Each surviving article is scored for relevance in parallel by an AI model. Low-relevance articles are dropped.

Story deduplication

Near-duplicate stories are grouped, and the version from the higher-tier publisher is preferred (Reuters, Bloomberg, or AP over blog aggregators).

Volume cap

The most relevant, most recent articles are kept; the rest are dropped to keep the report focused.

Per-article summaries

Each article is summarized to one plain-English paragraph in parallel. The AI synthesis step uses these summaries — never the raw snippet.
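The non-AI stages of this pipeline can be sketched as plain filtering steps. Everything specific in the example is an assumption for illustration: the `Article` fields, the 14-day window, the 0.5 relevance threshold, the cap of 10, and the use of title similarity (via the standard library's `difflib`) as the near-duplicate test. Relevance scores are taken as given, since the scoring and summarization steps are model calls.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from difflib import SequenceMatcher

TOP_TIER = {"Reuters", "Bloomberg", "AP"}  # tiers named in the text above


@dataclass
class Article:
    url: str
    title: str
    publisher: str
    published_at: datetime
    relevance: float  # assumed to come from the upstream scoring step (0 to 1)


def dedupe_by_url(articles):
    """Keep the first article seen for each URL."""
    seen, out = set(), []
    for a in articles:
        if a.url not in seen:
            seen.add(a.url)
            out.append(a)
    return out


def group_near_duplicates(articles, threshold=0.8):
    """Keep one article per story, preferring a top-tier publisher."""
    kept = []
    for a in articles:
        for i, k in enumerate(kept):
            same = SequenceMatcher(None, a.title.lower(), k.title.lower()).ratio()
            if same >= threshold:
                if a.publisher in TOP_TIER and k.publisher not in TOP_TIER:
                    kept[i] = a  # swap in the higher-tier version
                break
        else:
            kept.append(a)  # no similar story kept yet
    return kept


def run_news_pipeline(articles, now, max_age_days=14, min_relevance=0.5, cap=10):
    cutoff = now - timedelta(days=max_age_days)
    stage = dedupe_by_url(articles)
    stage = [a for a in stage if a.published_at >= cutoff]    # recency
    stage = [a for a in stage if a.relevance >= min_relevance]  # relevance
    stage = group_near_duplicates(stage)
    stage.sort(key=lambda a: (a.relevance, a.published_at), reverse=True)
    return stage[:cap]  # volume cap
```

Running the cheap filters (URL and recency) before the relevance filter mirrors the order described above and keeps the number of model calls down.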

Synthesis & inline citations

Once the data fetches complete, the AI synthesis step receives their output and produces the final report.

Model

A frontier large language model runs with a strict JSON schema that enforces inline [Source] tags on every factual claim.

Citations enforced at the schema level

The schema requires every risk, opportunity, financial claim, and historical statement to carry an inline citation pointing back to a data source, an article, or the company overview. Outputs missing citations are rejected and re-prompted.
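A reject-and-re-prompt loop of this kind might look like the sketch below. The citation pattern (any bracketed tag), the field names, the limit of three attempts, and the `generate` callback standing in for the model call are all assumptions; the actual tag format and retry budget are not specified above.

```python
import re

# Assumption: a citation is any non-empty bracketed tag, e.g. "[Balance sheet]".
CITATION = re.compile(r"\[[^\[\]]+\]")


def missing_citations(report, fields=("risks", "opportunities")):
    """Return locations such as 'risks[0]' for claims with no citation tag."""
    problems = []
    for field in fields:
        for i, claim in enumerate(report.get(field, [])):
            if not CITATION.search(claim):
                problems.append(f"{field}[{i}]")
    return problems


def synthesize_with_citations(generate, max_attempts=3):
    """Call `generate(feedback)` until every claim is cited, or give up.

    `generate` is a hypothetical stand-in for the model call. It receives
    None on the first attempt and a correction message on later ones.
    """
    feedback = None
    for _ in range(max_attempts):
        report = generate(feedback)
        problems = missing_citations(report)
        if not problems:
            return report
        feedback = "Add an inline citation to: " + ", ".join(problems)
    return None  # caller decides how to degrade
```

Naming the exact offending items in the feedback gives the model a targeted correction instead of a generic "try again".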

What synthesis can't do

The system prompt explicitly blocks: analyst price targets, buy/sell/hold ratings, personalized investment advice, and speculation beyond what the source data supports. If a metric is missing, the report flags it rather than guessing.

Cache, retries, fail-soft

Redis cache (24h)

If the same company was analyzed in the last 24 hours, the cached report returns instantly and doesn't count against your quota. The cache is fail-open: a cache outage just means a fresh run.
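The fail-open behavior can be illustrated with a small read-through cache wrapper. To keep the example self-contained it uses an in-memory stand-in rather than a Redis client; the `get`/`set` interface, the key format, and the function names are assumptions for the sketch.

```python
import time

DAY_S = 24 * 60 * 60


class InMemoryCache:
    """Tiny stand-in for a TTL cache (not a Redis client)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl_s):
        self._data[key] = (value, time.monotonic() + ttl_s)


class BrokenCache:
    """Simulates a cache outage: every call fails."""

    def get(self, key):
        raise ConnectionError("cache unavailable")

    def set(self, key, value, ttl_s):
        raise ConnectionError("cache unavailable")


def get_report(company, cache, run_pipeline, ttl_s=DAY_S):
    """Return (report, from_cache). Cache errors never block a report."""
    key = f"report:{company.strip().lower()}"  # hypothetical key format
    try:
        cached = cache.get(key)
    except Exception:
        cached = None  # fail-open: treat an outage as a miss
    if cached is not None:
        return cached, True
    report = run_pipeline(company)
    try:
        cache.set(key, report, ttl_s)
    except Exception:
        pass  # fail-open: the report still ships uncached
    return report, False
```

The `from_cache` flag is what a quota counter would consult, since cached reports don't count against the quota.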

Bounded retries

Each data fetch retries up to 3 times on transient errors (429, 5xx, timeouts). Permanent errors (400, 401, 403) are non-retryable — we fail fast and report the gap.
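The retry rule above, combined with exponential backoff, can be sketched as follows. The `HttpError` class, the 0.5-second base delay, and the injectable `sleep` parameter are illustrative assumptions; only the retry count and the split between retryable and permanent status codes come from the text.

```python
import time

RETRYABLE_STATUS = {429} | set(range(500, 600))  # 429 and all 5xx


class HttpError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def fetch_with_retries(call, max_retries=3, base_delay_s=0.5, sleep=time.sleep):
    """Call `call()`; retry transient failures with exponential backoff.

    Up to `max_retries` retries follow the first attempt. Permanent errors
    (for example 400, 401, 403) are re-raised immediately.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except HttpError as exc:
            if exc.status not in RETRYABLE_STATUS or attempt == max_retries:
                raise
        except TimeoutError:
            if attempt == max_retries:
                raise
        sleep(base_delay_s * 2 ** attempt)  # 0.5s, 1s, 2s with the defaults
```

A production version would usually add random jitter to each delay so that many clients retrying at once do not hit the upstream source in lockstep.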

Fail-soft synthesis

If the AI synthesis step fails with a hard error (auth, malformed JSON, content policy), the report still ships with financials, news, and executives. An "analysis unavailable" banner appears in place of the analysis, so the section is never silently missing.
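In code, fail-soft assembly amounts to wrapping only the synthesis call. The sketch below is illustrative: the report keys and the `synthesize` callback are hypothetical names, not the actual report format.

```python
def assemble_report(financials, news, executives, synthesize):
    """Build the report; degrade to a banner if synthesis fails."""
    report = {
        "financials": financials,
        "news": news,
        "executives": executives,
        "analysis": None,
        "banner": None,
    }
    try:
        report["analysis"] = synthesize(financials, news)
    except Exception:
        # The fetched data is already in hand, so ship it with a banner.
        report["banner"] = "analysis unavailable"
    return report
```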

Performance

Median run: ~32s

Cache hit: <200ms

Frontend timeout: 80s

Cache TTL: 24h

Start

See it in action

Run your first report free. Three included on signup, no credit card required.
