Where the data comes from
Every Sentinellis report is assembled from a small set of public and licensed data providers. We don't scrape, we don't resell raw feeds, and we don't hide our sources. Here's the full list.
Our data principles
- Every claim in a report carries an inline citation.
- We prefer T1 sources (Reuters, Bloomberg, AP) when stories overlap.
- When a source is unavailable, we ship the report with a flagged gap rather than guessing.
- We never sell, share, or aggregate user data with these providers.
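The T1 preference above can be sketched roughly as follows. This is an illustration only: the `story_key` and `tier` fields are hypothetical, and how overlapping stories are actually detected is not described on this page.

```python
from typing import Dict, List


def prefer_top_tier(articles: List[dict]) -> List[dict]:
    """Keep one article per story, choosing the lowest tier number (T1 = 1).

    Assumes each article dict carries a hypothetical "story_key" (articles
    covering the same story share it) and an integer "tier".
    """
    best: Dict[str, dict] = {}
    for article in articles:
        key = article["story_key"]
        current = best.get(key)
        # Replace only when the new article comes from a strictly better tier.
        if current is None or article["tier"] < current["tier"]:
            best[key] = article
    # Dicts preserve insertion order, so stories keep their first-seen order.
    return list(best.values())
```

With two articles on the same story, one from a T2 outlet and one from a T1 outlet, only the T1 article survives.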
Active providers
Yahoo Finance
Accessed via yfinance Python package
Primary financial data
What it provides
- Ticker resolution (company name → symbol)
- Real-time quote: market cap, P/E, dividend yield
- Income statement (revenue, EBIT, net income, EPS)
- Balance sheet (assets, liabilities, equity)
- Cash flow statement (operating, investing, financing)
- Company officers (CEO, CFO, key VPs, plus total compensation)
- Sector, industry, employee count, website (used for logo)
Quotes update within minutes; financial statements update quarterly, after company filings.
Coverage is sparse for non-US tickers, very small caps, and companies not listed on major exchanges. Individual fields occasionally return HTTP 500; our fail-soft handling flags the gap instead of failing the whole report.
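A minimal sketch of what fail-soft handling could look like, assuming each field is fetched through its own callable. The function name and the shape of the result are hypothetical, not our production code, and no real yfinance calls are shown.

```python
from typing import Any, Callable, Dict, List, Tuple


def collect_fail_soft(
    fetchers: Dict[str, Callable[[], Any]],
) -> Tuple[Dict[str, Any], List[str]]:
    """Run every field fetcher; record failures as gaps instead of raising."""
    data: Dict[str, Any] = {}
    gaps: List[str] = []
    for name, fetch in fetchers.items():
        try:
            value = fetch()
        except Exception:
            # e.g. an HTTP 500 on a single field: note the gap and move on.
            gaps.append(name)
            continue
        if value is None:
            # A missing value is also reported as a gap, never guessed.
            gaps.append(name)
        else:
            data[name] = value
    return data, gaps
```

The report can then render `data` normally and list `gaps` as flagged sections.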
NewsData.io
Accessed via REST API
Primary news source
What it provides
- Articles from 60,000+ sources globally
- 7-day recent window (free tier)
- Source tier metadata (T1 Reuters/Bloomberg, T2 CNBC/Forbes)
- Country + language filters
Articles indexed within hours of publication. We cap to a 7-day window for relevance.
The free tier limits each query to 10 results. When NewsData is unreachable or returns 0 articles for a niche company, we fall back to Google News RSS.
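The fallback rule can be sketched as below. Both providers are passed in as plain callables, so nothing here depends on a real client; the function name and the returned labels are hypothetical.

```python
from typing import Callable, List, Tuple

Fetcher = Callable[[str], List[dict]]


def fetch_with_fallback(
    company: str, primary: Fetcher, fallback: Fetcher
) -> Tuple[List[dict], str]:
    """Try the primary news source; use the fallback on error or zero results."""
    try:
        articles = primary(company)
    except Exception:
        # Treat an unreachable primary the same as an empty result.
        articles = []
    if articles:
        return articles, "primary"
    try:
        return fallback(company), "fallback"
    except Exception:
        # Both sources failed: return nothing and let the caller flag the gap.
        return [], "none"
```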
Google News RSS
Accessed via Public RSS endpoint
News fallback when NewsData fails
What it provides
- Same major sources (Reuters, CNBC, MarketWatch, etc.)
- Headline + publication date + source
- URL to original article
Real-time aggregator. Same recency window enforced post-fetch.
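Enforcing the window after the fetch could look like the sketch below. It assumes each item carries an RSS-style `pubDate` string (RFC 822 format) under a hypothetical `pub_date` key; items with an unparseable date are dropped here, which is a design choice of this sketch.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import List


def filter_recent(items: List[dict], now: datetime, days: int = 7) -> List[dict]:
    """Keep only items published within the last `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    recent = []
    for item in items:
        try:
            published = parsedate_to_datetime(item["pub_date"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or malformed date: skip the item
        if published.tzinfo is None:
            # Dates without a usable offset are treated as UTC.
            published = published.replace(tzinfo=timezone.utc)
        if published >= cutoff:
            recent.append(item)
    return recent
```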
Description fields contain HTML anchor tags rather than article snippets, so we strip them and fall back to the headline. The feed carries no relevance metadata, so stages 2–6 of our pipeline all still run.
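One way to do the stripping, using only the standard library. This sketch assumes that anything inside an `<a>` element is link text rather than a snippet, and that an empty result means the headline should be used; the helper names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional


class _TextOutsideAnchors(HTMLParser):
    """Collects text content, ignoring everything inside <a>...</a>."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._anchor_depth = 0
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._anchor_depth > 0:
            self._anchor_depth -= 1

    def handle_data(self, data):
        if self._anchor_depth == 0:
            self.parts.append(data)


def snippet_or_headline(description: Optional[str], headline: str) -> str:
    """Return the description's non-anchor text, or the headline if none is left."""
    parser = _TextOutsideAnchors()
    parser.feed(description or "")
    parser.close()  # flush any buffered text
    # Collapse whitespace, including non-breaking spaces from &nbsp; entities.
    text = " ".join("".join(parser.parts).split())
    return text if text else headline
```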
Anthropic Claude
Accessed via Anthropic API
AI synthesis & relevance scoring
What it provides
- Claude Sonnet 4.5 → final report synthesis with inline citations
- Claude Haiku → news article relevance scoring (1–10)
- Claude Haiku → per-article plain-English summaries
- Strict JSON schema enforcement on all outputs
On-demand. Each report run triggers fresh inference — no AI output is cached separately from the report.
Subject to upstream rate limits and content policy. On hard errors, the report ships fail-soft with an "analysis unavailable" banner instead of a missing section.
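As an illustration of schema enforcement on a relevance score, the sketch below validates a raw model response before it is used. The `{"score": ...}` shape is a hypothetical schema chosen for the example, and no Anthropic API call is shown.

```python
import json
from typing import Optional


def parse_relevance(raw: str) -> Optional[int]:
    """Return the relevance score (1-10) from a JSON string, or None if invalid."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Strict: exactly one key, no extras.
    if not isinstance(payload, dict) or set(payload) != {"score"}:
        return None
    score = payload["score"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int):
        return None
    return score if 1 <= score <= 10 else None
```

A `None` result lets the caller treat the response as a failure rather than passing a malformed score downstream.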
Google Favicons
Accessed via Public endpoint
Company logos
What it provides
- 128px favicon resolved from company website domain
- Used in report headers and OG share images
Fetched at report time, cached implicitly with the report.
Some companies have generic placeholder favicons. We previously used the Clearbit Logo API, but it was deprecated; Google Favicons is a near-perfect free replacement (a public company's favicon is almost always its logo).
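Resolving a favicon URL from a company website could be sketched as follows. The `https://www.google.com/s2/favicons` endpoint with `domain` and `sz` parameters is an assumption about which public endpoint is meant, and the function name is hypothetical.

```python
from urllib.parse import urlencode, urlparse


def favicon_url(website: str, size: int = 128) -> str:
    """Build a favicon lookup URL from a company website address."""
    # urlparse only fills in the hostname when a scheme or "//" prefix is present.
    parsed = urlparse(website if "://" in website else "//" + website)
    domain = parsed.hostname
    if not domain:
        raise ValueError(f"cannot extract a domain from {website!r}")
    query = urlencode({"domain": domain, "sz": size})
    return "https://www.google.com/s2/favicons?" + query
```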
What we don't use (yet)
SEC EDGAR direct ingestion
10-K and 10-Q filings flow through Yahoo Finance today. Direct EDGAR integration is on the roadmap — it would unlock richer segment data and longer historical series.
OpenCorporates / national registries
Legal-entity ground truth for private companies (ONRC for RO, Companies House for UK, KvK for NL). Designed but not yet wired into the pipeline. Priority for the B2B tier.
LinkedIn / Glassdoor scraping
We don't scrape. Executive data comes from Yahoo Finance (which sources from SEC DEF 14A filings for US-listed companies). LinkedIn's ToS prohibits automated access without authorization, and we respect that.