Surveil is built on a modular architecture that separates ingestion, entity resolution, signal detection, and intelligence generation into independent layers. The ingestion layer connects to news APIs, regulatory filing systems (SEC EDGAR, SEDAR+), patent databases (USPTO bulk data, WIPO PATENTSCOPE), and financial disclosure repositories, and applies rate limiting, pagination handling, and schema normalization separately for each source.
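As a rough illustration of what per-source rate limiting, pagination handling, and schema normalization can look like, here is a minimal Python sketch. It is not Surveil's implementation: the names `MinIntervalLimiter`, `fetch_all`, and `normalize`, the cursor-style page shape (`items`, `next_cursor`), and the field map are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Iterator, Optional


class MinIntervalLimiter:
    """Hypothetical limiter: at most one request every `min_interval` seconds."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until the minimum interval since the previous call has passed."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(fetch_page: Callable[[Optional[str]], dict],
              limiter: MinIntervalLimiter) -> Iterator[dict]:
    """Follow cursor-based pagination, pausing before every page request.

    Assumes each page is a dict with an "items" list and an optional
    "next_cursor"; real sources use different pagination schemes.
    """
    cursor: Optional[str] = None
    while True:
        limiter.wait()
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break


def normalize(raw: dict, field_map: Dict[str, str]) -> dict:
    """Rename source-specific fields to unified names; missing fields become None."""
    return {unified: raw.get(source) for source, unified in field_map.items()}
```

In this sketch each source would get its own limiter and field map, so a slow or strict source does not constrain the others. The clock and sleep functions are injectable only to make the limiter easy to test.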
Every ingested document is normalized into a unified schema with full provenance: source URL, crawl timestamp, content hash, extraction confidence, and credibility metadata. The entity resolution layer links mentions of companies, people, and organizations across sources by combining deterministic matching (legal names, tickers, registration numbers) with embedding-based fuzzy resolution for mentions that use name variants or abbreviations.
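The two-stage resolution described above can be sketched as an exact lookup followed by a similarity fallback. This is a minimal sketch under stated assumptions: it substitutes plain string similarity from Python's `difflib` for the embedding-based step, which is a deliberate simplification and will not catch abbreviations the way embeddings can. The `Entity` and `Resolver` names and the 0.8 threshold are hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Entity:
    entity_id: str
    legal_name: str
    ticker: Optional[str] = None


def _norm(text: str) -> str:
    """Lowercase, drop periods and commas, collapse whitespace."""
    return " ".join(text.lower().replace(".", "").replace(",", "").split())


class Resolver:
    """Hypothetical resolver: deterministic match first, fuzzy fallback second."""

    def __init__(self, entities: List[Entity], threshold: float = 0.8) -> None:
        self.entities = entities
        self.threshold = threshold  # assumed cut-off, not a tuned value
        self.exact: Dict[str, Entity] = {}
        for e in entities:
            self.exact[_norm(e.legal_name)] = e
            if e.ticker:
                self.exact[_norm(e.ticker)] = e

    def resolve(self, mention: str) -> Tuple[Optional[Entity], str, float]:
        """Return (entity or None, method used, score)."""
        key = _norm(mention)
        hit = self.exact.get(key)
        if hit is not None:
            return hit, "deterministic", 1.0
        # Stand-in for embedding similarity: character-level ratio in [0, 1].
        best: Optional[Entity] = None
        best_score = 0.0
        for e in self.entities:
            score = SequenceMatcher(None, key, _norm(e.legal_name)).ratio()
            if score > best_score:
                best, best_score = e, score
        if best is not None and best_score >= self.threshold:
            return best, "fuzzy", best_score
        return None, "unresolved", best_score
```

Returning the method and score alongside the match keeps the resolution step auditable, so a downstream consumer can treat fuzzy matches with more caution than deterministic ones.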
The signal detection layer applies classification models to identify business events (product launches, leadership changes, partnerships, regulatory actions, financial disclosures) and ranks them by competitive relevance, source credibility, and recency. Sentiment and narrative tracking models monitor topic trajectories over time, surfacing statistically significant shifts as they emerge. All models run locally. No data is transmitted to third-party AI services.
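One simple way to combine relevance, credibility, and recency into a single ranking is a weighted sum with an exponentially decaying recency term. The sketch below shows that approach only as an illustration; the weights, the seven-day half-life, and the `Signal`, `recency`, `score`, and `rank` names are assumptions, not Surveil's actual scoring.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Tuple


@dataclass
class Signal:
    event_type: str         # e.g. "product_launch", "leadership_change"
    relevance: float        # competitive relevance in [0, 1]
    credibility: float      # source credibility in [0, 1]
    published_at: datetime  # timezone-aware publication time


def recency(published_at: datetime, now: datetime,
            half_life_days: float = 7.0) -> float:
    """Exponential decay: 1.0 when brand new, 0.5 after one half-life."""
    age_days = max((now - published_at).total_seconds(), 0.0) / 86400.0
    return 0.5 ** (age_days / half_life_days)


def score(signal: Signal, now: datetime,
          weights: Tuple[float, float, float] = (0.5, 0.3, 0.2)) -> float:
    """Weighted sum of relevance, credibility, and recency (assumed weights)."""
    w_rel, w_cred, w_rec = weights
    return (w_rel * signal.relevance
            + w_cred * signal.credibility
            + w_rec * recency(signal.published_at, now))


def rank(signals: List[Signal], now: datetime) -> List[Signal]:
    """Return signals ordered from highest to lowest score."""
    return sorted(signals, key=lambda s: score(s, now), reverse=True)
```

With these assumed weights, two signals that are otherwise identical are ordered by age, and a highly relevant older signal can still outrank a fresh but marginal one.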
Intelligence generation assembles structured reports, competitive briefings, and regulatory watch summaries from detected signals and the underlying source corpus. Every insight carries full data lineage and is traceable to the specific document, timestamp, and processing path that produced it. Deployment options include Canadian cloud (Azure Canada Central, AWS ca-central-1) and on-premises deployment for organizations with data residency requirements.
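Data lineage of this kind can be modelled by having each insight store the content hashes of the documents it was derived from, plus the ordered processing steps that produced it. The following is a minimal sketch under that assumption; `SourceDocument`, `Insight`, `Corpus`, and the step names in the usage note are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SourceDocument:
    source_url: str
    crawled_at: str  # ISO 8601 crawl timestamp
    content: str

    @property
    def content_hash(self) -> str:
        """SHA-256 of the UTF-8 content, used as a stable document key."""
        return hashlib.sha256(self.content.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Insight:
    summary: str
    source_hashes: Tuple[str, ...]    # documents this insight was derived from
    processing_path: Tuple[str, ...]  # ordered names of the steps that produced it


class Corpus:
    """Hypothetical in-memory store that resolves insights back to documents."""

    def __init__(self) -> None:
        self._docs: Dict[str, SourceDocument] = {}

    def add(self, doc: SourceDocument) -> str:
        """Store a document and return its content hash."""
        self._docs[doc.content_hash] = doc
        return doc.content_hash

    def trace(self, insight: Insight) -> List[SourceDocument]:
        """Return the source documents behind an insight, in recorded order."""
        return [self._docs[h] for h in insight.source_hashes]
```

A usage example with placeholder values: add a `SourceDocument` to the corpus, build an `Insight` whose `source_hashes` holds the returned hash and whose `processing_path` is something like `("ingest", "resolve_entities", "classify_event")`, then call `trace` to recover the document, its URL, and its crawl timestamp.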