Highly-Engineered Synthetic Data

Ready-to-use datasets from a national-scale financial simulation

SPECIFICATION

At a glance

Additional Use Cases

Population

1 million

Peer Benchmark

Understand how you perform relative to anonymised peers. Syntheticr provides a dedicated and controlled benchmark dataset and scores your outputs against a cohort of your peers.

Learn more

Hackathons

Give teams safe data with embedded typologies and ground truth for rapid innovation. Syntheticr has already proven its value in regulatory TechSprints and hackathon environments.

Learn more

Employee Training

Train analysts and investigators on real-world laundering typologies using realistic synthetic data with known outcomes. Build confidence in alert handling, transaction analysis, and SAR writing.

Learn more

Technology Demos

Use Syntheticr to demonstrate detection workflows, network visualisation, and model behaviour with credible synthetic scenarios. Ideal for internal stakeholders or customer demos.

Learn more

Volumetric Testing

Safely simulate realistic transaction and alert volumes to test how your AML system behaves at scale without using production data.

Learn more

Population — ~1M synthetic entities

  1. Institutions — 6 financial institutions

  2. Transactions — 1.5B+ over 24 months

  3. Typologies — embedded laundering networks

  4. Formats — ISO 20022-aligned

  5. Risk-free — no real entities or activity

Section 3 — Overview

Heading (H2)
Overview

Body
Syntheticr is generated using a sophisticated agent-based modelling (ABM) approach grounded in proprietary input data derived from UK national statistics, open data, and global financial standards. The simulation produces a statistically accurate, privacy-safe financial ecosystem containing legitimate activity and hidden money-laundering networks.

Syntheticr is delivered as three distinct, interconnected datasets: (1) Entity Profiles, (2) Financial Transactions & Money Laundering, and (3) Risk Intelligence. Together they provide an authentic test environment without any privacy or compliance risk, with outputs aligned to modern financial and regulatory frameworks.

Section 4 — Population & Simulation Engine

Heading (H2)
Population & Simulation Engine

Body
Syntheticr begins with a baseline population of synthetic individuals, small and medium-sized businesses (SMBs), and financial institutions. Every entity has a unique identifier, statistically accurate attributes, and realistic network connections that mirror real-world ecosystems.

Bullets
Income distributions
Spending patterns
Geographic clustering
Social connections

Using this population, the ecosystem unfolds over 24 months, generating realistic financial behaviour and risk signals at production scale.

Section 5 — The three Syntheticr datasets (3 columns)

Column A — Entity Profiles

Heading (H3)
Entity Profiles

Body
Syntheticr provides high-coverage entity profiles aligned with KYC and relationship-mapping workflows.

Company Registry
Referentially consistent registry data for the SMB population (~250k companies) captures real-world patterns of business relationships, directorships, and beneficial ownership. This allows the simulation to include sophisticated laundering schemes that exploit complex commercial structures.

Customer Profiles & Related Parties
Profiles are provided for all entities that are customers of the financial institutions, consistent with banking KYC conventions. They include beneficial owners and directorships for businesses, and family and household relationships for individuals (partners, children, parents, shared accommodation).

Why this matters
AML systems can be evaluated against realistic onboarding data and relationship graphs, not just transactions in isolation.
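
As a rough illustration of what a relationship graph makes possible, the sketch below walks related-party links to find everyone within two hops of a company. The record layout, identifiers, and relationship names are assumptions made for the example and are not the Syntheticr schema.

```python
from collections import defaultdict, deque

# Hypothetical related-party records: (entity_a, entity_b, relationship).
# Identifiers and relationship names are illustrative only.
RELATED_PARTIES = [
    ("P001", "C100", "director_of"),
    ("P002", "C100", "beneficial_owner_of"),
    ("P002", "C200", "director_of"),
    ("P003", "P002", "partner_of"),
    ("P004", "C300", "director_of"),
]


def build_graph(edges):
    """Build an undirected adjacency map from related-party records."""
    graph = defaultdict(set)
    for a, b, _relationship in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_within(graph, start, max_hops):
    """Return {entity: hops} for every entity reachable within max_hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    del seen[start]  # the starting entity is not part of its own network
    return seen


graph = build_graph(RELATED_PARTIES)
network = connected_within(graph, "C100", max_hops=2)
```

Here the company C100 is linked to its director and beneficial owner at one hop, and to the owner's partner and second company at two hops, which is the kind of context a transaction-only dataset cannot provide.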

Column B — Financial Transactions & Money Laundering

Heading (H3)
Financial Transactions & Money Laundering

Body
More than 1.5 billion transactions are executed across six financial institutions over a two-year period, each linked to comprehensive KYC profiles. Syntheticr covers the full spectrum of payment rails and behaviours used in modern financial services, allowing testing across every channel criminals exploit.

Payment rails included
Traditional rails: BACS, Faster Payments, CHAPS, and international SWIFT
Card payments: EPOS and online card transactions with authentic MCC categorisation
Digital payments: mobile payments, wallets, and peer-to-peer platforms
Cash: ATM withdrawals and branch deposits with realistic patterns
Automated payments: direct debits, standing orders, and recurring arrangements

Each rail maintains authentic characteristics including processing times, fee structures, and regulatory reporting behaviour, creating realistic signal-to-noise ratios.

Default labelling approach
Accounts engaging in laundering activity are not labelled by default, ensuring objective performance measurement and preventing over-fitting. Low-volume labelled versions can be provided for training, demonstrations, and hackathons.
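
To show why held-back labels matter, here is a minimal sketch of scoring a detection system's flagged accounts against ground truth. The function name, account identifiers, and result layout are assumptions for illustration; they do not describe the Syntheticr scorecard itself.

```python
def score_detection(flagged, ground_truth):
    """Compare flagged accounts with known laundering accounts."""
    flagged, ground_truth = set(flagged), set(ground_truth)
    true_positives = len(flagged & ground_truth)
    # Precision: share of flagged accounts that really were laundering.
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: share of laundering accounts that were flagged.
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return {
        "true_positives": true_positives,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical example: four accounts flagged, five known laundering accounts.
flagged_accounts = ["A1", "A2", "A3", "A4"]
laundering_accounts = ["A2", "A4", "A7", "A9", "A10"]
result = score_detection(flagged_accounts, laundering_accounts)
```

Because the system under test never sees the labels, the precision and recall figures reflect what it can find on its own rather than what it memorised.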

Column C — Risk Intelligence

Heading (H3)
Risk Intelligence

Body
Syntheticr includes realistic risk artefacts used for AML workflow evaluation and model training, including alerts, SARs, and exits. Signal-to-noise ratios are deliberately distorted with false positives and false negatives to simulate real-world detection challenges.

Risk intelligence is limited to the first 18 months of the two-year simulation, creating a six-month greenfield period for unbiased model training and testing.
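
The 18-month and 6-month split can be sketched as a simple date partition. The start date, field names, and helper functions below are assumptions for the example; only the 18/24-month boundaries come from the description above.

```python
from datetime import date


def add_months(d, months):
    """Shift a first-of-month date forward by a whole number of months."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


SIM_START = date(2023, 1, 1)               # hypothetical simulation start
LABELLED_END = add_months(SIM_START, 18)   # risk intelligence stops here
SIM_END = add_months(SIM_START, 24)        # end of the two-year simulation


def split_by_period(transactions):
    """Separate transactions into the risk-intelligence and greenfield periods."""
    labelled, greenfield = [], []
    for txn in transactions:
        if SIM_START <= txn["date"] < LABELLED_END:
            labelled.append(txn)
        elif LABELLED_END <= txn["date"] < SIM_END:
            greenfield.append(txn)
    return labelled, greenfield


# Hypothetical transactions, one dict per record.
sample = [
    {"id": 1, "date": date(2023, 3, 15)},
    {"id": 2, "date": date(2024, 6, 30)},
    {"id": 3, "date": date(2024, 7, 1)},
    {"id": 4, "date": date(2024, 12, 31)},
    {"id": 5, "date": date(2025, 1, 1)},   # outside the simulation window
]
labelled, greenfield = split_by_period(sample)
```

Models can be fitted on the first partition, where alerts and SARs exist, and then evaluated on the second, where no prior risk artefacts can leak into the result.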

Section 6 — Payment coverage (List grid)

Heading (H2)
Payment coverage

Intro line
Syntheticr includes all major transaction channels used by legitimate customers and money launderers.

List items (titles + short text)
Traditional rails
BACS, Faster Payments, CHAPS, SWIFT

Cards
EPOS and online card payments with authentic merchant categorisation

Digital payments
Mobile, wallet, and peer-to-peer transfers

Cash
ATM withdrawals and branch deposits with realistic patterns

Automated payments
Direct debits, standing orders, recurring arrangements

Section 7 — Money laundering typologies (top grid + accordion)

Heading (H2)
Money laundering typologies

Intro
Criminal networks and laundering behaviours are engineered to reflect real-world patterns documented by regulators and law enforcement. Activity is embedded within legitimate flows to mirror real signal-to-noise and detection complexity.

Top typologies (8–10 grid items)
Trade-based money laundering
Smurfing and structured deposits
Professional laundering organisations
Digital payment exploitation
Cash-intensive business fronts
Layering via shell companies and trusts
APP fraud laundering networks
Trade-finance abuse
Real-estate laundering

Accordion title
See full typology library

Accordion body (paste full list)
Trade-Based Money Laundering: Over- and under-invoicing schemes, multiple invoicing, phantom shipments
Smurfing Operations: Coordinated teams making structured deposits below reporting thresholds across multiple institutions
Professional Money Laundering Organisations: Multi-layered networks with specialised roles for collection, processing, and integration
Digital Payment Exploitation: Abuse of online payment platforms, prepaid cards, and mobile money services
Cash-Intensive Business Fronts: Restaurants, retail stores, and service businesses with inflated revenue reporting
Layering Through Investment Vehicles: Complex chains involving shell companies, trusts, and offshore entities
Authorised Push Payment (APP) Fraud Networks: Romance scams, invoice fraud, CEO fraud with subsequent laundering
Trade Finance Abuse: Letters of credit manipulation, documentary fraud, phantom trade transactions
Real Estate Laundering: Property transactions with inflated values and straw purchasers
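
As one example of how a typology from this list might be tested, the sketch below applies a deliberately simple rule for structured deposits: several sub-threshold cash deposits by one customer, across more than one institution, within a short window. The threshold, window, field names, and the single-customer simplification are all assumptions for illustration; coordinated smurfing teams would need linkage across depositors as well.

```python
from collections import defaultdict
from datetime import date, timedelta

THRESHOLD = 10_000            # hypothetical threshold, not a regulatory figure
WINDOW = timedelta(days=7)    # hypothetical look-back window
MIN_DEPOSITS = 3              # hypothetical minimum number of deposits


def flag_structuring(deposits):
    """Return customers whose sub-threshold deposits add up within the window."""
    by_customer = defaultdict(list)
    for dep in deposits:
        if dep["amount"] < THRESHOLD:
            by_customer[dep["customer"]].append(dep)

    flagged = set()
    for customer, deps in by_customer.items():
        deps.sort(key=lambda d: d["date"])
        start = 0
        for end in range(len(deps)):
            # Slide the window start forward until it spans at most WINDOW.
            while deps[end]["date"] - deps[start]["date"] > WINDOW:
                start += 1
            window = deps[start:end + 1]
            total = sum(d["amount"] for d in window)
            institutions = {d["institution"] for d in window}
            if (len(window) >= MIN_DEPOSITS
                    and total >= THRESHOLD
                    and len(institutions) >= 2):
                flagged.add(customer)
                break
    return flagged


# Hypothetical cash deposits.
deposits = [
    {"customer": "X", "institution": "Bank A", "amount": 4_000, "date": date(2024, 1, 1)},
    {"customer": "X", "institution": "Bank B", "amount": 4_000, "date": date(2024, 1, 3)},
    {"customer": "X", "institution": "Bank A", "amount": 4_000, "date": date(2024, 1, 5)},
    {"customer": "Y", "institution": "Bank A", "amount": 500, "date": date(2024, 1, 2)},
    {"customer": "Y", "institution": "Bank B", "amount": 500, "date": date(2024, 1, 4)},
    {"customer": "Z", "institution": "Bank A", "amount": 4_000, "date": date(2024, 1, 1)},
    {"customer": "Z", "institution": "Bank B", "amount": 4_000, "date": date(2024, 1, 11)},
    {"customer": "Z", "institution": "Bank A", "amount": 4_000, "date": date(2024, 1, 21)},
]
suspects = flag_structuring(deposits)
```

With embedded typologies and known outcomes, a rule like this can be measured for both the networks it catches and the legitimate customers it wrongly flags.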

Section 8 — Compatibility & integration

Heading (H2)
Compatibility & integration

Body
Syntheticr is delivered in industry-standard formats aligned to ISO 20022 messaging, so you can load the datasets directly into modern AML platforms and data pipelines. This supports fast testing cycles, repeatable benchmarking, and alignment with regulatory frameworks.

Section 9 — Safety & compliance

Heading (H2)
Safety & compliance

Body
Syntheticr contains no real-world data on people, businesses, or activity. Every entity and transaction is simulated from statistical foundations and proprietary modelling, enabling regulated organisations to test and train systems without privacy, consent, or compliance risk.

Section 10 — CTA

Headline (H2)
See what your AML system can really detect.

Body
Start a free 30-day trial to run your first Performance Baseline scorecard, or book a technical walkthrough to explore the full schema and typology library.

Buttons
Start a 30-day trial
Book a technical walkthrough

See Syntheticr in action

Start a free 30-day trial and test your AML systems with our synthetic data and performance scorecards.

Learn More