What synthetic AML testing actually measures

Synthetic AML testing is often described in terms of data realism or privacy benefits. Those characteristics matter, but they are not what makes synthetic testing useful. Its real value is that it enables objective measurement: teams can measure detection performance directly, rather than relying on indirect signals from production outcomes.

This article explains what synthetic AML testing measures, how those measurements are produced, and how they should be used in practice.

The measurement problem in AML testing

In most AML environments, performance is inferred from operational outcomes: alert volumes, case decisions, investigator throughput, or SAR filings. These outputs are useful for running operations, but they are poor indicators of detection capability, because they reflect process constraints, historical decisions, and investigation capacity as much as actual detection performance.

Production data compounds the problem. Confirmed money laundering is rare, many laundering events are never detected, and many alerts do not correspond to criminal activity. As a result, production outcomes do not provide reliable ground truth.

Without ground truth, it is difficult to measure detection performance directly. This is why AML testing often relies on proxy indicators rather than evidence.

Why synthetic AML testing is different

Synthetic AML testing starts from a different foundation. Synthetic datasets are designed specifically for testing. They embed defined money laundering activity across transactions, customers, and networks. Because this activity is known in advance, system outputs can be evaluated directly against it. This enables measurement that is not possible with production data.
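As a minimal sketch of this idea (the entity IDs and variable names are hypothetical, not tied to any specific platform), evaluating system output against known embedded activity can be as direct as a set comparison:

```python
# Illustrative sketch only: entity IDs and names are hypothetical.
# Entities with embedded laundering activity (known ground truth).
ground_truth_entities = {"E001", "E002", "E003", "E004"}

# Entities the monitoring system actually alerted on.
alerted_entities = {"E001", "E003", "E777"}

# Because ground truth is known in advance, detection can be measured
# directly rather than inferred from investigation outcomes.
detected = ground_truth_entities & alerted_entities
missed = ground_truth_entities - alerted_entities
detection_rate = len(detected) / len(ground_truth_entities)

print(f"Detected {len(detected)} of {len(ground_truth_entities)} "
      f"laundering entities (entity-level recall = {detection_rate:.0%})")
print(f"Missed: {sorted(missed)}")
```

The point of the sketch is the structure, not the numbers: with production data the `ground_truth_entities` set is unknowable, so this comparison cannot be made.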

What synthetic AML testing measures

At its core, synthetic AML testing measures detection capability. This includes:

Whether laundering activity is detected - Synthetic testing measures whether entities involved in money laundering activity generate alerts. This supports clear entity-level detection metrics, rather than relying on alert volumes or investigation outcomes.

How consistently detection occurs - Performance can be assessed across scenarios, customer segments, products, and typologies. This helps teams understand whether detection is robust or uneven.

Where systems underperform - By testing known activity across different conditions, synthetic testing shows where detection breaks down or degrades. This allows teams to identify specific areas of weakness rather than relying on aggregate averages.

The trade-offs between detection and noise - Synthetic testing supports alert-level metrics such as precision alongside detection measures. This makes it possible to understand how changes affect both detection and unnecessary alerts.

Relative performance - Because the same dataset can be reused, systems, models, or configurations can be compared on a like-for-like basis. This enables fair benchmarking and defensible comparison.
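The last two points can be illustrated together. The sketch below (all configuration names, entity IDs, and numbers are invented for illustration) scores two hypothetical rule configurations against the same synthetic ground truth, yielding both entity-level recall and alert-level precision on a like-for-like basis:

```python
# Hedged sketch: scores two hypothetical configurations against the same
# synthetic dataset. All names and IDs are illustrative.

def score(alerts: set[str], truth: set[str]) -> dict:
    """Entity-level recall and alert-level precision against known truth."""
    true_hits = alerts & truth
    return {
        "recall": len(true_hits) / len(truth) if truth else 0.0,
        "precision": len(true_hits) / len(alerts) if alerts else 0.0,
    }

truth = {"E1", "E2", "E3", "E4", "E5"}

# Two rule configurations evaluated on the identical dataset.
config_a_alerts = {"E1", "E2", "E9", "E10"}       # 2 true hits, 2 false alerts
config_b_alerts = {"E1", "E2", "E3", "E4", "E9"}  # 4 true hits, 1 false alert

for name, alerts in [("config A", config_a_alerts),
                     ("config B", config_b_alerts)]:
    s = score(alerts, truth)
    print(f"{name}: recall={s['recall']:.0%}, precision={s['precision']:.0%}")
```

Because both configurations face identical scenarios, the difference in scores reflects the configurations themselves rather than differences in the data they happened to see.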

What synthetic AML testing does not measure

It is equally important to be clear about what is not measured. Synthetic AML testing does not assess:

  • Investigator decision quality

  • Case handling effectiveness

  • SAR drafting or submission accuracy

  • Operational efficiency or staffing capacity

Those outcomes depend on processes, policies, and people. Synthetic testing focuses on detection capability, not downstream operations.

How these measurements are used

In practice, synthetic AML testing supports a small number of well-defined use cases. Teams use it to:

  • Establish a baseline view of current detection performance

  • Compare systems, vendors, or configurations using identical scenarios

  • Validate whether model or rule changes deliver genuine improvements

  • Monitor performance over time by repeating controlled tests
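The monitoring use case, for example, amounts to repeating the same controlled test and watching for regressions. A minimal sketch (the recall figures and threshold are invented, purely to show the shape of the check):

```python
# Illustrative sketch: all numbers are invented. Repeating a controlled
# test over time and flagging regressions against a baseline.

baseline_recall = 0.80
history = [0.80, 0.82, 0.81, 0.71]  # recall from successive test runs

REGRESSION_THRESHOLD = 0.05  # flag if recall drops >5 points vs baseline

statuses = []
for run, recall in enumerate(history, start=1):
    regressed = (baseline_recall - recall) > REGRESSION_THRESHOLD
    statuses.append("REGRESSION" if regressed else "ok")
    print(f"run {run}: recall={recall:.0%} [{statuses[-1]}]")
```

Because the test dataset is held constant across runs, a drop in recall points to a change in the system rather than a shift in the data.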

In each case, the goal is the same: obtain clear evidence of how an AML system behaves when faced with known activity.

Where Syntheticr fits

Syntheticr replaces production data as the basis for AML testing and evaluation.

The platform combines:

  • Synthetic datasets covering transactions, customer profiles, and risk intelligence, with embedded money laundering activity and known ground truth

  • Objective performance scorecards that measure detection capability directly and show where systems underperform and why

  • Workflows that support one-off assessment or repeatable testing over time

Syntheticr is designed to help teams answer practical questions with confidence, without relying on production data or proxy signals.

Learn more

To learn more about Syntheticr and how it is used to test and evaluate AML systems without production data, see our FAQs.

