Getting Started Guide

Everything you need to complete your Syntheticr trial

1/ Download

Receive your unique dataset link and review the data structure.

2/ Detect

Process the datasets through your AML system to identify suspicious entities.

3/ Scorecard

Upload entity IDs and receive an objective PDF scorecard by email.

OVERVIEW

  • One synthetic financial institution

  • Fixed dataset (approx. 3GB)

  • 24 months of transactions

  • Risk intelligence for first 18 months

  • Final 6 months unlabelled

Trial Scope

  • One alert submission

  • One PDF scorecard

  • 72-hour turnaround

  • Manual upload only

  • No API access

Trial Limits

/ STEP 1

Download and Understand the Dataset

After activating your trial, you will receive an Open DeltaShare link by email. The link is unique to you and remains active for 7 days.

Dataset Characteristics

  • Size: 5–7GB

  • Structure: ISO20022 aligned

  • Components: Entity profiles, transactions, and risk intelligence

When joining tables, always include bank_id in your join conditions.
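The join rule above can be sketched with Python's built-in sqlite3 module. The table names, column names, and rows below are hypothetical and exist only for illustration; the real schema is in the data dictionary. The example assumes that an entity_id value could recur under more than one bank_id, which the guide does not state explicitly.

```python
import sqlite3

# Hypothetical tables and rows for illustration only;
# consult the data dictionary for the real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (bank_id TEXT, entity_id TEXT, entity_type TEXT);
    CREATE TABLE transactions (bank_id TEXT, entity_id TEXT, amount REAL);
    INSERT INTO entities VALUES ('BANK-A', 'CUST-000001', 'individual');
    INSERT INTO entities VALUES ('BANK-B', 'CUST-000001', 'business');
    INSERT INTO transactions VALUES ('BANK-A', 'CUST-000001', 250.0);
""")

# Join on bank_id AND entity_id so rows from different
# institutions can never be matched to each other.
rows = conn.execute("""
    SELECT t.bank_id, t.entity_id, e.entity_type, t.amount
    FROM transactions AS t
    JOIN entities AS e
      ON e.bank_id = t.bank_id
     AND e.entity_id = t.entity_id
""").fetchall()

print(rows)  # [('BANK-A', 'CUST-000001', 'individual', 250.0)]
```

Joining on entity_id alone would return two rows in this toy example, one of them attaching the BANK-B profile to a BANK-A transaction.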

Syntheticr is engineered to mirror operational AML environments. These properties are deliberate and should be considered when processing the data.

Dataset Design

  • The dataset spans a time period of 24 months.

  • Risk intelligence is available for the first 18 months.

  • The final 6 months contain no risk intelligence to support unbiased evaluation.

  • The first 6 months act as a behavioural calibration period and do not contain alerts.

  • Risk intelligence includes true positives and false positives with realistic signal-to-noise ratios.

  • SAR filing does not automatically trigger exit.

  • Operational close-out windows may show post-exit transactions.

  • Counterparty completeness reflects institutional visibility limits.
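The timeline described in the list above can be expressed as a small helper that assigns a transaction date to one of three windows. This is a minimal sketch: DATASET_START is a hypothetical placeholder, the window names are invented for illustration, and the code assumes the dataset begins on the first day of a calendar month.

```python
from datetime import date

# Hypothetical start date; replace it with the earliest
# transaction date found in your copy of the dataset.
DATASET_START = date(2022, 1, 1)


def months_since_start(d: date, start: date = DATASET_START) -> int:
    """Whole calendar months between start and d (0 for the first month)."""
    return (d.year - start.year) * 12 + (d.month - start.month)


def dataset_window(d: date) -> str:
    """Classify a date into the calibration, labelled, or unlabelled window."""
    m = months_since_start(d)
    if m < 0 or m >= 24:
        raise ValueError(f"{d} falls outside the 24-month dataset")
    if m < 6:
        return "calibration"  # months 1-6: behavioural baseline, no alerts
    if m < 18:
        return "labelled"     # months 7-18: risk intelligence available
    return "unlabelled"       # months 19-24: no risk intelligence


print(dataset_window(date(2022, 3, 14)))  # calibration
print(dataset_window(date(2023, 2, 1)))   # labelled
print(dataset_window(date(2023, 11, 30))) # unlabelled
```

A split like this makes it easier to tune on the labelled window while keeping the final 6 months untouched for evaluation.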

Before processing the dataset, you may wish to review the schema and table definitions:

View Technical Reference →

Download Data Dictionary (XLSX) →

Technical References

/ STEP 2

Identify the Criminal Entities

Process the Syntheticr dataset exactly as you would production data to identify the entities involved in money laundering.

Generate alerts at the transaction or entity level using your existing rules, models, or hybrid approach.

To ensure an accurate performance assessment, this process should replicate your normal operating model as closely as possible.
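If your system alerts at the transaction level, the alerts need to be rolled up to entities before submission. The sketch below shows one simple way to do that. The alert structure and transaction IDs are hypothetical, and deciding which party to a transaction is the alerted entity remains part of your own operating model.

```python
# Hypothetical transaction-level alerts as (transaction_id, entity_id) pairs.
transaction_alerts = [
    ("TXN-1", "CUST-000234"),
    ("TXN-2", "CUST-004921"),
    ("TXN-3", "CUST-000234"),
]


def alerted_entities(alerts):
    """Return the sorted, de-duplicated entity IDs behind a list of alerts."""
    return sorted({entity_id for _, entity_id in alerts})


entity_ids = alerted_entities(transaction_alerts)
print(entity_ids)  # ['CUST-000234', 'CUST-004921']
```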

/ STEP 3

Request your Performance Scorecard

To request your scorecard, upload a CSV file containing the entity_id values you have identified as suspicious to the Syntheticr App.

Example Submission

entity_id

CUST-000234

CUST-004921

CUST-009115

Supported formats

  • CSV

  • Single column of entity_id values

  • Header row recommended

Submission Rules

  • Only include entities you suspect of being involved in criminality.

  • Do not submit more than 30% of the total population.

  • Only one submission is permitted during the trial.
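Because only one submission is permitted, it can be worth checking the file before upload. The following is a minimal sketch that writes a single-column CSV with a header row and enforces the 30% cap. The function name and the population_size parameter are hypothetical; the total population has to be counted from your own copy of the entity data.

```python
import csv
import io

MAX_SHARE = 0.30  # submission rule: no more than 30% of the total population


def build_submission(entity_ids, population_size):
    """Return CSV text with an entity_id header and one unique ID per row."""
    unique_ids = sorted(set(entity_ids))
    if len(unique_ids) > MAX_SHARE * population_size:
        raise ValueError(
            f"{len(unique_ids)} entities exceeds 30% of {population_size}"
        )
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["entity_id"])
    writer.writerows([eid] for eid in unique_ids)
    return buf.getvalue()


csv_text = build_submission(
    ["CUST-000234", "CUST-004921", "CUST-009115", "CUST-000234"],
    population_size=1000,  # hypothetical population count
)
print(csv_text)
```

Duplicate IDs are collapsed before the cap is checked, so the count reflects distinct entities. To save the result, write csv_text to a file such as submission.csv (a hypothetical filename).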

After submission

  • Your file will be ingested, validated, and processed.

  • Your results will be compared against the ground truth.

  • Your scorecard (PDF) will be delivered via email within 72 hours.

ASSESSMENT

Your Scorecard

Your scorecard provides an objective assessment of your detection performance against known ground truth.

Overall Performance

Use this section to understand baseline detection and alert quality at a headline level. This section includes:

  • Entities submitted

  • Alert precision

  • Overall detection rate

  • Detection grade

Detection grades range from A* (95–100%) to E (<30%) and reflect entity-level detection capability.

This provides a clear baseline of current performance.

Performance by Dimension

Use this section to identify where performance is strong vs inconsistent across multiple dimensions:

  • Financial institution

  • Customer type (individual vs business)

  • Methodology (typology-level performance)

  • Transaction type

  • Risk intelligence (alerts, SARs, exits)

  • Network detection

Each section highlights strengths, gaps, and areas where detection may be inconsistent.

Interpreting Your Results

Precision measures how many of your alerts correctly identify criminal entities.

Detection rate measures how many criminal entities were successfully identified.

Precision and detection rate should be interpreted together to avoid optimising for one at the expense of the other:

  • High detection + low precision indicates over-alerting.

  • High precision + low detection indicates under-detection.
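The two measures can be illustrated with the standard definitions of precision and recall. This is a sketch under that assumption, not the scorecard's published formula, and the entity IDs below are toy values rather than real ground truth.

```python
def precision_and_detection_rate(submitted, criminal):
    """Compute (precision, detection rate) from two collections of entity IDs."""
    submitted, criminal = set(submitted), set(criminal)
    true_positives = len(submitted & criminal)
    # Precision: share of submitted entities that are truly criminal.
    precision = true_positives / len(submitted) if submitted else 0.0
    # Detection rate: share of criminal entities that were submitted.
    detection_rate = true_positives / len(criminal) if criminal else 0.0
    return precision, detection_rate


# Toy data: 4 entities submitted, 5 criminal entities, 2 in common.
submitted = {"CUST-000234", "CUST-004921", "CUST-009115", "CUST-000777"}
criminal = {"CUST-000234", "CUST-004921", "CUST-005000",
            "CUST-006000", "CUST-007000"}

precision, detection_rate = precision_and_detection_rate(submitted, criminal)
print(precision, detection_rate)  # 0.5 0.4
```

Submitting more entities can only raise or hold the detection rate, but it tends to lower precision, which is why the two figures are read together.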

The scorecard measures detection capability only. It does not assess investigation quality or SAR drafting.

NEXT STEPS

After completing the trial

The Syntheticr trial includes data for one financial institution and a single PDF scorecard requested via manual upload. The trial demonstrates the methodology and scoring framework. Paid plans extend scope, automation, frequency, and depth.

Starter is designed for occasional or ad‑hoc testing against a limited data scope.

  • Access to data for one institution

  • One scorecard per month

  • PDF scorecards

Pro is built for teams that test regularly as part of ongoing operational processes.

  • Access to data for two institutions

  • Configurable data quality

  • API-based submission

Enterprise supports continuous, automated testing at scale.

  • Access to data for six institutions

  • Machine-readable scorecards

  • API-based results

At the end of your 30-day trial:

  • Card-based subscriptions convert automatically unless cancelled.

  • Invoice-based subscriptions receive an invoice per agreed terms.

For questions about upgrading or extending access, please contact hello@syntheticr.ai