Synthetic demo report

ML Health Check — Sample Audit Report

The fraud detection model performs adequately on average but shows measurable drift, weak recall on high-value fraud cases, and subgroup recall gaps that should be fixed before scaling.

ML Health Check

71 / 100

Needs Attention

Demo asset only. Synthetic data. Not a real customer report.

Performance

78/100

AUC is strong, but recall is below target.

Drift

69/100

Digital goods traffic shifted sharply.

Bias & fairness

63/100

Recall gap on digital goods and gaming.

Production readiness

70/100

Monitoring and retention gaps remain.

Performance Metrics

Recall is 3.2 points below target. In a fraud use case, that gap means missed positives and avoidable exposure.

78/100

Findings

False negative rate is high for a payment fraud workflow.

High-confidence false negatives cluster around weekend evening transactions.

The model over-relies on patterns that no longer match the current transaction mix.

Recommended actions

Test a threshold change from 0.50 to 0.42 and measure recovered recall.

Add weekend and digital-goods features to the next model version.
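The threshold test recommended above can be sketched as follows. This is a minimal illustration, not the audit tool's implementation: the function names and the toy scores and labels are hypothetical, and a real test would run on a held-out, labeled validation set. It reports recall at both thresholds together with the share of transactions flagged, since a lower threshold recovers recall at the cost of more reviews.

```python
def recall_at(scores, labels, threshold):
    """Share of true fraud cases (label 1) scored at or above the threshold."""
    positive_scores = [s for s, y in zip(scores, labels) if y == 1]
    if not positive_scores:
        raise ValueError("no positive labels; recall is undefined")
    caught = sum(1 for s in positive_scores if s >= threshold)
    return caught / len(positive_scores)


def flag_rate_at(scores, threshold):
    """Share of all transactions that would be flagged at the threshold."""
    if not scores:
        raise ValueError("no scores provided")
    return sum(1 for s in scores if s >= threshold) / len(scores)


# Hypothetical toy data for illustration only (not taken from the report).
scores = [0.91, 0.47, 0.44, 0.38, 0.55, 0.12, 0.43, 0.08, 0.61, 0.30]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

recall_old = recall_at(scores, labels, 0.50)
recall_new = recall_at(scores, labels, 0.42)
recovered_recall = recall_new - recall_old

print(f"recall at 0.50: {recall_old:.2f}, flagged: {flag_rate_at(scores, 0.50):.2f}")
print(f"recall at 0.42: {recall_new:.2f}, flagged: {flag_rate_at(scores, 0.42):.2f}")
print(f"recovered recall: {recovered_recall:.2f}")
```

On this toy data, recall rises from 0.40 to 0.80 while the flagged share doubles from 0.30 to 0.60, which is the trade-off the threshold test should quantify on real data.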

Drift & Calibration

Current traffic differs from the reference window. The model is also overconfident for scores above 0.60, which can create noisy escalations for review teams.

69/100

Findings

Digital goods share increased from 11% to 19%.

Positive prediction rate rose from 2.8% to 4.1%.

High-confidence bins overstate the actual positive rate.
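One common way to put a single number on a category-mix shift like the one in these findings is the population stability index (PSI). The report does not say which drift metric it uses, so the sketch below is an assumption: it collapses traffic into two buckets (digital goods versus everything else) using the 11% and 19% shares quoted above. The function name `psi` is hypothetical.

```python
import math


def psi(reference_shares, current_shares):
    """Population stability index over matching buckets.

    Each argument is a list of bucket shares that sums to 1.
    Every share must be strictly positive for the logarithm to be defined.
    """
    if len(reference_shares) != len(current_shares):
        raise ValueError("bucket lists must have the same length")
    total = 0.0
    for ref, cur in zip(reference_shares, current_shares):
        if ref <= 0 or cur <= 0:
            raise ValueError("shares must be strictly positive")
        total += (cur - ref) * math.log(cur / ref)
    return total


# Two buckets: [digital goods, all other traffic], from the shares in the findings.
reference = [0.11, 0.89]
current = [0.19, 0.81]

digital_goods_psi = psi(reference, current)
print(f"two-bucket PSI: {digital_goods_psi:.4f}")
```

A two-bucket PSI understates drift compared with a full per-category breakdown, so it is best read as a lower bound on the shift rather than as the audit's drift score.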

Recommended actions

Retrain or fine-tune on recent digital-goods transactions.

Apply post-hoc calibration and rerun the reliability analysis quarterly.
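A reliability analysis of the kind recommended here can be sketched with equal-width score bins: for each bin, compare the mean predicted score with the observed positive rate. The function name, the bin count, and the toy data below are hypothetical; a positive gap means the model is overconfident in that bin.

```python
def reliability_table(scores, labels, n_bins=10):
    """Compare mean predicted score with observed positive rate per score bin."""
    bins = [[] for _ in range(n_bins)]
    for score, label in zip(scores, labels):
        # Clamp so that a score of exactly 1.0 falls in the last bin.
        index = min(int(score * n_bins), n_bins - 1)
        bins[index].append((score, label))

    rows = []
    for index, members in enumerate(bins):
        if not members:
            continue  # skip empty bins rather than report a rate of 0
        mean_score = sum(s for s, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        rows.append({
            "bin_low": index / n_bins,
            "count": len(members),
            "mean_score": mean_score,
            "observed_rate": observed_rate,
            "gap": mean_score - observed_rate,  # > 0 means overconfident
        })
    return rows


# Hypothetical toy data for illustration only (not taken from the report).
scores = [0.65, 0.68, 0.72, 0.75, 0.15, 0.12, 0.18, 0.11]
labels = [1, 0, 0, 1, 0, 0, 0, 1]

rows = reliability_table(scores, labels, n_bins=5)
for row in rows:
    print(row)
```

In this toy example the 0.6 to 0.8 bin has a mean score of about 0.70 against an observed rate of 0.50, the same overconfidence pattern the findings describe for high-confidence bins. Rerunning the table after calibration shows whether the gaps have closed.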

Production Readiness

Logging and labels are usable, but monitoring and feature snapshot retention are too weak for reliable incident investigation.

70/100

Findings

Feature snapshots are retained for 7 days instead of 90 days.

No automated drift alert surfaced the distribution shift.

Recommended actions

Configure a weekly positive-rate drift alert with a 15% threshold.

Extend feature snapshot retention to at least 90 days.
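The recommended drift alert could look like the sketch below. The report does not say whether the 15% threshold is relative or absolute, so this example assumes a relative change against the reference-window positive rate; the function name is hypothetical.

```python
def positive_rate_drift_alert(reference_rate, current_rate, threshold=0.15):
    """Return (alert_fired, relative_change) for a weekly positive-rate check.

    Assumption: the threshold is a relative change against the reference rate.
    """
    if reference_rate <= 0:
        raise ValueError("reference rate must be positive")
    relative_change = (current_rate - reference_rate) / reference_rate
    return abs(relative_change) > threshold, relative_change


# Positive prediction rates quoted in the drift findings: 2.8% to 4.1%.
fired, change = positive_rate_drift_alert(0.028, 0.041)
print(f"alert fired: {fired}, relative change: {change:.1%}")
```

Under this assumption, the shift from 2.8% to 4.1% is a relative increase of roughly 46%, so an alert of this kind would have surfaced the distribution shift noted in the findings.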
