At window 3, every metric looked perfect.
RWSS = 1.000. Output probabilities unchanged. No labels moved.
Everything said “all clear.”
Then the alert fired anyway.
Window 3: severity=warning RWSS=1.000 fired=True ← FIDI Z fires here
The model’s predictions didn’t know anything was wrong yet.
But the symbolic layer did.
This is what actually happened in the experiment — and why it matters for anyone running fraud models in production.
Full code: https://github.com/Emmimal/neuro-symbolic-drift-detection
TL;DR: What You Will Get From This Article
- FIDI Z-Score detects concept drift in 5 of 5 seeds, sometimes before F1 drops, with zero labels required
- RWSS alone missed 3 of 5 seeds. A Z-score extension of FIDI is what makes it work
- Covariate drift is a complete blind spot. It needs a separate raw-feature monitor
- The alert system is ~50 lines of code and the difference between a scheduled retrain and an emergency one
Not familiar with the series? Hybrid Neuro-Symbolic Fraud Detection: Guiding Neural Networks with Domain Rules covers the architecture. How a Neural Network Learned Its Own Fraud Rules: A Neuro-Symbolic AI Experiment explains how the model discovers its own rules. This is the drift detection chapter.
The Story So Far
This is Part 3 of a series. New here? One paragraph is all you need.
A HybridRuleLearner trains two parallel paths: an MLP for detection and a rule path that learns symbolic IF-THEN conditions from the same data. The rule path found V14 on its own across two seeds, without being told to look for it. That learned rule (IF V14 < −1.5σ → Fraud) is now the thing being monitored. This article asks what happens when V14 starts behaving differently.
Can the rules act as a canary? Can neuro-symbolic concept drift monitoring work at inference time, without labels?
Three Ways Fraud Can Change
Concept drift fraud detection is harder than it sounds because only one of the three common drift types actually changes what the model’s learned associations mean. The experiment simulates three types of drift on the Kaggle Credit Card Fraud dataset (284,807 transactions, 0.17% fraud rate) across 8 progressive windows each [9].
Covariate drift. The input feature distributions shift. V14, V4, and V12 move by up to +3.0σ progressively. Fraud patterns stay the same. The world just looks a little different.
Prior drift. The fraud rate increases from 0.17% toward 2.0%. Features are unchanged. Fraud becomes more common.
Concept drift. The sign of V14 is gradually flipped for fraud cases across 8 windows. By the end, the transactions the model learned to flag as fraud now look like legitimate ones. The rule IF V14 < −1.5σ → Fraud is now pointing in the wrong direction.
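For readers who want to reproduce the setup, a sign-flip injection along these lines is straightforward. The function below is an illustrative sketch, not the repo's exact code; `v14_idx`, the seed, and the rounding choice are assumptions:

```python
import numpy as np

def inject_concept_drift(X, y, v14_idx, window, n_windows=8, seed=42):
    """Gradually flip V14's sign for fraud rows: no flips at window 0,
    every fraud row flipped by the final window."""
    rng = np.random.default_rng(seed)
    X = X.copy()
    flip_fraction = window / (n_windows - 1)          # 0.0 at w0 -> 1.0 at w7
    fraud_rows = np.where(y == 1)[0]
    n_flip = int(round(len(fraud_rows) * flip_fraction))
    flip_rows = rng.choice(fraud_rows, size=n_flip, replace=False)
    X[flip_rows, v14_idx] *= -1.0                     # reverse the learned relationship
    return X
```

By the final window, the transactions the rule was trained to flag look like their mirror image along V14.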
That third one is the one that should worry you in production. With covariate and prior drift, there are external signals. Input distributions shift, or fraud rates visibly change. You can monitor those independently. Concept drift leaves no such footprint. The only thing that changes is what the model’s learned associations mean. You will not know until F1 starts falling.
Unless something sees it first.
The Problem With the First Three Metrics
The model from How a Neural Network Learned Its Own Fraud Rules: A Neuro-Symbolic AI Experiment produced three label-free monitoring signals as a by-product of the symbolic layer. The idea: if the rules are learning fraud patterns, changes in how those rules fire should reveal when fraud patterns are shifting.
I expected the first one to be the early warning. It was not.
The problem is specific to how this model trains. All five seeds converged between epochs 3 and 10 (Val PR-AUC: 0.7717, 0.6915, 0.6799, 0.7899, 0.7951), when temperature τ is still between 3.5 and 4.0. At that temperature, rule activations are soft. Every input produces a near-identical activation score regardless of its actual features. In plain terms: the rules were firing almost the same way on every transaction, clean or drifted. A similarity metric on near-constant vectors returns 1.000 almost all the time. The first signal, RWSS, only fired in 2 of 5 seeds for concept drift, and in both cases it was the same window as F1 or later.
Why high temperature makes monitoring harder
The LearnableDiscretizer uses a sigmoid gated by temperature τ: σ((x − θ) / τ). At τ = 5.0 (epoch 0), that sigmoid is nearly flat: every feature value produces an activation close to 0.5 regardless of where it sits relative to the learned threshold. As τ anneals toward 0.1, the sigmoid sharpens into a near-binary step. Early stopping fires at τ ≈ 3.5–4.0, before the rules have fully crystallised. The result: activation vectors are near-constant across all inputs, so any similarity metric between them stays near 1.000 even when fraud patterns are genuinely shifting.
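The flattening is easy to verify numerically. The threshold θ and sample values below are illustrative:

```python
import numpy as np

def soft_activation(x, theta=0.0, tau=1.0):
    """Temperature-gated sigmoid, as in a LearnableDiscretizer-style gate."""
    return 1.0 / (1.0 + np.exp(-(x - theta) / tau))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])   # feature values in σ units

# At τ = 4.0 (the early-stopped regime) everything hugs 0.5
print(soft_activation(x, tau=4.0))   # ≈ [0.32, 0.44, 0.50, 0.56, 0.68]

# At τ = 0.1 (fully annealed) the gate is a near-binary step
print(soft_activation(x, tau=0.1))   # ≈ [0.00, 0.00, 0.50, 1.00, 1.00]
```

A feature three standard deviations from the threshold barely moves the activation at high τ, which is exactly why similarity metrics on those vectors saturate at 1.000.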
The second signal had the opposite problem. The absolute change in any feature’s contribution is tiny (values in the 0.001–0.005 range) because the rule weights themselves are small at an early-stopped checkpoint. In plain terms: the signal was real but invisible at the scale we were measuring it. A fixed absolute threshold of 0.02 never fires.
Here is what those three original signals are:
- RWSS (Rule Weight Stability Score): cosine similarity between the baseline mean rule activation vector and the current one. In simple terms: are the rules still firing the same way they did on clean data?
- FIDI (Feature Importance Drift Index): how much each feature’s contribution to rule activations has changed from the baseline. In simple terms: has any specific feature become more or less important to the rules?
- RFR (Rule Firing Rate): what fraction of transactions fire each rule.
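As a point of reference, RWSS itself reduces to a few lines. This sketch assumes activations arrive as an (n_samples, n_rules) array; the repo's exact implementation may differ:

```python
import numpy as np

def compute_rwss(baseline_acts, current_acts):
    """Cosine similarity between the baseline and current mean
    rule-activation vectors. 1.0 means the rules fire exactly as
    they did on clean data."""
    b = baseline_acts.mean(axis=0)
    c = current_acts.mean(axis=0)
    return float(np.dot(b, c) / (np.linalg.norm(b) * np.linalg.norm(c) + 1e-12))
```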
That diagnosis led to the right question. Instead of asking “has FIDI changed by more than X?”, the right question is “has FIDI changed by more than X standard deviations from its own history?”
That question has a different answer. And the answer is V14.
The Metrics: Building a Label-Free Drift Detection System
Three new metrics joined the original three.
RWSS Velocity measures the per-window rate of change: RWSS[w] − RWSS[w−1]. A sudden drop of more than 0.03 per window fires an alert even before the absolute value crosses the threshold. If RWSS is falling at −0.072 in one step, that is a signal regardless of where it started.
FIDI Z-Score is the one that actually worked. Rather than a brand new signal, it is a simple extension of FIDI using Z-score normalisation against the feature's own window history. Instead of asking whether the absolute change crosses a fixed threshold, it asks whether the change is anomalous relative to what that feature has been doing.

Unlike traditional drift detection methods that rely on input distributions or output labels, this approach operates purely on the symbolic layer, which means it works at inference time, with no ground truth required. It builds on differentiable rule-learning work including ∂ILP [3], FINRule [4], RIFF [5], and Neuro-Symbolic Rule Lists [6], extending those representations with Z-score normalisation rather than fixed thresholds.

V14's contribution to rule activations during the clean baseline windows is small and flat. Near zero, stable, predictable. When concept drift begins at window 3, it shifts. Not by much in absolute terms. But by 9.53 standard deviations relative to the history it built during stable windows. That is an enormous relative anomaly, and no threshold calibration is needed to catch it.
PSI on Rule Activations was designed to catch distributional shift in the symbolic layer before the MLP’s compensation masks it at the output level. It did not work here. The soft activations from early-stopped training (τ ≈ 3.5–4.0 at the saved checkpoint) cluster near 0.5, producing near-uniform distributions that PSI cannot distinguish. PSI_rules = 0.0049 throughout the entire experiment. PSI_rules never fired. It is in the codebase for when models with fully crystallised rules (τ < 0.5) are available. In this experiment it contributed nothing.
The intended detection order, from earliest to latest:
RWSS Velocity → FIDI Z-Score → PSI(rules) → RWSS absolute → F1 (label-based)
Here is what actually happened.
Results: What Each Metric Did
Concept Drift
| Seed | F1 fires | RWSS fires | VEL fires | FIDIZ fires | PSIR fires |
|---|---|---|---|---|---|
| 42 | W3 | W4 (1w late) | W4 (1w late) | W3 (simultaneous) | — |
| 0 | W3 | — | — | W3 (simultaneous) | — |
| 7 | W4 | W4 (simultaneous) | W4 (simultaneous) | W3 (+1w early) | — |
| 123 | W3 | — | — | W3 (simultaneous) | — |
| 2024 | W4 | — | — | W3 (+1w early) | — |
FIDI Z-Score fires in 5 of 5 seeds, always at window 3. F1 fires at W3 in three seeds and W4 in two. The mean FIDIZ detection lag is +0.40 windows, meaning it leads F1 on average. In seeds 7 and 2024 it fires one full window before F1 drops. In the remaining three seeds it fires simultaneously. It never fires after F1 for concept drift. Not once.
Across all drift types, FIDI Z-Score is the only metric that detected concept drift in every seed and never lagged behind F1. For label-free drift detection fraud systems, that is the headline result.
RWSS fires in 2 of 5 seeds and in both cases simultaneously with or after F1. Velocity matches RWSS exactly, same window, every time. PSI on rule activations never fires at all.
Concept Drift vs Covariate Drift: Why Symbolic Monitoring Has Blind Spots
Covariate drift is where the symbolic layer goes completely silent.
Every symbolic metric: 0 of 5 seeds. Not one signal. Not one window. F1 eventually fires in 4 of 5 seeds at W6 or W7, slowly and late, and the symbolic layer had nothing to do with it. This is not a gap that better tuning will close. It is a fundamental property of what the symbolic layer measures.
The reason is mechanical. When V14, V4, and V12 shift by +3.0σ, the shift is uniform across all samples. The learnable discretizer computes thresholds relative to the data. Each sample still lands in roughly the same threshold bin relative to its neighbours. Rules fire on approximately the same proportion of transactions. Nothing in the activation pattern changes. Cosine similarity of mean activations stays at 1.0.
In simple terms: if every transaction shifts by the same amount, the rules still see the same relative picture. Transaction A was above the threshold before. It is still above the threshold after. The fraud-vs-legitimate ordering is preserved. RWSS measures that ordering, not the absolute values. Think of it as a tide that lifts all boats equally. The boats stay in the same order. RWSS only measures the order.
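The tide analogy can be checked numerically. This toy assumes, as described above, that the discretizer's threshold is effectively relative to the data, modelled here by standardising before the sigmoid:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=1000)     # clean V14-like feature
x_shifted = x + 3.0                     # uniform +3.0σ covariate shift

def activations(v, tau=0.5):
    """Sigmoid gate with a data-relative threshold (here, the mean)."""
    z = (v - v.mean()) / v.std()        # every sample keeps its relative position
    return 1.0 / (1.0 + np.exp(-z / tau))

a_clean = activations(x)
a_drift = activations(x_shifted)

# Same relative picture: mean activations essentially identical
print(np.abs(a_clean.mean() - a_drift.mean()))   # ~0.0
```

The shift cancels out inside the standardisation, so the symbolic layer sees nothing, which is precisely the blind spot.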
If covariate drift is a concern in your deployment, you need a separate input-space monitor: PSI on raw features, a KS test on V14, or a data quality check. The symbolic layer cannot help you there. Symbolic layer drift monitoring has one blind spot, and covariate shift is it.

Prior Drift
FIDIZ fires in 5 of 5 seeds, always at W3. But prior drift causes F1 to drop at W0 (seed 123) or W2 (seed 2024) in the two seeds where F1 fires at all. FIDIZ detection lag for prior drift: −2.00 windows. It fires two windows after F1.
This is not a calibration problem. FIDIZ needs a minimum of 3 clean windows to build a history before its Z-score is meaningful. Prior drift that causes an immediate fraud rate jump is already visible in F1 before FIDIZ can even start computing. A rolling fraud rate counter will always be faster here.

The Alert Demo: Window 3
Here is the moment the whole system was built for.
DriftAlertSystem is built once from the validation set immediately after training. It stores the baseline. Then .check() is called on each new window. No labels. No retraining. This is inference-only drift detection: the system reads the symbolic layer and nothing else.
Seed 42, concept drift, 8 windows:
Window 0: severity=none RWSS=0.999 fired=False
Window 1: severity=none RWSS=0.999 fired=False
Window 2: severity=none RWSS=0.999 fired=False
Window 3: severity=warning RWSS=1.000 fired=True ← FIDI Z fires here
Window 4: severity=critical RWSS=0.928 fired=True ← RWSS absolute confirms
Window 5: severity=warning RWSS=0.928 fired=True
Window 6: severity=warning RWSS=0.928 fired=True
Window 7: severity=warning RWSS=0.928 fired=True
At window 3, RWSS is exactly 1.000. The activation pattern is perfectly identical to baseline. Output probabilities have not changed. Nothing in the standard monitoring stack has moved.
And the alert fires at WARNING severity.
The reason is V14. Its Z-score is −9.53. That means V14’s contribution to rule activations has shifted to nearly 10 standard deviations below the baseline it established during clean windows. The model’s output does not know yet. The MLP is compensating. But the rule path cannot compensate. It was trained to express a fixed symbolic relationship. It is screaming.
One window later, the MLP stops holding. RWSS drops to 0.928. Velocity falls 0.072 in one step. Severity escalates to CRITICAL.
═══════════════════════════════════════════════════════
DRIFT ALERT | severity: CRITICAL
Earliest signal: VELOCITY
═══════════════════════════════════════════════════════
── Early-Warning Layer ─────────────────────────────
RWSS Velocity : -0.0720 [threshold -0.03] ⚠ FIRED
FIDI Z-Score : ⚠ FIRED
V14 Z = -9.53
PSI (rules) : 0.0049 [moderate≥0.10] stable
── Confirmed Layer ─────────────────────────────────
RWSS absolute : 0.9276 [threshold 0.97] ⚠ FIRED
Rules gone silent: 0 OK
Mean RFR change : -0.001
Recommended action:
→ Retrain immediately. Do not deploy.
═══════════════════════════════════════════════════════
The report names VELOCITY as the earliest layer. That is a priority order in the internal logic. In actual window timing, FIDI Z-Score fired one window earlier at W3. The W3 WARNING is the earlier human-facing alert. The one that gives you time to act before the CRITICAL fires.

Why FIDI Z-Score Sees It Before F1 Does
The model has two paths running in parallel from the same input.
The MLP path carries 88.6% of the final output (mean α = 0.886 across seeds; α is the learned blend weight; 0.886 means the neural network does 88.6% of the prediction work and the symbolic rules do the remaining 11.4%). When concept drift gradually reverses V14’s relationship to fraud labels, the MLP, trained on 284,000 transactions, partially absorbs that change. Its internal representations shift. Output probabilities stay roughly stable for at least one window. This is the MLP compensating.
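For concreteness, the blend described above is a convex mixture. The function name here is illustrative, not the repo's API:

```python
def blended_output(p_mlp, p_rules, alpha=0.886):
    """Final fraud probability: learned mixture of the two paths."""
    return alpha * p_mlp + (1.0 - alpha) * p_rules

# Because the MLP holds steady under early concept drift, even a large
# rule-path swing of 0.4 moves the blended output by only ~0.046
print(blended_output(0.90, 0.80) - blended_output(0.90, 0.40))
```

This is why the output probabilities look stable at window 3: the 11.4% rule path can be screaming while the 88.6% MLP path drowns it out.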
The rule path carries 11.4%. It was trained to express the MLP’s knowledge in symbolic form: V14 below a threshold means fraud [2]. That relationship is fixed and explicit. When V14 flips sign for fraud cases, the rule’s V14 contribution does not adjust. It simply stops working. The bit activations for V14 change direction. The rule starts firing on the wrong transactions.
The neural network adapts. The symbolic layer does not. And that is exactly why the symbolic layer detects the drift first.
That asymmetry is what FIDI Z-Score exploits.
The absolute change in V14’s contribution is tiny (values in the 0.001 to 0.005 range) because rule weights are small at an early-stopped checkpoint. A fixed absolute threshold never catches it.

But V14's contribution history through the clean windows is equally flat: near zero, stable, predictable. So when concept drift moves it at window 3, the Z-score is −9.53. The same pattern as before: near-zero absolute change, extreme relative shift.
The symbolic layer compensates less than the MLP, so it shows the drift first. FIDI Z-Score makes the signal visible by comparing each feature not to a fixed threshold, but to its own history.
But this only holds for one of the three drift types. The other two are a different story entirely.
What This System Cannot Do
A system that claims early warning invites overstatement. Here is what the data actually says. This is label-free anomaly detection fraud monitoring, which means the constraints are structural, not tunable.
Covariate drift is a complete blind spot. 0 of 5 seeds. The mechanism is explained in the Results section above. Use PSI on raw features or a KS test on V14 instead.
FIDIZ fires late on prior drift by design. When the fraud rate jumps, F1 reacts at W0 or W2. FIDIZ structurally cannot fire before W3. It needs history that does not yet exist. A rolling fraud rate monitor responds faster.
PSI on rule activations produced nothing. PSI_rules = 0.0049 throughout every window of every seed. Soft activations from early-stopped training cluster near 0.5, and PSI on near-uniform distributions is insensitive regardless of what is actually happening. This metric is in the codebase and may work with fully annealed models (τ < 0.5). In this experiment it was silent.
5 seeds is evidence, not proof. FIDIZ fires at W3 for concept drift across all 5 seeds. That is consistent and encouraging. It is not the same as reliable in production across datasets, fraud types, and drift severities you have not tested. 5 seeds is a starting point, not a conclusion. More seeds, more drift configurations, and real-world validation are needed before strong deployment claims.
Results Summary
The pattern is clearest when stated plainly first. Think of this as an early warning concept drift system with three distinct modes depending on what is changing. Covariate drift: the symbolic layer saw nothing, F1 caught it slowly. Prior drift: the symbolic layer fired after F1, not before. Concept drift: FIDI Z-Score fired in every single seed, always at or before F1, averaging +0.40 windows of lead time.
| Drift type | F1 fired | RWSS fired | FIDIZ fired | FIDIZ mean lag |
|---|---|---|---|---|
| Covariate | 4/5 | 0/5 | 0/5 | — |
| Prior | 2/5 | 0/5 | 5/5 | −2.00w (late) |
| Concept | 5/5 | 2/5 | 5/5 | +0.40w (early) |
Lag = windows before F1 alert. Positive = FIDIZ fires first. Negative = F1 fires first.

Building It
The system is designed to be used in production, not just in a notebook.
```python
# Once, immediately after training
X_val_t = torch.FloatTensor(X_val)
alert_system = DriftAlertSystem.from_trained_model(model, X_val_t, feature_names)
alert_system.save("results/drift_alert_baseline_seed42.pkl")

# Every scoring run — weekly, daily, per-batch
alert_system = DriftAlertSystem.load("results/drift_alert_baseline_seed42.pkl")
alert = alert_system.check(model, X_this_week)
if alert.fired:
    print(alert.report())
```
No labels. No retraining. No infrastructure beyond saving a pickle file next to the model checkpoint. The .check() call computes RWSS velocity, FIDI Z-Score, PSI on activations, and RWSS absolute in that order, using PyTorch [7] and scikit-learn [8]. Severity escalates from none to warning to critical based on how many fire and how far RWSS has dropped.
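That escalation can be sketched in a few lines. This is a hedged reconstruction, not the repo's exact logic; the 0.97 RWSS threshold matches the report shown earlier, the rest is assumed:

```python
def drift_severity(vel_fired, fidi_z_fired, psi_fired, rwss_abs):
    """Map fired signals to none / warning / critical severity."""
    early = sum([vel_fired, fidi_z_fired, psi_fired])   # early-warning layer
    if rwss_abs < 0.97 and early >= 1:
        return "critical"      # confirmed layer agrees with an early signal
    if early >= 1 or rwss_abs < 0.97:
        return "warning"       # one layer fired on its own
    return "none"

print(drift_severity(False, True, False, 1.000))   # 'warning'  (window 3)
print(drift_severity(True, True, False, 0.928))    # 'critical' (window 4)
```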
The three early-warning computations are each a few lines.
RWSS Velocity: rate of change per window.
```python
from typing import List

def compute_rwss_velocity(rwss_history: List[float]) -> float:
    """Per-window rate of change: RWSS[w] - RWSS[w-1]."""
    if len(rwss_history) < 2:
        return 0.0
    return float(rwss_history[-1] - rwss_history[-2])

# Alert fires when drop > 0.03 per window
vel_fired = rwss_velocity < -0.03
```
FIDI Z-Score: normalise feature contribution anomaly against history.
```python
import numpy as np

def compute_fidi_zscore(fidi_history, current_fidi, min_history=3):
    """Z-score of each feature's FIDI against its own window history."""
    if len(fidi_history) < min_history:
        return {k: 0.0 for k in current_fidi}   # not enough history yet
    z_scores = {}
    for feat_idx, current_val in current_fidi.items():
        history_vals = [h.get(feat_idx, 0.0) for h in fidi_history]
        mean_h = np.mean(history_vals)
        std_h = np.std(history_vals)
        z_scores[feat_idx] = (current_val - mean_h) / std_h if std_h > 1e-8 else 0.0
    return z_scores

# Alert fires when any feature Z > 2.5
fidi_z_fired = any(abs(z) > 2.5 for z in z_scores.values())
```
PSI on Rule Activations: distributional shift in the symbolic layer (included for completeness).
```python
import numpy as np

def compute_psi_rules(baseline_acts, current_acts, n_bins=10):
    """Mean Population Stability Index across rule activation columns."""
    bins = np.linspace(0, 1, n_bins + 1)
    psi_per_rule = []
    for r in range(baseline_acts.shape[1]):
        b = np.histogram(baseline_acts[:, r], bins=bins)[0] + 1e-6
        c = np.histogram(current_acts[:, r], bins=bins)[0] + 1e-6
        b /= b.sum()
        c /= c.sum()
        psi_per_rule.append(float(np.sum((c - b) * np.log(c / b))))
    return np.mean(psi_per_rule)
```
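Its silence is easy to reproduce. With soft, early-stopped activations clustered near 0.5, almost all histogram mass lands in the same two bins in both batches, so PSI stays near zero. The distribution parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
# Soft, early-stopped activations: both batches cluster near 0.5
baseline = np.clip(rng.normal(0.5, 0.05, size=5000), 0, 1)
current = np.clip(rng.normal(0.5, 0.05, size=5000), 0, 1)

bins = np.linspace(0, 1, 11)
b = np.histogram(baseline, bins=bins)[0] + 1e-6
c = np.histogram(current, bins=bins)[0] + 1e-6
b, c = b / b.sum(), c / c.sum()
psi = float(np.sum((c - b) * np.log(c / b)))
print(psi)   # well below the 0.10 "moderate" line
```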
V14: Three Articles, One Feature
This is the part I did not plan. But V14 concept drift behaviour turns out to be the thread that ties all three articles together.
Guiding Neural Networks with Domain Rules: I wrote rules about large transaction amounts and anomalous PCA norms. Reasonable intuitions. Nothing to do with V14.
How a Neural Network Learned Its Own Fraud Rules: The model found V14 anyway. Given 30 anonymised features and no guidance, the gradient landed on the one feature with the highest absolute correlation to fraud. Twice, across two independent seeds.
This article: I deliberately made V14 break. I flipped its sign for fraud cases, gradually, across 8 windows. And FIDI Z-Score registered the collapse at −9.53 standard deviations while RWSS was still 1.000 and F1 had not moved.

The same feature, three different roles: ignored, discovered, then monitored as the first thing to fail. That coherence was not engineered. It is what reproducible multi-seed evaluation on a consistent dataset keeps producing.
What to Do With This
Use FIDI Z-Score for concept drift detection without labels. It fires in 5 of 5 seeds, requires only 3 windows of history, never fires after F1, and needs no labels. Keep the Z-score threshold at 2.5 and minimum history at 3 windows.
Add a separate input-space monitor for covariate drift. PSI on raw features or a KS test on critical features like V14. The symbolic layer is blind to distributional shifts that preserve relative activation order.
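A minimal version of that KS monitor, using `scipy.stats.ks_2samp`. The function name and the conventional 0.05 significance cutoff are assumptions, not part of the repo:

```python
import numpy as np
from scipy.stats import ks_2samp

def covariate_drift_check(baseline_v14, current_v14, alpha=0.05):
    """KS test on a single critical feature; fires when distributions differ."""
    stat, p_value = ks_2samp(baseline_v14, current_v14)
    return p_value < alpha, stat

rng = np.random.default_rng(0)
clean = rng.normal(0.0, 1.0, 5000)
drifted = clean + 3.0            # the +3.0σ shift the symbolic layer misses

fired, stat = covariate_drift_check(clean, drifted)
print(fired, stat)               # fires on the raw feature immediately
```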
Use a rolling fraud rate counter for prior drift. FIDIZ structurally cannot fire before W3. A label-based rate counter fires at W0.
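That counter is the simplest monitor in the stack. A sketch, with the window size and the 2x-baseline trigger as illustrative assumptions; note it is label-based, unlike everything else here:

```python
from collections import deque

class RollingFraudRate:
    """Label-based prior-drift monitor: fires when the recent fraud rate
    exceeds a multiple of the baseline rate."""

    def __init__(self, baseline_rate=0.0017, window=10_000, factor=2.0):
        self.baseline_rate = baseline_rate
        self.factor = factor
        self.labels = deque(maxlen=window)   # rolling window of recent labels

    def update(self, label):
        self.labels.append(int(label))
        rate = sum(self.labels) / len(self.labels)
        return rate > self.factor * self.baseline_rate, rate
```

Feed it confirmed labels (chargebacks, analyst decisions) as they arrive; it responds as soon as the rate jumps, with no warm-up windows required.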
Build the alert baseline immediately after training. Not after drift is suspected. Do it after training. If you wait, you have already lost your clean reference point. Save it alongside the checkpoint file.
One window of early warning is real. Whether it is one week or one day depends on your scoring cadence. For most production fraud teams, the difference between a scheduled retrain and an emergency one is measured in exactly those units.
Three Things That Will Catch You Using This Concept Drift Early Warning System
The 3-window blind period. FIDIZ has no history to work with for the first 3 windows after deployment. You are monitoring with RWSS and RFR only during that time. Plan for it explicitly.
Soft activations will silence PSI_rules. If your best checkpoint arrives when τ ≥ 1.0 (which happens whenever early stopping fires before training is complete), rule activations cluster near 0.5 and PSI_rules returns noise. Check τ at your saved checkpoint. In this experiment τ was still 3.5–4.0 at convergence. That is why PSI_rules was silent throughout.
Retrain means re-audit. This system is a fraud model retraining trigger, not a retrain replacement. After retraining, the rules change. V14 may no longer dominate, or new features may have entered. The compliance sign-off from the previous model does not carry forward. Build the audit into the retrain process, not as a step after, but as the step that closes the loop.
Closing
Three articles. One feature kept appearing.
Guiding Neural Networks with Domain Rules: I ignored it. How a Neural Network Learned Its Own Fraud Rules: The gradient found it. Article 3: When it broke, the symbolic layer noticed before the output layer did.
The experiment has a specific, honest scope: FIDI Z-Score detects concept drift in 5 of 5 seeds, sometimes one window before F1, never after it, entirely without labels. For covariate drift it is blind. For prior drift it is late. Those are not caveats added at the end to soften the claim. They are findings that tell you exactly where to use this and where not to.
A neuro-symbolic model gives you two channels. The MLP is better at prediction. The symbolic layer is better at knowing when prediction is about to go wrong. They are not redundant. They are watching different aspects of the same problem.
The MLP compensates. The symbolic layer cannot. That is its weakness. In this experiment, it turned out to also be its earliest warning.
Disclosure
This article is based on independent experiments using publicly available data (Kaggle Credit Card Fraud dataset, CC-0 Public Domain) and open-source tools (PyTorch, scikit-learn). No proprietary datasets, company resources, or confidential information were used. The results and code are fully reproducible as described. The views and conclusions expressed here are my own and do not represent any employer or organisation.
References
[1] Dal Pozzolo, A. et al. (2015). Calibrating Probability with Undersampling for Unbalanced Classification. IEEE SSCI. Dataset: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud (CC-0)
[2] Alexander, E. P. (2026). Hybrid Neuro-Symbolic Fraud Detection: Guiding Neural Networks with Domain Rules. Towards Data Science. https://towardsdatascience.com/hybrid-neuro-symbolic-fraud-detection-guiding-neural-networks-with-domain-rules/
[3] Evans, R., & Grefenstette, E. (2018). Learning Explanatory Rules from Noisy Data. JAIR, 61, 1–64. https://arxiv.org/abs/1711.04574
[4] Wolfson, B., & Acar, E. (2024). Differentiable Inductive Logic Programming for Fraud Detection. arXiv:2410.21928. https://arxiv.org/abs/2410.21928
[5] Martins, J. L., Bravo, J., Gomes, A. S., Soares, C., & Bizarro, P. (2024). RIFF: Inducing Rules for Fraud Detection from Decision Trees. In RuleML+RR 2024. arXiv:2408.12989. https://arxiv.org/abs/2408.12989
[6] Xu, S., Walter, N. P., & Vreeken, J. (2024). Neuro-Symbolic Rule Lists. arXiv:2411.06428. https://arxiv.org/abs/2411.06428
[7] Paszke, A. et al. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. NeurIPS 32. https://pytorch.org
[8] Pedregosa, F. et al. (2011). Scikit-learn: Machine Learning in Python. JMLR, 12, 2825–2830. https://scikit-learn.org
[9] Gama, J. et al. (2014). A Survey on Concept Drift Adaptation. ACM Computing Surveys, 46(4). https://dl.acm.org/doi/10.1145/2523813
Code: https://github.com/Emmimal/neuro-symbolic-drift-detection
If you work with production models: what drift type worries you most? Concept drift where the patterns quietly change, covariate shift in your input features, or something else? I am curious what monitoring gaps people are actually running into in real deployments.