Governance Loop – Flow & Configuration
This document describes the step‑by‑step execution of the ARF Core Engine’s governance loop, the constants that control its behaviour, and the expected loss minimisation decision rule.
Canonical definitions of InfrastructureIntent, HealingIntent, RiskScore, and GovernanceLoop are in core_concepts.md.
Mathematical foundations of risk scoring and expected loss are detailed in mathematics.md.
Implementation status: The behaviour described here is implemented in the proprietary core engine. The engine is access‑controlled and available under outcome‑based pricing. This public specification describes what the engine does, not how it is implemented internally.
1. Overview
The governance loop processes an InfrastructureIntent and returns a HealingIntent containing:
- A recommended action (APPROVE, DENY, or ESCALATE)
- A calibrated risk score (RiskScore) with component decomposition
- A detailed justification
- Full metadata for traceability (epistemic uncertainty, business impact, forecasts, etc.)
The loop is deterministic for a given input, ensuring reproducibility and auditability.
2. Step‑by‑Step Execution
InfrastructureIntent
↓
Cost Estimation
↓
Policy Evaluation
↓
Risk Calculation (Bayesian fusion)
↓
Predictive Foresight (optional)
↓
Business Impact Estimation
↓
Epistemic Uncertainty (optional)
↓
Expected Loss Minimisation
↓
Decision Selection
↓
Semantic Memory Retrieval (optional)
↓
HealingIntent
Step 1 – Cost Estimation
cost_estimator.estimate_monthly_cost(intent) returns a projected monthly cost (if available).
Failures are logged but do not block evaluation.
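The non-blocking behaviour can be sketched as follows. This is a minimal illustration rather than the engine's code: `safe_estimate_cost`, `FailingEstimator`, and `FixedEstimator` are hypothetical names, and only the `estimate_monthly_cost(intent)` call is taken from the description above.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger("governance_loop_sketch")


def safe_estimate_cost(cost_estimator: Any, intent: Any) -> Optional[float]:
    """Return the projected monthly cost, or None if estimation fails."""
    try:
        return cost_estimator.estimate_monthly_cost(intent)
    except Exception:
        # Log and continue: a missing cost estimate must not block evaluation.
        logger.exception("Cost estimation failed; continuing without a cost")
        return None


class FailingEstimator:
    """Hypothetical estimator that always fails, for demonstration only."""

    def estimate_monthly_cost(self, intent: Any) -> float:
        raise RuntimeError("pricing backend unavailable")


class FixedEstimator:
    """Hypothetical estimator that returns a constant cost."""

    def estimate_monthly_cost(self, intent: Any) -> float:
        return 120.0
```

Downstream steps would then need to treat a `None` cost as "not available" rather than as zero.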
Step 2 – Policy Evaluation
policy_evaluator.evaluate(intent, policy_context) returns a list of policy violations (empty if none).
If any violations exist, the loop immediately sets the recommended action to DENY – policy rules are hard constraints that override all other factors.
Step 3 – Risk Calculation
risk_engine.calculate_risk(intent, cost_estimate, policy_violations) returns:
- risk_score – final Bayesian risk (posterior mean) in \([0,1]\)
- explanation – human‑readable justification
- contributions – dictionary with per‑component means (conjugate_mean, hyper_mean, hmc_prediction) and their weights
Posterior variance \(\text{Var}(p_c)\) is computed from the Beta parameters for the conjugate component.
The raw risk is smoothed using exponential decay before loss calculation (see Step 7.1).
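For reference, the mean and variance of a Beta distribution have standard closed forms. The sketch below evaluates them; the function name and the example parameters are illustrative and are not taken from the engine.

```python
from typing import Tuple


def beta_mean_and_variance(alpha: float, beta: float) -> Tuple[float, float]:
    """Closed-form mean and variance of a Beta(alpha, beta) distribution."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("Beta parameters must be positive")
    total = alpha + beta
    mean = alpha / total
    variance = (alpha * beta) / (total ** 2 * (total + 1))
    return mean, variance


# Illustrative parameters only: Beta(2, 8) has mean 0.2.
example_mean, example_variance = beta_mean_and_variance(2.0, 8.0)
```

The variance shrinks as the total count alpha + beta grows, so a component with more observed evidence contributes a smaller variance term.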
Step 4 – Predictive Foresight (Optional)
If a SimplePredictiveEngine is configured, it forecasts service health for the next few time steps.
Each forecast has a risk_level (low, medium, high, critical) and a confidence in \([0,1]\).
Risk levels are mapped to numeric values:
| Risk Level | Numeric Risk |
|---|---|
| low | 0.1 |
| medium | 0.4 |
| high | 0.7 |
| critical | 0.95 |
The weighted average of these numeric risks, using softmax over confidences as weights, is multiplied by the mean confidence to yield a predictive_risk score.
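The calculation just described can be sketched as below. This is an illustration under stated assumptions: forecasts are represented as hypothetical `(risk_level, confidence)` tuples, and returning 0.0 for an empty forecast list is an assumption, not documented behaviour.

```python
import math
from typing import List, Tuple

# Mapping taken from the table above.
RISK_LEVEL_VALUES = {"low": 0.1, "medium": 0.4, "high": 0.7, "critical": 0.95}


def predictive_risk(forecasts: List[Tuple[str, float]]) -> float:
    """Softmax-weighted mean of numeric risks, scaled by the mean confidence."""
    if not forecasts:
        return 0.0  # assumption: no forecasts means no predictive risk

    confidences = [confidence for _, confidence in forecasts]

    # Softmax over confidences (shifted by the max for numerical stability).
    max_conf = max(confidences)
    exps = [math.exp(c - max_conf) for c in confidences]
    total = sum(exps)
    weights = [e / total for e in exps]

    weighted_risk = sum(
        w * RISK_LEVEL_VALUES[level] for w, (level, _) in zip(weights, forecasts)
    )
    mean_confidence = sum(confidences) / len(confidences)
    return weighted_risk * mean_confidence
```

With a single forecast the softmax weight is 1, so the result reduces to the numeric risk multiplied by that forecast's confidence.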
Step 5 – Business Impact
Using the BusinessImpactCalculator, the loop estimates potential revenue loss \(b_{\text{mean}}\) based on current service telemetry (latency, error rate, throughput).
Step 6 – Epistemic Uncertainty (Optional)
If enable_epistemic is True, the loop computes a composite epistemic uncertainty score \(\psi_{\text{mean}}\) as:
where:
- hallucination_risk is obtained from an ECLIPSE probe (if available) using entropy, evidence_lift, and contradiction signals.
- forecast_uncertainty \(= 1 - \text{mean}(\text{confidence})\) of the forecasts.
- data_sparsity \(= \exp(-0.05 \cdot \text{len}(\text{history}))\) – decays exponentially with more historical data.
If any component is unavailable, it defaults to 0.
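Two of the components have explicit formulas and can be sketched directly. The function names and the use of `None` or an empty sequence to signal "unavailable" are assumptions for illustration; the sketch covers only the individual components, not how they are combined into the composite score.

```python
import math
from typing import Optional, Sequence


def forecast_uncertainty(confidences: Sequence[float]) -> float:
    """1 - mean(confidence); defaults to 0 when no forecasts are available."""
    if not confidences:
        return 0.0
    return 1.0 - sum(confidences) / len(confidences)


def data_sparsity(history: Optional[Sequence[object]]) -> float:
    """exp(-0.05 * len(history)); defaults to 0 when history is unavailable."""
    if history is None:
        return 0.0
    return math.exp(-0.05 * len(history))
```

Note the difference between an unavailable history (`None`, which defaults to 0) and an available but empty one, which evaluates to exp(0) = 1, the maximum sparsity.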
Step 7 – Expected Loss Calculation
For each possible action \(a \in \{\text{approve}, \text{deny}, \text{escalate}\}\), the loop computes an expected loss using configurable cost constants.
7.1 Risk Decay (Exponential Smoothing)
To avoid single‑session outliers dominating decisions, the raw risk score \(R_t\) is smoothed using a per‑component exponential moving average:
\[\tilde{R}_t = \gamma\,\tilde{R}_{t-1} + (1-\gamma)\,R_t\]
with default \(\gamma = 0.9\) (configurable via RISK_DECAY_FACTOR). For the first evaluation of a component, \(\tilde{R}_0 = R_0\).
The decayed risk \(\tilde{R}\) is used in the loss formulas below.
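A minimal sketch of the smoothing step follows. It assumes the conventional moving-average form in which \(\gamma\) weights the previous smoothed value and \(1-\gamma\) weights the new raw score, which is the reading consistent with damping outliers at \(\gamma = 0.9\). `RiskSmoother` and keying the state by a component name string are hypothetical choices.

```python
from typing import Dict


class RiskSmoother:
    """Per-component exponential moving average of raw risk scores."""

    def __init__(self, gamma: float = 0.9) -> None:
        if not 0.0 <= gamma <= 1.0:
            raise ValueError("gamma must lie in [0, 1]")
        self.gamma = gamma
        self._state: Dict[str, float] = {}

    def update(self, component: str, raw_risk: float) -> float:
        previous = self._state.get(component)
        if previous is None:
            # First evaluation of this component: smoothed value = raw value.
            smoothed = raw_risk
        else:
            # Assumed form: gamma on history, (1 - gamma) on the new score.
            smoothed = self.gamma * previous + (1.0 - self.gamma) * raw_risk
        self._state[component] = smoothed
        return smoothed


smoother = RiskSmoother()
first = smoother.update("db", 0.2)   # first call returns the raw score
second = smoother.update("db", 1.0)  # an outlier is strongly damped
```

Under this assumed form, a single raw score of 1.0 after a history at 0.2 moves the smoothed risk only to 0.28.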
7.2 Expected Loss Formulas
where:
- \(\tilde{R}\) = decayed risk (Step 7.1)
- \(\text{Var}(p_c)\) = posterior variance of the conjugate Beta distribution
- \(\psi_{\text{mean}}\) = epistemic uncertainty composite (Step 6)
- \(b_{\text{mean}}\) = estimated revenue loss from business impact
- \(v_{\text{mean}}\) = estimated opportunity value of the action (optional, default 0)
- \(\text{COST\_*}\) are configuration constants (see Section 3)
7.3 Enterprise Extension: CVaR for Tail Risk
When USE_CVAR is True (default: False), the approve loss is computed using Conditional Value at Risk (CVaR) instead of expectation. Posterior samples of the failure probability \(\tilde{p} \sim \text{Beta}(\alpha_c, \beta_c)\) are drawn, and the average of the worst \(\alpha\) fraction (typically \(\alpha = 0.05\)) of per‑sample losses is used:
This feature is disabled by default in the core engine and may be enabled by enterprise customers.
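The sampling procedure can be sketched as a simple Monte Carlo estimate. This is an illustration only: `cvar_approve_loss`, the sample count, the fixed seed, and the per-sample loss function passed in are all assumptions; the engine's actual per-sample loss is not specified here.

```python
import math
import random
from typing import Callable


def cvar_approve_loss(
    alpha_c: float,
    beta_c: float,
    loss_fn: Callable[[float], float],
    tail_fraction: float = 0.05,
    n_samples: int = 10_000,
    seed: int = 0,
) -> float:
    """Mean of the worst `tail_fraction` of per-sample losses."""
    rng = random.Random(seed)  # seeded so repeated calls give the same result
    # Draw failure probabilities from the Beta posterior and score each one.
    losses = sorted(
        (loss_fn(rng.betavariate(alpha_c, beta_c)) for _ in range(n_samples)),
        reverse=True,
    )
    tail_size = max(1, math.ceil(tail_fraction * n_samples))
    tail = losses[:tail_size]
    return sum(tail) / len(tail)


# Hypothetical per-sample loss: a false-positive cost of 10 times p.
approve_cvar = cvar_approve_loss(2.0, 8.0, lambda p: 10.0 * p)
```

Because only the highest losses are averaged, the CVaR value is never below the plain expectation, which makes approval more conservative when the posterior has a heavy upper tail. Seeding the generator is one way to keep the result reproducible for a given input.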
Step 8 – Decision Selection
The recommended action \(a^*\) is chosen in the following order of precedence:
- If any policy violation exists: \(a^* = \text{DENY}\) (hard constraint)
- Else if USE_EPISTEMIC_GATE is True and \(\psi_{\text{mean}} > \text{EPISTEMIC\_ESCALATION\_THRESHOLD}\): \(a^* = \text{ESCALATE}\)
- Else: \(a^* = \arg\min_{a} L_a\)
By default, USE_EPISTEMIC_GATE = True and EPISTEMIC_ESCALATION_THRESHOLD = 0.5.
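The precedence rules above can be sketched as a small function. The function name and argument shapes are hypothetical; tie-breaking between equal losses is not specified in this document, so the sketch simply returns the first minimum in iteration order.

```python
from typing import Mapping, Sequence


def select_action(
    policy_violations: Sequence[str],
    psi_mean: float,
    expected_losses: Mapping[str, float],
    use_epistemic_gate: bool = True,
    epistemic_escalation_threshold: float = 0.5,
) -> str:
    """Apply the three precedence rules in order."""
    # 1. Policy violations are hard constraints.
    if policy_violations:
        return "DENY"
    # 2. High epistemic uncertainty forces escalation when the gate is on.
    if use_epistemic_gate and psi_mean > epistemic_escalation_threshold:
        return "ESCALATE"
    # 3. Otherwise pick the action with the smallest expected loss.
    return min(expected_losses, key=lambda action: expected_losses[action])


losses = {"APPROVE": 18.2, "DENY": 12.7, "ESCALATE": 11.5}
chosen = select_action([], 0.45, losses)
```

With the losses from the example output in Section 4 and an uncertainty of 0.45 (below the 0.5 threshold), the gate does not fire and the minimum-loss rule selects ESCALATE.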
Step 9 – Semantic Memory Retrieval (Optional)
If a RAGGraphMemory is available, the loop retrieves similar past incidents and adds them to the metadata for contextual explanation.
Step 10 – Build the HealingIntent
All data (risk factors, expected losses, epistemic breakdown, forecasts, business impact, etc.) are stored in the metadata field.
The risk_factors field contains additive contributions from each Bayesian component (conjugate, hyperprior, HMC), using the weights returned by the risk engine, providing full explainability of the final risk score.
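One way to read "additive contributions" is weight times component mean, so that the factors sum to the final risk score. The sketch below assumes that reading, and the dictionary layout and the example means and weights are hypothetical (chosen so the result matches the factors shown in the example output in Section 4).

```python
from typing import Dict, Mapping


def additive_risk_factors(
    means: Mapping[str, float], weights: Mapping[str, float]
) -> Dict[str, float]:
    """Per-component contribution = weight * component mean."""
    return {name: weights[name] * means[name] for name in means}


# Hypothetical component means and weights (weights sum to 1).
example_means = {"conjugate": 0.44, "hmc": 0.32}
example_weights = {"conjugate": 0.5, "hmc": 0.5}

factors = additive_risk_factors(example_means, example_weights)
total_risk = sum(factors.values())
```

Under these assumptions the factors come out at 0.22 and 0.16, which sum to a risk score of 0.38.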
3. Configuration Constants
The following constants control the behaviour of expected loss minimisation and epistemic gating. Values may be tuned per deployment; those shown below are the defaults implemented in the proprietary core engine (as of the current release) and are provided for transparency.
| Constant | Description | Default Value |
|---|---|---|
| COST_FP | Cost of a false positive (approving a risky action) | 10.0 |
| COST_FN | Cost of a false negative (denying a safe action) | 8.0 |
| COST_IMPACT | Weight for business impact in approve loss | 5.0 |
| COST_PREDICTIVE | Weight for predictive risk | 2.0 |
| COST_VARIANCE | Weight for posterior variance | 5.0 |
| COST_OPP | Opportunity cost weight (used in deny loss) | 3.0 |
| COST_REVIEW | Cost of human review (escalate loss) | 2.0 |
| COST_UNCERTAINTY | Weight for epistemic uncertainty in escalate loss | 4.0 |
| EPISTEMIC_ESCALATION_THRESHOLD | If \(\psi_{\text{mean}}\) exceeds this, force escalate | 0.5 |
| USE_EPISTEMIC_GATE | Boolean to enable/disable epistemic‑based escalation | True |
| RISK_DECAY_FACTOR (\(\gamma\)) | Exponential smoothing factor for risk | 0.9 |
| USE_CVAR | Enable CVaR for approve loss | False |
| CVAR_ALPHA | Tail probability for CVaR | 0.05 |
Note: Enterprise customers may override these values via configuration to align with their business risk tolerance.
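For readers who want to experiment with these defaults, they can be captured in a small immutable container. `GovernanceConfig` is a hypothetical name, and the actual override mechanism used by the engine is not documented here; `dataclasses.replace` is used only to illustrate a per-deployment override.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GovernanceConfig:
    """Defaults copied from the table above (illustrative container only)."""

    COST_FP: float = 10.0
    COST_FN: float = 8.0
    COST_IMPACT: float = 5.0
    COST_PREDICTIVE: float = 2.0
    COST_VARIANCE: float = 5.0
    COST_OPP: float = 3.0
    COST_REVIEW: float = 2.0
    COST_UNCERTAINTY: float = 4.0
    EPISTEMIC_ESCALATION_THRESHOLD: float = 0.5
    USE_EPISTEMIC_GATE: bool = True
    RISK_DECAY_FACTOR: float = 0.9
    USE_CVAR: bool = False
    CVAR_ALPHA: float = 0.05


DEFAULT_CONFIG = GovernanceConfig()

# Example override: a more risk-averse deployment that enables CVaR and
# escalates at a lower epistemic uncertainty.
STRICT_CONFIG = replace(
    DEFAULT_CONFIG, USE_CVAR=True, EPISTEMIC_ESCALATION_THRESHOLD=0.3
)
```

Freezing the dataclass keeps a configuration fixed for the duration of an evaluation, which fits the reproducibility goal stated in the overview.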
4. Example Output (HealingIntent)
{
  "action": "ESCALATE",
  "justification": "Risk score 0.38, epistemic uncertainty 0.45, expected losses: approve=18.2, deny=12.7, escalate=11.5",
  "risk_score": 0.38,
  "risk_factors": {
    "conjugate": 0.22,
    "hmc": 0.16
  },
  "metadata": {
    "predictive_risk": 0.25,
    "epistemic_breakdown": {
      "hallucination": 0.1,
      "forecast_uncertainty": 0.3,
      "data_sparsity": 0.2
    },
    "decision_trace": {
      "expected_losses": { "APPROVE": 18.2, "DENY": 12.7, "ESCALATE": 11.5 },
      "selected_action": "ESCALATE"
    }
  }
}
5. Calibration Diagnostics (Proprietary)
The core engine maintains internal calibration metrics (reliability diagrams, Expected Calibration Error) to validate that the risk score can be interpreted as a calibrated probability. These diagnostics are used for model monitoring and are not exposed in the public API.
6. Related Code (Proprietary)
The governance loop is implemented in the proprietary core engine at: agentic_reliability_framework/core/governance/governance_loop.py
Configuration constants are defined in: agentic_reliability_framework/core/config/constants.py
These files are not public, but the behaviour described here matches their implementation.
7. See Also
- core_concepts.md – canonical definitions of intents, risk score, and execution ladder
- mathematics.md – Bayesian risk scoring and expected loss minimisation formulas
- design.md – architectural decisions and trade‑offs