OpenSARA

An open standard proposal for quantifying Agentic AI risk. Adapted from ISO 26262 to move beyond "Safety" into Business Integrity.

About Project OpenSARA

GenAI chatbots produce probabilistic outcomes, and companies manage that uncertainty through direct supervision. But once an AI agent acts on its own, oversight disappears and organizations are suddenly exposed to unforeseen financial, legal, and reputational risks. Therefore, we propose a Systematic Agent Risk Assessment (SARA) to quantify those risks before deployment.

S/C/A Variables

Step 1: Review your AI application against the following three variables

Severity (S)

What's the worst-case impact of an agent failure?

  • S1 Minor: <$1k loss, internal confusion.
  • S2 Major: Brand damage, GDPR violation, >$10k loss.
  • S3 Critical: Solvency risk, safety threat.

Certainty (C)

How grounded is the agent's output?

  • C1 Deterministic: Fixed path / Strict RAG.
  • C2 Bounded: Constrained tools / options.
  • C3 Unbounded: Open-ended / Creative.

Autonomy (A)

What checks and balances affect the agent?

  • A0 HITL: Human must approve execution.
  • A1 Slow: Reversible within hours.
  • A2 Auto: None. Fire-and-forget.

ABIL Determination Matrix

Step 2: Use your choices to identify the required Agentic Business Integrity Level (ABIL)

Severity Level (S)   Certainty Level (C)   Autonomy Level (A)
                                           A0 Human Loop   A1 Slow Loop   A2 Autonomous
S1 Minor             C1 Deterministic      QM              QM             QM
S1 Minor             C2 Bounded            QM              QM             QM
S1 Minor             C3 Unbounded          QM              QM             ABIL A
S2 Major             C1 Deterministic      ABIL A          ABIL B         ABIL B
S2 Major             C2 Bounded            ABIL A          ABIL B         ABIL C
S2 Major             C3 Unbounded          ABIL A          ABIL B         ABIL C
S3 Critical          C1 Deterministic      ABIL B          ABIL C         ABIL D
S3 Critical          C2 Bounded            ABIL C          ABIL C         ABIL D
S3 Critical          C3 Unbounded          ABIL C          ABIL C         ABIL D

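
The matrix lookup can also be automated. The following is a minimal Python sketch, not part of the proposal itself: the names ABIL_MATRIX, AUTONOMY_COLUMNS, and determine_abil are hypothetical, and the cell values are copied from the matrix as published here. QM is presumably carried over from ISO 26262, where it stands for Quality Management (no requirements beyond ordinary quality processes).

```python
# Hypothetical encoding of the ABIL determination matrix.
# Keys are (severity, certainty); each row lists the level for A0, A1, A2.
ABIL_MATRIX = {
    ("S1", "C1"): ("QM", "QM", "QM"),
    ("S1", "C2"): ("QM", "QM", "QM"),
    ("S1", "C3"): ("QM", "QM", "ABIL A"),
    ("S2", "C1"): ("ABIL A", "ABIL B", "ABIL B"),
    ("S2", "C2"): ("ABIL A", "ABIL B", "ABIL C"),
    ("S2", "C3"): ("ABIL A", "ABIL B", "ABIL C"),
    ("S3", "C1"): ("ABIL B", "ABIL C", "ABIL D"),
    ("S3", "C2"): ("ABIL C", "ABIL C", "ABIL D"),
    ("S3", "C3"): ("ABIL C", "ABIL C", "ABIL D"),
}

# Column order of the autonomy levels within each matrix row.
AUTONOMY_COLUMNS = ("A0", "A1", "A2")


def determine_abil(severity: str, certainty: str, autonomy: str) -> str:
    """Return the required level for one S/C/A combination."""
    try:
        row = ABIL_MATRIX[(severity, certainty)]
        column = AUTONOMY_COLUMNS.index(autonomy)
    except (KeyError, ValueError):
        # Unknown codes are rejected rather than silently mapped to QM.
        raise ValueError(
            f"unknown S/C/A combination: {severity}, {certainty}, {autonomy}"
        ) from None
    return row[column]
```

For example, determine_abil("S2", "C2", "A2") returns "ABIL C", matching the S2 / C2 Bounded row under A2 Autonomous.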
ABIL A
Low Integrity Risk

A marketing agent generating draft social media posts for human review (S2 Severity, C3 Certainty, A0 Autonomy) results in ABIL-A requirements. Possible mitigation could include constitutional prompting (guiding principles) to ensure the content is appropriate before the human clicks "approve."

ABIL B
Medium Integrity Risk

A procurement bot drafting purchase orders with a 2-hour delay before execution (S2 Severity, C2 Certainty, A1 Autonomy) suggests ABIL-B requirements. Possible mitigation could include dual-verification where a secondary "Critic" model reviews the order for hallucinations during the holding period.

ABIL C
High Integrity Risk

A customer service bot empowered to instantly apply credits to a live billing system (S2 Severity, C2 Certainty, A2 Autonomy) results in ABIL-C risk. Possible mitigation could include using hard-coded Python logic gates (e.g., if credit > creditLimit: abort) rather than relying on the model's own judgment.
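
A hard-coded gate of this kind might look like the sketch below. It is illustrative only: CREDIT_LIMIT, CreditRejected, and approve_credit are hypothetical names, and the 50.00 ceiling is an assumed value, not a recommendation. The point is that the check runs outside the model, so no prompt can change its outcome.

```python
from decimal import Decimal

# Assumed per-request ceiling; a real deployment would load this from policy.
CREDIT_LIMIT = Decimal("50.00")


class CreditRejected(Exception):
    """Raised when a requested credit fails the deterministic gate."""


def approve_credit(requested: Decimal, credit_limit: Decimal = CREDIT_LIMIT) -> Decimal:
    """Return the amount unchanged if it passes the gate, else raise."""
    # Reject zero or negative amounts as malformed requests.
    if requested <= 0:
        raise CreditRejected(f"credit must be positive, got {requested}")
    # The core gate: abort anything above the fixed limit.
    if requested > credit_limit:
        raise CreditRejected(f"credit {requested} exceeds limit {credit_limit}")
    return requested
```

The agent proposes an amount, and only the value returned by approve_credit is passed to the billing system; a rejected request can be routed to a human instead.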

ABIL D
Critical Integrity Risk

An autonomous agent managing load balancing for a regional power grid (S3 Severity, C2 Certainty, A2 Autonomy) produces an ABIL-D risk assessment. Possible mitigations could include operating the agent in a fully isolated sandbox where every command is mathematically verified against process constraints before touching critical infrastructure.

Test Examples

Had the following chatbot failures been vetted against OpenSARA, the risks would have been flagged before deployment. Exposure to such damages is likely to grow as companies adopt agentic AI.

The Air Canada Incident

Failure Analysis

Air Canada's chatbot hallucinated a refund policy, and a tribunal held the airline liable for the bot's statement.

S2 Major
A2 Auto
C2 Bounded
OpenSARA Score: ABIL C
Requires deterministic logic check. "If policy inactive, block response."

The $1 Chevy Tahoe

Failure Analysis

A dealership chatbot was tricked via prompt injection into agreeing to sell a car for $1.00.

S2 Major
A2 Auto
C3 Unbounded
OpenSARA Score: ABIL C
Requires non-LLM Price Floor Check. "If price < MSRP * (1 - maxDiscount), reject."
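
A non-LLM price floor check could be sketched as follows. This assumes max_discount is the largest allowed fractional discount (0.10 for 10%), so the floor is MSRP * (1 - max_discount). The function names and the figures in the usage note are hypothetical.

```python
from decimal import Decimal


def price_floor(msrp: Decimal, max_discount: Decimal) -> Decimal:
    """Lowest acceptable price, given a fractional maximum discount."""
    # A discount outside [0, 1] is a configuration error, not a negotiation.
    if not Decimal("0") <= max_discount <= Decimal("1"):
        raise ValueError(f"max_discount must be between 0 and 1, got {max_discount}")
    return msrp * (Decimal("1") - max_discount)


def accept_offer(price: Decimal, msrp: Decimal, max_discount: Decimal) -> bool:
    """Deterministic gate: True only if the offer is at or above the floor."""
    return price >= price_floor(msrp, max_discount)
```

With an assumed MSRP of 58000 and a 10% maximum discount, accept_offer(Decimal("1.00"), Decimal("58000"), Decimal("0.10")) returns False, so the $1.00 offer is rejected regardless of what the model agreed to in conversation.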