Single Point of Failure (SPOF) Audit

Client Self-Assessment

Purpose

This audit identifies where delivery, reliability, or growth depends too heavily on a single person, decision path, or system. Single points of failure increase execution risk, slow response under pressure, and make scaling fragile.

What this is

A fast, practical assessment across Policy, People, and Technology that highlights risk concentration and prioritizes high-leverage fixes.

What this is not

Not a compliance exercise.

Not a tool inventory.

Not a re-organization.


How to Use This Audit

  • Complete the checklist (15–30 minutes)
  • Score each section
  • Review red flags
  • Prioritize fixes, starting with the highest-scoring section

1. Policy (Governance and Decision Flow)

Check all that apply

Key decisions depend on one person being available

Decision authority is implicit or undocumented

Objections can block progress without alternatives or deadlines

Client commitments exceed internal capacity controls

Healthy signals

  • Clear decision ownership
  • Written escalation paths
  • Time-bound objections
  • Defaults that allow work to proceed

Red flag

If one absence causes decision paralysis, you have a policy single point of failure.

2. People (Knowledge, Ownership, Capacity)

Check all that apply

Only one person can deploy or fix production

One person owns a client relationship end-to-end

Critical knowledge lives in people, not documentation

Senior staff routinely step in to unblock work

Healthy signals

  • At least two owners per critical responsibility
  • Clear primary and secondary ownership
  • Runbooks for recurring work
  • Predictable handoffs

Red flag

“Ask them, they’re the only one who knows” indicates a people single point of failure.

3. Technology (Systems and Infrastructure)

Check all that apply

One admin account or credential controls production

No tested rollback or recovery path

Infrastructure changes are manual

Monitoring alerts go to one person

Healthy signals

  • Infrastructure defined as code
  • Centralized access and secrets
  • Tested recovery paths
  • Group-based alerting

Red flag

“Don’t touch that system” indicates a technology single point of failure.

SPOF Risk Scoring

Score each area from 0 to 2

  • 0 = No meaningful risk
  • 1 = Partial or emerging risk
  • 2 = Clear single point of failure

Record your scores

  • Policy:
  • People:
  • Technology:

Interpretation (sum of the three scores, 0–6)

  • 0–2 → Healthy
  • 3–4 → Latent risk (address proactively)
  • 5–6 → Active execution risk (address immediately)
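
For teams that track the audit in a script or spreadsheet export, the scoring can be sketched in a few lines of Python. This is a minimal illustration, not part of the audit itself: it assumes the interpretation bands apply to the sum of the three area scores, and the names interpret_spof and AREAS, along with the example scores, are hypothetical.

```python
# Illustrative sketch only: AREAS and interpret_spof are hypothetical names.
AREAS = ("Policy", "People", "Technology")


def interpret_spof(scores):
    """Return (total, band, fix_order) for per-area scores of 0, 1, or 2."""
    for area in AREAS:
        if scores.get(area) not in (0, 1, 2):
            raise ValueError(f"{area} must be scored 0, 1, or 2")

    # Assumption: the bands are read against the sum of the three scores (0-6).
    total = sum(scores[area] for area in AREAS)
    if total <= 2:
        band = "Healthy"
    elif total <= 4:
        band = "Latent risk (address proactively)"
    else:
        band = "Active execution risk (address immediately)"

    # Highest-scoring area first; ties keep the Policy, People, Technology order.
    fix_order = sorted(AREAS, key=lambda area: scores[area], reverse=True)
    return total, band, fix_order


# Made-up example scores, for illustration only.
total, band, fix_order = interpret_spof(
    {"Policy": 1, "People": 2, "Technology": 1}
)
print(total, band, fix_order)
```

The ordering in fix_order is only a starting point for prioritization; the red flags in each section still deserve a look even when an area scores low.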

What to Fix First (80/20 Guidance)

Start with changes that:

  • Reduce dependency on individuals
  • Clarify decision authority
  • Make recovery boring and repeatable

Common high-leverage fixes:

  • Add a secondary owner
  • Write a one-page runbook
  • Introduce a default decision rule
  • Route alerts to a group
  • Automate one manual step

Write an Executive Summary (Optional)

Our highest execution risk comes from [Policy / People / Technology], specifically [X]. This creates fragility during normal operations and significant risk under stress. Addressing [Y] will materially reduce dependency on individuals and restore predictable execution.

Why this matters

Single points of failure rarely show up during calm periods. They surface under pressure: during incidents, growth, or key absences. This audit helps remove hidden fragility before it turns into outages, missed deadlines, or burnout.

Next step

Use this audit as a baseline and re-run it quarterly or after major organizational or technical changes.

Making a decision?

If you're facing a high-stakes decision and want to reduce execution risk before commitments are locked, we can help.

Even when commitments are already in place, we can still help you assess risk, regain control, and stabilize execution if outcomes aren't matching expectations.