Key-Man Risk in Manufacturing Is a Decision Architecture Problem

The VP of Operations at a mid-market manufacturer goes on medical leave with two weeks' notice. Her replacement, an internal promotion who is competent and experienced, takes the role Monday morning. By Wednesday he is asking the procurement team why Supplier A is still on hold. Nobody knows. The VP dealt with a quality incident with that supplier eight months ago, put them on hold, and managed the relationship herself ever since. The hold is in the ERP. The reason is in her head.

This is key-man risk in manufacturing. Not the HR concept — the operational one. The dependency on a single person not just for their skills and judgment, but for the accumulated context of every decision they made that the organisation never recorded.

Key-man risk in operations is the vulnerability that arises when the institutional knowledge required to run a process lives primarily in one person — not in documents, not in systems, not in a record that a replacement can access. In mid-market manufacturing, this is less an edge case than a default state. The VP of Operations knows which supplier agreement has a non-standard payment term that procurement can't see in the ERP. The plant manager knows why the Thursday night escalation path bypasses the standard routing. The procurement lead knows which cost variance last quarter was approved informally and never documented.

When that person is unavailable, the gap is immediate. It shows up as slower decisions, uninformed overrides, and margin erosion that no one can trace to a root cause because the root cause is invisible.

What operations teams actually lose when a key person leaves

The standard diagnosis of key-man risk focuses on skills: who else can do this job? But in manufacturing operations, the skills gap is usually smaller than the context gap. A competent replacement can manage inventory. They cannot manage it with the same context as someone who handled the stockout crisis in Q3, negotiated the current safety stock levels, and approved the transfer rule that governs how stock moves between the two plants.

What the replacement doesn't know

Why Supplier A is on hold — and whether the hold should be lifted now. Which cost variance last quarter was intentional. What the Thursday escalation means and when to invoke it. Why inventory at Plant 2 runs higher than the model suggests it should. Who approved the last emergency freight request and what threshold they used.

None of this is in the SOP. It accumulated through decisions — decisions that were made informally, approved verbally, and never logged anywhere the organisation can find.

Succession planning doesn't solve this

Here is the contrarian point: most key-man risk mitigation in operations focuses on people redundancy. Cross-train a backup. Document the SOPs. Build a transition plan. These are useful. They don't address the core vulnerability.

SOPs document how a process should work. They do not document how it actually worked — every exception, override, and edge case, with a named owner and a timestamp. Cross-training transfers the skill to perform a role. It does not transfer the decision history that contextualises the role. A two-week handover covers the major workflows. It does not cover why Supplier A is on hold.

The replacement VP of Operations will know how to do the job. They will not know what the previous VP decided, and why, for the 18 months before they arrived. That gap — the invisible decision history — is where key-man risk actually lives.

Scenario	Knowledge in one person	Knowledge in decision infrastructure
Why Supplier A is on hold	Ask the ops lead — if available	Decision record: override, named owner, date, reason
Who approved last quarter's cost variance	Check email threads, maybe WhatsApp	Named approver, timestamp, BC write — one lookup
New hire context	3–6 months of shadowing the predecessor	Full decision history accessible from day one
Regulatory audit	Reconstruction exercise — days of work	Traceable to named owner, already assembled
Ops lead on leave	Informal decisions without context; operations slow	Routing continues to named backup; decision record intact

What a decision record actually contains

A decision record is not a log file. It is the traceable chain from a signal to the outcome that resolved it — every step documented automatically in the course of normal operations.

In OpsGrid's case, every entry in the decision record traces: what Business Central (BC) data crossed what threshold and when; which decision owner received the signal and under which routing rule; whether they approved, rejected, or overrode the recommendation; what action was written back to BC; and whether the signal resolved or escalated. The complete chain is assembled without any documentation effort from the people involved.
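To make that chain concrete, here is a minimal Python sketch of what a single entry could look like. It is an illustration only, not OpsGrid's actual schema: the class, every field name, the routing rule label, and the year in the example date are hypothetical, chosen to mirror the elements listed above.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Choice(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"
    OVERRIDDEN = "overridden"


class Outcome(Enum):
    RESOLVED = "resolved"
    ESCALATED = "escalated"


@dataclass(frozen=True)
class DecisionRecord:
    """One signal-to-outcome chain. All field names are hypothetical."""
    signal: str              # what BC data crossed what threshold
    triggered_at: datetime   # when the threshold was crossed
    owner: str               # decision owner who received the signal
    routing_rule: str        # rule that routed the signal to that owner
    choice: Choice           # approved, rejected, or overridden
    reason: str              # why the owner chose as they did
    bc_action: str           # action written back to Business Central
    outcome: Outcome         # did the signal resolve or escalate?

    def summary(self) -> str:
        """Render the chain as one readable line for a new hire."""
        return (
            f"{self.triggered_at:%Y-%m-%d}: {self.owner} {self.choice.value} "
            f"'{self.signal}' ({self.reason}); BC action: {self.bc_action}; "
            f"{self.outcome.value}"
        )


# Illustrative entry; the year and rule name are invented for the example.
record = DecisionRecord(
    signal="Supplier A batch failed quality inspection",
    triggered_at=datetime(2024, 10, 14, 9, 30),
    owner="VP of Operations",
    routing_rule="supplier-quality",
    choice=Choice.APPROVED,
    reason="quality failure on a specific batch",
    bc_action="vendor blocked",
    outcome=Outcome.RESOLVED,
)
print(record.summary())
```

The record is frozen (immutable) in this sketch because an audit trail is only useful if entries cannot be edited after the fact; a follow-up decision would be a new entry rather than a change to an old one.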

The insight worth repeating

Your ops team doesn't have a knowledge problem. They have a record problem. The knowledge exists — in the heads of the people who made the decisions. The gap is that it was never captured in a form the organisation can access when those people aren't available.

This distinction matters because it changes the intervention. A knowledge problem is solved by training more people. A record problem is solved by capturing decisions as they happen — not as a documentation exercise, but as a structural property of how operations are governed.

What this means for operations leaders

The COO or Plant Head who understands this distinction asks a different set of questions. Not "who else can do this job?" but "what does our operation know about itself, and where does that knowledge live?"

If the answer is "in the heads of three people," the operational continuity risk is already present. The question is how to convert it from a people dependency into an infrastructure property.

The practical intervention: govern decisions through a system that records them. Not a CRM, not a ticketing system, not a notes field in the ERP. A system where the decision — who saw the signal, who owned it, what they chose, what executed — is the primary output. The audit trail is not an afterthought. It is the point.

Operations teams that run on Dynamics 365 Business Central and deploy OpsGrid build this record automatically. For every inventory signal, every purchase order override, and every supplier hold, the decision chain is assembled in Teams, written back to BC, and logged. The VP of Operations who joins on Monday has access to 18 months of operational decisions by Tuesday morning. Not a summary. The actual record.

Supplier A is on hold because of a quality failure on a specific batch in October. The hold was approved by the previous VP of Operations on October 14th. The batch number is in the record. The vendor's response is in the record. The follow-up decision — to extend the hold pending a new quality audit — is in the record. The replacement VP of Operations can read all of this before they pick up the phone.
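The lookup a replacement would run against such a record can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not OpsGrid's query interface: the tuple layout, the `history` helper, the years, the November date, and the Plant 2 entry are all hypothetical.

```python
from datetime import date

# Hypothetical, simplified records: (decided_on, subject, owner, decision).
RECORDS = [
    (date(2024, 11, 4), "Supplier A", "VP of Operations",
     "extend hold pending a new quality audit"),
    (date(2024, 10, 2), "Plant 2 safety stock", "Plant manager",
     "raise safety stock level"),
    (date(2024, 10, 14), "Supplier A", "VP of Operations",
     "place on hold after batch quality failure"),
]


def history(records, subject):
    """Return every decision about `subject`, oldest first."""
    matching = [r for r in records if r[1] == subject]
    return sorted(matching, key=lambda r: r[0])


for decided_on, _, owner, decision in history(RECORDS, "Supplier A"):
    print(f"{decided_on.isoformat()}  {owner}: {decision}")
```

Reading the entries oldest first reproduces the sequence above: the original hold, then the follow-up decision to extend it.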

That is what operational continuity looks like when decisions are governed, not just made.

Frequently asked questions

What does OpsGrid actually log for each decision?

For every operational signal, OpsGrid records: what BC data crossed what threshold and when; which decision owner received the signal; whether they approved, rejected, or overrode it; what action was written to Business Central; and whether the decision resolved the signal or escalated. The record traces the complete chain from signal to outcome, automatically.

How does a decision record differ from documenting SOPs?

SOPs document how a process should work. A decision record documents how it actually worked — every exception, override, and edge case, with a named owner and a timestamp. SOPs require maintenance effort and become stale. A decision record assembles itself in the course of normal operations and reflects what the team actually did, not what the manual says they should do.

How quickly does a new ops hire reach operational context with OpsGrid?

The decision record is available from day one. A new hire can understand why Supplier A is on hold, what the Thursday escalation path means, and what decisions shaped the current inventory position without months of shadowing the person they're replacing. The context that used to require a handover period is in the record.

Does OpsGrid replace succession planning?

No. Succession planning addresses who can perform a role. OpsGrid addresses what that role has decided — the institutional knowledge layer that succession planning doesn't transfer. Both matter. Cross-training transfers skills and judgment. OpsGrid transfers the decision history that contextualises both.

Key-man risk isn't a people problem. It's a decision architecture problem.