The VP of Operations at a mid-market manufacturer goes on medical leave with two weeks' notice. Her replacement, an internal promotion, competent and experienced, takes the role Monday morning. By Wednesday he is asking the procurement team why Supplier A is still on hold. Nobody knows. The ops lead dealt with a quality incident with that supplier eight months ago, put them on hold, and managed the relationship herself ever since. The hold is in the ERP. The reason is in her head.
This is key-man risk in manufacturing. Not the HR concept — the operational one. The dependency on a single person not just for their skills and judgment, but for the accumulated context of every decision they made that the organisation never recorded.
Key-man risk in operations is the vulnerability that arises when the institutional knowledge required to run a process lives primarily in one person — not in documents, not in systems, not in a record that a replacement can access. In mid-market manufacturing, this is less an edge case than a default state. The VP of Operations knows which supplier agreement has a non-standard payment term that procurement can't see in the ERP. The plant manager knows why the Thursday night escalation path bypasses the standard routing. The procurement lead knows which cost variance last quarter was approved informally and never documented.
When that person is unavailable, the gap is immediate. It shows up as slower decisions, uninformed overrides, and margin erosion that no one can trace to a root cause because the root cause is invisible.
What operations teams actually lose when a key person leaves
The standard diagnosis of key-man risk focuses on skills: who else can do this job? But in manufacturing operations, the skills gap is usually smaller than the context gap. A competent replacement can manage inventory. They cannot manage it with the same context as someone who handled the stockout crisis in Q3, negotiated the current safety stock levels, and approved the transfer rule that governs how stock moves between the two plants.
None of this is in the SOP. It accumulated through decisions — decisions that were made informally, approved verbally, and never logged anywhere the organisation can find.
Succession planning doesn't solve this
Here is the contrarian point: most key-man risk mitigation in operations focuses on people redundancy. Cross-train a backup. Document the SOPs. Build a transition plan. These are useful. They don't address the core vulnerability.
SOPs document how a process should work. They do not document how it actually worked — every exception, override, and edge case, with a named owner and a timestamp. Cross-training transfers the skill to perform a role. It does not transfer the decision history that contextualises the role. A two-week handover covers the major workflows. It does not cover why Supplier A is on hold.
The replacement VP of Operations will know how to do the job. They will not know what the previous VP decided, and why, for the 18 months before they arrived. That gap — the invisible decision history — is where key-man risk actually lives.
| Scenario | Knowledge in one person | Knowledge in decision infrastructure |
|---|---|---|
| Why Supplier A is on hold | Ask the ops lead — if available | Decision record: override, named owner, date, reason |
| Who approved last quarter's cost variance | Check email threads, maybe WhatsApp | Named approver, timestamp, Business Central write-back, all in one lookup |
| New hire context | 3–6 months of shadowing the predecessor | Full decision history accessible from day one |
| Regulatory audit | Reconstruction exercise — days of work | Traceable to named owner, already assembled |
| Ops lead on leave | Informal decisions without context; operations slow | Routing continues to named backup; decision record intact |
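The last row of the table, routing that continues to a named backup, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not OpsGrid's routing engine: the `RoutingRule` type, the `route` function, and the role names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingRule:
    """Hypothetical rule: who owns a signal type, and who backs them up."""
    signal_type: str
    owner: str
    backup: str


def route(signal_type: str, rules: dict[str, RoutingRule], on_leave: set[str]) -> str:
    """Return the named person who should receive the signal.

    Falls through to the named backup when the owner is unavailable.
    Raises KeyError if no rule exists, so unrouted signals fail loudly.
    """
    rule = rules[signal_type]
    return rule.backup if rule.owner in on_leave else rule.owner


# Illustrative rule set; role names are placeholders.
rules = {
    "supplier_hold": RoutingRule("supplier_hold", "vp_operations", "plant_manager"),
}

recipient = route("supplier_hold", rules, on_leave={"vp_operations"})
print(recipient)
```

The point of the sketch is that the backup is named in the rule itself, so availability changes who receives the signal but not whether the decision gets an owner.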
What a decision record actually contains
A decision record is not a log file. It is the traceable chain from a signal to the outcome that resolved it — every step documented automatically in the course of normal operations.
In OpsGrid's case, every entry in the decision record traces: what Business Central (BC) data crossed what threshold and when; which decision owner received the signal and under which routing rule; whether they approved, rejected, or overrode the recommendation; what action was written to BC; and whether the signal resolved or escalated. The complete chain, assembled without any documentation effort from the people involved.
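As a rough illustration, the chain described above maps naturally onto a small immutable record type. The sketch below is an assumption about shape only: the class and field names are hypothetical and do not reflect OpsGrid's actual schema, and the example values (including the year) are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Choice(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"
    OVERRODE = "overrode"


class Outcome(Enum):
    RESOLVED = "resolved"
    ESCALATED = "escalated"


@dataclass(frozen=True)
class DecisionRecord:
    """One signal-to-outcome chain. Field names are hypothetical."""
    signal: str            # what data crossed what threshold
    signal_at: datetime    # when the threshold was crossed
    owner: str             # named decision owner who received the signal
    routing_rule: str      # rule under which the signal reached the owner
    choice: Choice         # approved, rejected, or overrode the recommendation
    action_written: str    # action written back to the ERP
    outcome: Outcome       # whether the signal resolved or escalated

    def summary(self) -> str:
        """Render the chain as a single readable line."""
        return (
            f"{self.signal_at:%Y-%m-%d}: {self.owner} {self.choice.value} "
            f"'{self.signal}' via {self.routing_rule}; "
            f"action: {self.action_written}; outcome: {self.outcome.value}"
        )


# Placeholder values for illustration only.
record = DecisionRecord(
    signal="Supplier A batch failed incoming quality check",
    signal_at=datetime(2024, 10, 14, 9, 30),
    owner="previous VP of Operations",
    routing_rule="supplier-quality",
    choice=Choice.APPROVED,
    action_written="vendor placed on hold",
    outcome=Outcome.RESOLVED,
)
print(record.summary())
```

Making the record frozen reflects the idea that an entry is a statement of what happened: later decisions are appended as new records rather than edited into old ones.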
The distinction between a knowledge problem and a record problem matters because it changes the intervention. A knowledge problem is solved by training more people. A record problem is solved by capturing decisions as they happen: not as a documentation exercise, but as a structural property of how operations are governed.
What this means for operations leaders
The COO or Plant Head who understands this distinction asks a different set of questions. Not "who else can do this job?" but "what does our operation know about itself, and where does that knowledge live?"
If the answer is "in the heads of three people," the operational continuity risk is already present. The question is how to convert it from a people dependency into an infrastructure property.
The practical intervention: govern decisions through a system that records them. Not a CRM, not a ticketing system, not a notes field in the ERP. A system where the decision — who saw the signal, who owned it, what they chose, what executed — is the primary output. The audit trail is not an afterthought. It is the point.
Operations teams that run on Dynamics 365 Business Central (BC) and deploy OpsGrid build this record automatically. For every inventory signal, every purchase order override, and every supplier hold, the decision chain is assembled in Teams, written back to BC, and logged. The VP of Operations who joins on Monday has access to 18 months of operational decisions by Tuesday morning. Not a summary. The actual record.
Supplier A is on hold because of a quality failure on a specific batch in October. The hold was approved by the previous VP of Operations on October 14th. The batch number is in the record. The vendor's response is in the record. The follow-up decision — to extend the hold pending a new quality audit — is in the record. The replacement VP of Operations can read all of this before they pick up the phone.
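Reading that history back amounts to filtering the log by subject and ordering it by date. The following is a minimal sketch of that lookup, assuming a flat in-memory log; the `Entry` type, the `history_for` function, and every date other than October 14th are hypothetical placeholders, as is the year.

```python
from datetime import date
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical log entry: one decision about one subject."""
    subject: str
    decided_on: date
    owner: str
    decision: str


def history_for(subject: str, log: list[Entry]) -> list[Entry]:
    """Return every decision about a subject, oldest first."""
    return sorted(
        (entry for entry in log if entry.subject == subject),
        key=lambda entry: entry.decided_on,
    )


# Illustrative log, deliberately out of order.
log = [
    Entry("Supplier A", date(2024, 11, 20), "previous VP of Operations",
          "extend hold pending new quality audit"),
    Entry("Plant 2 safety stock", date(2024, 9, 3), "procurement lead",
          "raise safety stock level"),
    Entry("Supplier A", date(2024, 10, 14), "previous VP of Operations",
          "place on hold after batch quality failure"),
]

for entry in history_for("Supplier A", log):
    print(entry.decided_on.isoformat(), entry.owner, "->", entry.decision)
```

A replacement reading this output sees the original hold and its extension in sequence, with a named owner on each step.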
That is what operational continuity looks like when decisions are governed, not just made.
Frequently asked questions
What does OpsGrid actually log for each decision?
For every operational signal, OpsGrid records: what BC data crossed what threshold and when; which decision owner received the signal; whether they approved, rejected, or overrode it; what action was written to Business Central; and whether the decision resolved the signal or escalated. The record traces the complete chain from signal to outcome, automatically.
How does a decision record differ from documenting SOPs?
SOPs document how a process should work. A decision record documents how it actually worked — every exception, override, and edge case, with a named owner and a timestamp. SOPs require maintenance effort and become stale. A decision record assembles itself in the course of normal operations and reflects what the team actually did, not what the manual says they should do.
How quickly does a new ops hire reach operational context with OpsGrid?
The decision record is available from day one. A new hire can understand why Supplier A is on hold, what the Thursday escalation path means, and what decisions shaped the current inventory position without months of shadowing the person they're replacing. The context that used to require a handover period is in the record.
Does OpsGrid replace succession planning?
No. Succession planning addresses who can perform a role. OpsGrid addresses what that role has decided — the institutional knowledge layer that succession planning doesn't transfer. Both matter. Cross-training transfers skills and judgment. OpsGrid transfers the decision history that contextualises both.