What Really Happens During A Temperature Excursion
By Adam Nordby

Training and preparation are the two factors that matter most during an excursion event. The cascade of events that occurs when a unit goes out of range comes with a multitude of critical choices that require planning, perfect timing, and perfect execution.
The phone rings at 2:07 a.m. A pre-alarm has triggered on a walk-in cooler holding investigational product for three active studies. The monitoring system gives the on-call lead (you) a probe number and a temperature. Before you can reach for your keys, the system has already told most of the story. You open the monitoring software and check the graph; all four probes in the unit are trending in the same direction. The rate of climb is steep. The unit recovered briefly 40 minutes earlier and climbed again. This is not a door event. This is likely a mechanical failure.
Before you have even seen the unit, you call the refrigeration technician and then the next person in line for backup. Every second of delay compresses the window for product protection, and the technician’s drive time needs to start immediately, not when you get there. Either call can be canceled later if it turns out to be unnecessary. If you are going to move the contents of an entire refrigeration unit tonight, you likely cannot do that by yourself.
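The read you take from the graph (every probe rising, and rising steeply, points away from a door event) can be sketched in a few lines. This is an illustration only, not a feature of any particular monitoring system: the `ProbeTrend` type, the `triage` function, the 5-minute sampling interval, and the 0.1 °C per minute threshold are all assumptions chosen for the example. It also ignores the brief recovery and re-climb described above, which a real review of the graph would weigh.

```python
from dataclasses import dataclass


@dataclass
class ProbeTrend:
    probe_id: str
    readings_c: list[float]  # oldest first, evenly spaced samples


def slope_c_per_min(readings_c: list[float], interval_min: float) -> float:
    """Average rate of change between the first and last sample."""
    if len(readings_c) < 2:
        raise ValueError("need at least two readings to compute a slope")
    span_min = interval_min * (len(readings_c) - 1)
    return (readings_c[-1] - readings_c[0]) / span_min


def triage(trends: list[ProbeTrend],
           interval_min: float = 5.0,        # assumed sampling interval
           steep_c_per_min: float = 0.1) -> str:  # assumed "steep" threshold
    """Rough first read of an alarm from probe trends (hypothetical rule)."""
    if not trends:
        raise ValueError("no probe data")
    slopes = [slope_c_per_min(t.readings_c, interval_min) for t in trends]
    # Every probe climbing steeply suggests the whole unit is losing cooling.
    if all(s >= steep_c_per_min for s in slopes):
        return "suspect mechanical failure"
    # Only some probes climbing looks more like a door or localized event.
    if any(s > 0 for s in slopes):
        return "possible door or localized event"
    return "no upward trend"


unit = [ProbeTrend(f"P{i}", [4.0, 5.0, 6.2, 7.5]) for i in range(1, 5)]
print(triage(unit))  # suspect mechanical failure
```

A rule like this supports the judgment of the on-call lead; it does not replace it. The value is in agreeing on the thresholds before the phone rings.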
On-site, the easy diagnostics come first. Check the door and check whether the compressors are running. If those turn up nothing, sign in to the access log and go inside. Your team should be trained to assess airflow, fan operation, and obstructions quickly and decisively, with the door open for seconds, not minutes. Door closes. Sign out.
QA is notified. The threshold for that call is written into the SOP. There is no judgment call about whether to make it. The notification triggers a parallel track: The on-call team continues stabilizing the situation while QA begins assessing disposition risk based on the affected studies, the product stability profiles, and the cumulative time in excursion.
Operational Decision Points: Move, Hold, And QA Oversight
The hardest decision of the night arrives next: Move the product or hold and protect in place? That decision belongs to QA. There are a few items that drive this decision. How much usable capacity does the secondary unit have, accounting for airflow and not just shelf space? How long will the original unit remain out of range based on current trends and the technician’s arrival window? And is there enough trained staff on-site, right now, to relocate products without compromising chain-of-custody documentation?
If the product is moved, there is a risk that is easy to miss. Loading thousands of units of product into a new space can push the receiving unit out of range, and now there are two investigations instead of one. A good operations recommendation to QA, backed by proper planning and an understanding of what each of your units can handle, considers that risk explicitly rather than presenting the move as the obvious answer. In this scenario, QA authorizes a partial move: the highest-risk product goes to the qualified secondary unit and the remainder is held in place, contingent on the technician’s ETA and the unit’s temperature trend.
At last, the technician has confirmed an evaporator failure, ordered a replacement, and performed a manual defrost to clear the iced coil, and the unit is holding in range temporarily. The night appears to be over. It is not. The investigation has just started. You now need to notify the appropriate parties before you leave the plant. The notification cannot wait. Sponsors expect prompt acknowledgment that an event affecting their product occurred, typically the same day or within the timeframe defined in the quality agreement, followed by an interim update with preliminary impact assessment and a final report when the investigation closes. Concealing or delaying the notification is the fastest way to lose the sponsor relationship and, depending on the agreement, trigger contractual consequences. The stronger play is the easy one, when your team is properly trained. Call them early, tell them what you know, tell them what you do not yet know, and tell them when they will hear from you next.
The cascade of work that follows from here is what I wish more team members understood. The work involved in an excursion and the burden it places on the rest of your team is intense. They need to drop everything and address it no matter how busy they are.
Now you need to begin filling out your report of the event. If you are new to all of this, remember this: "If it isn't written down, it didn't happen." Documentation begins the instant the call is received and includes the time of the alarm, the time you called the technician, the time you called backup, the time you arrived, the time the technician arrived, what the technician found, and what you found and when. Take photographs of the failed component to identify like-for-like replacements, photographs of frosted lines, and anything else applicable to the situation. The sponsor will ask for all of it, and the credibility of the information in the investigation depends on whether it was captured in real time. If your team is properly trained, this won't be a problem. If they aren’t, reconstructing an event from memory after the fact is not defensible, and any sponsor or auditor will know it. Filling out an investigation, or getting on a call with a sponsor, with a blank look because something was never documented is unprofessional and damaging, and it costs you trust. Trust is not rebuilt by a thorough CAPA written two weeks later. Trust is preserved by the operator who can answer, in the moment, exactly what happened, when, and why.
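One way to picture real-time capture is a timestamped log that can tell you what is still missing before you leave the building. The sketch below is a minimal illustration, not a validated or compliant record system: `ExcursionLog`, `LogEntry`, the event names in `REQUIRED_EVENTS`, and the unit ID are hypothetical, and the required list simply mirrors the items named above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    recorded_at: datetime  # stamped when the entry is made, not afterward
    event: str
    note: str = ""


@dataclass
class ExcursionLog:
    unit_id: str
    entries: list[LogEntry] = field(default_factory=list)

    def record(self, event: str, note: str = "",
               at: Optional[datetime] = None) -> LogEntry:
        """Append an entry, timestamped now (UTC) unless a time is supplied."""
        entry = LogEntry(at or datetime.now(timezone.utc), event, note)
        self.entries.append(entry)
        return entry

    def missing(self, required: list[str]) -> list[str]:
        """Return required events that have not been recorded yet."""
        seen = {e.event for e in self.entries}
        return [name for name in required if name not in seen]


# Hypothetical event names based on the documentation items listed above.
REQUIRED_EVENTS = [
    "alarm_received",
    "technician_called",
    "backup_called",
    "on_call_arrived",
    "technician_arrived",
    "technician_findings",
]

log = ExcursionLog(unit_id="WIC-01")  # hypothetical unit ID
log.record("alarm_received", "pre-alarm on walk-in cooler")
log.record("technician_called")
print(log.missing(REQUIRED_EVENTS))
```

Whether the record lives in software or on a paper form matters less than the habit: each entry is written when the event happens, and gaps are visible while they can still be filled from direct observation.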
Data Interpretation And Investigational Risk
A word of caution to those using mean kinetic temperature (MKT). Somewhere in the investigation's temperature data, MKT is either used or it is not, and this is where some get themselves into trouble. MKT can be treated, wrongly, as a get-out-of-jail-free card. It is a useful tool, but it is not a safety net. It describes the cumulative thermal history of product over a defined window, and both the history and the window must actually be available before MKT can be used. It can sometimes return a reassuring number after an event. When used to release product that experienced an excursion outside labeled storage conditions, MKT masks a real quality risk. Treat it as one input among several, not as a final verdict.
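To see how MKT can return a reassuring number, it helps to run the arithmetic. The sketch below uses the standard MKT formula with the commonly used default activation energy of about 83.144 kJ/mol, which makes ΔH/R roughly 10,000 K; a product-specific value may differ. The reading pattern (about a week of 15-minute readings at 5 °C with a three-hour spike to 15 °C) and the 2 °C to 8 °C label range are assumptions made up for this example, not data from the event described here.

```python
import math

# ΔH/R in kelvin, assuming the common default ΔH of about 83.144 kJ/mol
# and R = 8.3144 J/(mol·K). A product-specific ΔH would change this value.
DELTA_H_OVER_R_K = 10000.0


def mean_kinetic_temperature_c(readings_c: list[float]) -> float:
    """MKT in °C from equally spaced temperature readings in °C."""
    if not readings_c:
        raise ValueError("no readings")
    kelvins = [t + 273.15 for t in readings_c]
    mean_exp = sum(math.exp(-DELTA_H_OVER_R_K / k) for k in kelvins) / len(kelvins)
    return DELTA_H_OVER_R_K / -math.log(mean_exp) - 273.15


# Hypothetical window: 660 readings at 5 °C plus 12 readings (3 hours) at 15 °C.
readings = [5.0] * 660 + [15.0] * 12
mkt = mean_kinetic_temperature_c(readings)
print(f"max reading: {max(readings):.1f} C, MKT: {mkt:.2f} C")
```

In this made-up window the peak reading is 15 °C, well outside the assumed label range, yet the MKT lands only slightly above 5 °C. The long stretch of in-range readings dilutes the spike, which is exactly why the number cannot stand in for a stability-based assessment of the excursion itself.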
Now you begin a root cause analysis that must distinguish a situational failure (this evaporator, this night) from a systemic one. The investigation asks the questions that separate the two: Was preventive maintenance (PM) current? When was the last inspection? An excursion that traces back to a missed PM is a very different finding than one that traces to a defect in the component itself, and the corrective action differs accordingly. Did monitoring trends in the prior 30 days show degraded recovery times worth flagging? This is where checking all probes for anomalies daily pays off. If you didn’t, you will be asked why the temperature data looks like the system had been struggling and nobody noticed.
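The 30-day question can be made concrete with a simple comparison of recent recovery times against an earlier baseline. This is a rough sketch under stated assumptions: `recovery_degraded` is a hypothetical helper, the input is one recovery time per day in minutes (for example, time to return to setpoint after a defrost cycle), and the 7-day window and 1.5x ratio are illustrative values, not recommended limits.

```python
from statistics import mean


def recovery_degraded(recovery_min_by_day: list[float],
                      recent_days: int = 7,          # assumed recent window
                      ratio_threshold: float = 1.5   # assumed flag level
                      ) -> tuple[bool, float, float]:
    """Compare the recent mean recovery time with the earlier baseline.

    Returns (flagged, baseline_mean, recent_mean).
    """
    if len(recovery_min_by_day) <= recent_days:
        raise ValueError("need more history than the recent window")
    baseline = mean(recovery_min_by_day[:-recent_days])
    recent = mean(recovery_min_by_day[-recent_days:])
    if baseline <= 0:
        raise ValueError("baseline recovery time must be positive")
    return recent >= ratio_threshold * baseline, baseline, recent


# Hypothetical 30 days: steady 10-minute recoveries, then a week at 18 minutes.
history = [10.0] * 23 + [18.0] * 7
flagged, baseline, recent = recovery_degraded(history)
print(flagged, baseline, recent)  # True 10.0 18.0
```

A check like this does not diagnose anything. It gives the person doing the daily review a reason to look closer at a unit before the alarm does it for them.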
Corrective action addresses the failure: replacement of the failed component, recalibration of probes if relevant, updated PM intervals for the affected unit, and — governed by change control — the re-qualification scope appropriate to what was repaired. A like-for-like component replacement on a previously qualified unit may require only targeted re-mapping rather than a full IQ/OQ/PQ. Repeated excursions on the same unit, or any change that affects the unit's qualified operating range, escalate the scope.
Preventive action is broader. Are there other units of the same age and configuration in the facility? Is the PM schedule across the full equipment register risk-tiered correctly? Should alarm thresholds or escalation timing be adjusted based on what this event revealed? The preventive side is where good depots can shine. Good depots fix what broke and prevent the next three failures.
Supply Impact And Consequences
The supply impact is what makes all of this matter. Investigational product is not a commodity. Every dose has a patient assigned to it somewhere downstream. An excursion handled poorly can pull product out of supply at a site that has no replacement scheduled, delay a dosing window for a participant who has already gone through screening, or, in the worst case, force a study amendment. Handled well, the same event becomes a documented deviation with no impact on supply, no impact on timeline, and no impact on the patient. The difference between those two outcomes is rarely the excursion itself. The difference is the preparation that came before and the discipline in the response.
Mechanical systems fail. That is the operating assumption the team should live by. It is not pessimism, and it is not a comment on the quality of your equipment; it’s just reality. The depots that handle excursions cleanly are the ones that rehearsed the failure before it happened. They have current written after-hours response procedures, pre-identified secondary storage with documented spare capacity, trained backup staff who can be on site within a defined window, service contracts with response-time SLAs that match the criticality of the product on hand, a facilities team doing their due diligence with PMs and monitoring temperature data, and a QA function that is reachable, decisive, and informed. None of this can be assembled at 2 a.m. It either exists or it does not.
Anyone can react to an alarm. The depots that protect product reliably are the ones whose preparation makes the response feel almost routine, because the planning was done months earlier. That preparation matters. And it’s hard; being tirelessly prepared is hard. Those who push hard in this industry do it for the same reason: There is a patient at the end of every chain of custody who never sees any of the work that protected their dose. The patient is who all of this is for.
About The Author:
Adam Nordby has 10 years of clinical trial supply operations experience at a global clinical supplies firm, where he served as quality control specialist, facilities supervisor, and facilities manager between 2015 and 2024. As quality control specialist, he oversaw clinical trial materials through receipt, storage, packaging, distribution, returns, and destruction. As facilities supervisor and facilities manager, he held operational responsibility for cold storage equipment across controlled room temperature, refrigerated, frozen, deep frozen, ultra-low, and cryogenic storage. He led preventive maintenance, validation, calibration, and environmental monitoring programs and served as lead of the site's rapid response team. He led the procurement and fit-up of a multi-million-dollar GMP returns and destruction facility, which passed its very first audit. He holds a Bachelor of Arts in biology with emphasis in health and medical science with a minor in vaccinology from Minnesota State University Moorhead.