See Amazon's Correction of Errors in action with a full fictitious example. The company noticed a rise in air temperature levels in their X°C and Y°C chillers, which led to a full product recall of certain fresh meat and seafood products. Amazon's prompt response to this issue demonstrates their commitment to customer safety and satisfaction.
The root cause of the problem was identified as a buildup of ice and dust on the chiller blower, which reduced its output and affected the optimal temperature levels for storing fresh products. Amazon has since implemented various corrective measures, such as increasing the nominal capacity of their cold rooms, reducing the heat load, and improving temperature monitoring and escalation procedures.
This incident serves as a valuable lesson for Amazon and other companies in understanding the importance of having a thorough knowledge of their systems, even when outsourcing to third parties. By taking ownership of critical functions and establishing clear escalation paths and SOPs, businesses can ensure that they are better prepared to handle any potential issues that may arise in the future.
Chiller Correction of Errors
On [date], we noticed the air temperature levels in both the X°C and Y°C chillers began to rise. Some air temperature variation is normal, due to many factors including scheduled defrost cycles, where the chillers shut down with the goal of eliminating ice buildup on the unit. Over the next several days, we realized that the chiller in the y chamber would not be able to maintain optimal temperature levels for storing fresh meat and seafood. On [date], we determined that some products had been stored outside the recommended ranges from [date] through [date], and initiated a full product recall of the [product] and [product] products sold and delivered to customers between those dates.
Upon initiating the recall, we immediately took the following actions:
It took x days to get the y chamber back into normal operating mode. On [date], we started stocking up [products] so we could resume selling [product] products. The y chiller failed again during this first stock-up attempt. We had to dispose of additional product, though none of it reached customers, so a second recall was not necessary. Our chilled rooms are now operating normally again. We have put short-term fixes in place and are working on longer-term solutions to ensure this problem does not happen again.
The recall directly affected x customers. The direct cost of the recall was approximately $x: $x in refunds, $x in goodwill credits towards future purchases ($x credits to x customers), and $x in inventory that was disposed of. Additionally, we were unable to sell much of our chilled range from [date] until the full range was back in place at the end of [date]. We lost potential sales but, more importantly, lost customer trust during that timeframe.
For background, the fresh and frozen operation on level was started on [date], operating across 3 temperature regimes: x, y and z. The cold rooms were designed and built by [mfr], based on assumptions given to [company] on [date] to calculate the heat load. The chambers are designed to work at x-y, z-q and r. The insulated envelope was constructed by a contractor using standard sandwich panels and an insulated floor.
The operation and maintenance of the system is the responsibility of [company]. The temperature-controlled operation is under the umbrella of [organization] license for the building. The chilled chambers are cooled by the [mfr] central ammonia system, with a single direct expansion blower in each chamber. Whilst there is a standby compressor, there is no backup for the blowers. The central system is operating at capacity, and the freezer is supplied from a standalone Freon system. Overall, there had been no major issues with temperature control until [date].
The food quality team has manually recorded the temperature for each of the 3 chambers on an hourly basis since the operation started.
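As a rough illustration of how an hourly log like this could be scanned for sustained excursions while tolerating brief spikes such as defrost cycles, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Reading` type, the `find_excursions` helper, the 5.0°C limit, the three-reading threshold, and the sample temperatures are hypothetical and are not taken from the incident.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reading:
    """One manually recorded chamber temperature (hypothetical structure)."""
    taken_at: datetime
    temp_c: float


def find_excursions(readings, max_temp_c, min_consecutive=3):
    """Return (start, end) timestamps of runs where the temperature stayed
    above max_temp_c for at least min_consecutive consecutive readings.

    Shorter runs are ignored, on the assumption that they are normal
    variation such as a scheduled defrost cycle.
    """
    excursions = []
    run = []
    for reading in sorted(readings, key=lambda r: r.taken_at):
        if reading.temp_c > max_temp_c:
            run.append(reading)
            continue
        # The run has ended; keep it only if it lasted long enough.
        if len(run) >= min_consecutive:
            excursions.append((run[0].taken_at, run[-1].taken_at))
        run = []
    # Handle a run that is still open at the end of the log.
    if len(run) >= min_consecutive:
        excursions.append((run[0].taken_at, run[-1].taken_at))
    return excursions


# Illustrative data only: one isolated spike, then four hours above the limit.
start = datetime(2024, 1, 1, 0, 0)
temps = [3.0, 3.2, 5.5, 3.1, 3.0, 5.2, 5.8, 6.1, 6.4, 3.3]
log = [Reading(start + timedelta(hours=i), t) for i, t in enumerate(temps)]

excursions = find_excursions(log, max_temp_c=5.0, min_consecutive=3)
for began, ended in excursions:
    print(f"Sustained excursion from {began} to {ended}")
```

With this sample data, the single reading of 5.5 is ignored and the four-hour run is reported. A check of this kind would only be as good as the chosen limit and run length, which would need to come from the product's actual storage requirements.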
The triggering event that led to the problem was a buildup of ice and dust on the blower that reduced the output of the chiller in the y chamber to the point where it could no longer maintain the desired temperature given the heat load in the room. There were several root causes, most of which stem from an insufficient understanding of, and operational control over, the effective chilling capacity, our heat load, and how those two factors interacted with each other. The chiller's effective output gradually declined due to ice and dust buildup. Concurrently, we had been gradually introducing a higher heat load into the y chamber because we needed more material, people and activity to handle increasing order volumes. Once the heat load passed the effective capacity, we were unable to recover without impacting normal operations.
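The interaction between declining effective capacity and rising heat load can be sketched with a toy model. The Python below assumes, purely for illustration, that both trends are linear; the function name `first_day_over_capacity` and all the kW figures are hypothetical and do not come from the incident.

```python
def first_day_over_capacity(nominal_kw, capacity_loss_per_day_kw,
                            initial_load_kw, load_growth_per_day_kw,
                            horizon_days=365):
    """Return the first day on which heat load exceeds effective capacity,
    or None if that does not happen within horizon_days.

    Assumes (hypothetically) a linear loss of effective capacity from ice
    and dust buildup and a linear growth of heat load from added activity.
    """
    for day in range(horizon_days + 1):
        effective_kw = nominal_kw - capacity_loss_per_day_kw * day
        load_kw = initial_load_kw + load_growth_per_day_kw * day
        if load_kw > effective_kw:
            return day
    return None


# Illustrative numbers only: a 6 kW margin on day 0.
both_trends = first_day_over_capacity(20.0, 0.125, 14.0, 0.125)
degradation_only = first_day_over_capacity(20.0, 0.125, 14.0, 0.0)

print(both_trends)       # 25
print(degradation_only)  # 49
```

In this toy setup, the margin that would last 49 days under blower degradation alone is used up by day 25 when the heat load is growing at the same time. That is why tracking either quantity in isolation can miss an approaching failure.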
Why did the chilling system fail?
We have made or are in the process of making the following changes to address the nominal and effective capacity of our cold rooms:
We took the following actions to reduce the heat load:
We have made or are in the process of making the following changes to address the monitoring and escalation of temperature in our chiller rooms:
Most of the lessons below can be applied to many areas in [company] outside of the chiller and operations area.
Here are some things we did well during the event.