Kaitaki: The Preventable Catastrophe

The Transport Accident Investigation Commission released its final report into the Interislander ferry Kaitaki blackout and loss of propulsion incident, which occurred near Wellington Harbour on 28 January 2023.

The report outlines six major system failures and concludes that a very serious marine casualty was narrowly avoided. For anyone working in vessel operations, engineering, safety management, or asset maintenance, this is not just another incident report. It is a warning.

The Kaitaki lost propulsion near Wellington Harbour with 864 people onboard. The vessel suffered a blackout, drifted for around an hour, and came dangerously close to rocks near Sinclair Head (ironic). A mayday call was made. From an engineering point of view, this is the kind of incident that should make the whole industry stop and pay attention.

This was not a mysterious failure hidden deep inside a complex system. It was not a brand-new fault mode no one had ever seen before. At the centre of the incident was a rubber expansion joint in the high-temperature cooling water system.

When Kaitaki was approximately one nautical mile off Sinclair Head, the starboard shaft generator tripped, causing a blackout. Shortly afterwards, a rubber expansion joint on the port auxiliary engine ruptured. That rupture caused the loss of water from the high-temperature cooling system, which provided cooling to all main and auxiliary engines. Without cooling water pressure, none of the four main engines could be restarted safely, and propulsion could not be restored quickly.

That is a serious engineering chain. A single failed component contributed to the loss of a cooling system that affected the ability to restart the propulsion plant. Once cooling water integrity was lost, the crew were no longer dealing with an isolated machinery fault. They were dealing with a vessel unable to safely bring propulsion back online while drifting near a lee shore.
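One way to picture that chain is as a small dependency graph. The sketch below is illustrative only: the item names and the dependency map are simplified assumptions based on the description in this article, not on the vessel's drawings, and it ignores redundancy and partial loss. It simply walks downstream from a failed item to list everything that could be affected.

```python
from collections import deque

# Hypothetical, simplified map: each key lists the items that directly depend on it.
DEPENDENTS = {
    "port_aux_expansion_joint": ["ht_cooling_water"],
    "ht_cooling_water": ["main_engines", "auxiliary_engines"],
    "main_engines": ["propulsion"],
}


def affected_by(failed, dependents):
    """Return every item downstream of a failed item, in breadth-first order."""
    seen = {failed}
    order = []
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


chain = affected_by("port_aux_expansion_joint", DEPENDENTS)
print(" -> ".join(["port_aux_expansion_joint"] + chain))
```

Even a map this crude makes the point: a small rubber part sits upstream of propulsion.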

The most uncomfortable part is the age of the failed component. The rubber expansion joint had been in service for at least five years and was nearly 18 years old when it failed. Interislander’s own Failure Mode and Effects Analysis recommended these parts be replaced after two years of use. On those figures, the joint had been in service for at least two and a half times its recommended life.

That is where this moves beyond a component failure. This becomes a maintenance visibility failure.

A rubber expansion joint is not glamorous. But in a high-temperature cooling water system, it is safety-critical. It absorbs movement, vibration, thermal expansion and system stress. Over time, rubber deteriorates. Heat, pressure cycles, vibration, coolant chemistry, age and installation condition all matter. These are not lifetime components.

Any engineer knows this. A rubber component in a hot cooling water system should not be treated as “fit until it fails”. It should be treated as a known degradation item with a defined life, a known consequence of failure, and a clear replacement strategy.

The fact that this component was reportedly far beyond its recommended replacement interval is exactly the kind of risk a proper predictive maintenance and asset intelligence system should surface long before the vessel leaves the berth. Not because predictive maintenance is magic. Because this is basic engineering risk management, done properly and continuously.

A modern system should know the component age, service hours, replacement interval, criticality, system dependency and consequence of failure. It should flag when that part moves from “due soon” to “overdue” to “unacceptable operational risk”.
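As a rough illustration of that escalation logic, here is a minimal sketch. The class names, the 0.8 "due soon" threshold and the 1.5 "unacceptable" threshold are assumptions chosen for the example, not values from the report or from any particular product. The five-year and two-year figures are the ones quoted earlier in this article.

```python
from dataclasses import dataclass
from enum import Enum


class LifeStatus(Enum):
    OK = "ok"
    DUE_SOON = "due soon"
    OVERDUE = "overdue"
    UNACCEPTABLE = "unacceptable operational risk"


@dataclass
class Component:
    name: str
    years_in_service: float
    replacement_interval_years: float
    safety_critical: bool


def classify(c, due_soon_fraction=0.8, unacceptable_ratio=1.5):
    """Classify a component by the fraction of its recommended life used.

    Both thresholds are illustrative assumptions, not industry values.
    """
    ratio = c.years_in_service / c.replacement_interval_years
    if ratio < due_soon_fraction:
        return LifeStatus.OK
    if ratio < 1.0:
        return LifeStatus.DUE_SOON
    # Only safety-critical items escalate beyond "overdue" in this sketch.
    if c.safety_critical and ratio >= unacceptable_ratio:
        return LifeStatus.UNACCEPTABLE
    return LifeStatus.OVERDUE


joint = Component(
    name="HT cooling water expansion joint",
    years_in_service=5.0,            # "at least five years" in service
    replacement_interval_years=2.0,  # two-year interval from the FMEA
    safety_critical=True,
)
print(classify(joint).value)  # unacceptable operational risk
```

With five years in service against a two-year interval, the ratio is 2.5, so the joint lands in the top escalation band under almost any sensible choice of thresholds.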

And for something safety-critical, it should not sit quietly in the background. It should be escalated.

This is where many vessels are still stuck. They have planned maintenance systems, spreadsheets, folders, inspections, class surveys and experienced engineers. But those systems often remain passive. They hold information. They do not always interpret it, connect the dots, or tell management that one small overdue item has the potential to remove propulsion in the wrong place at the wrong time.

That is the difference between recording maintenance and managing risk.

The Kaitaki incident also shows that predictive maintenance is not only about live sensor data. People often think predictive maintenance means vibration sensors, temperature trends, oil analysis and machine learning models watching rotating equipment. That is part of it, but it is not the whole picture.

Predictive maintenance should combine component age, replacement interval, failure mode, criticality, system dependency, historical maintenance, manufacturer guidance, inspection findings, operating conditions, alarms, crew notes, engineering judgement and live machinery data.

The value is not in one data source. The value is in connecting them.
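A minimal sketch of what connecting them could look like is shown below. Everything in it is hypothetical: the component ID, the three source dictionaries and the placeholder inspection text are invented for illustration. The idea is only that records held in separate places are merged into one view per component, and that the reasons for concern are listed together.

```python
# Three hypothetical, separately held data sources keyed by component ID.
fmea = {"REJ-01": {"interval_years": 2.0, "criticality": "high"}}
maintenance_log = {"REJ-01": {"years_in_service": 5.0}}
inspections = {"REJ-01": {"finding": "example open finding"}}


def combined_view(component_id):
    """Merge what each source knows about one component into a single record."""
    record = {"id": component_id}
    for source in (fmea, maintenance_log, inspections):
        record.update(source.get(component_id, {}))
    return record


def reasons_for_concern(record):
    """List every reason this record deserves attention, across all sources."""
    reasons = []
    interval = record.get("interval_years")
    age = record.get("years_in_service")
    if interval is not None and age is not None and age > interval:
        reasons.append(f"in service {age} years against a {interval}-year interval")
    if record.get("criticality") == "high":
        reasons.append("high criticality")
    if record.get("finding"):
        reasons.append(f"open inspection finding: {record['finding']}")
    return reasons


for reason in reasons_for_concern(combined_view("REJ-01")):
    print("-", reason)
```

Each source on its own holds one unremarkable fact. Only the merged record shows an overdue, high-criticality part with an open finding against it.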

In this case, the failure mode was already understood well enough for the Failure Mode and Effects Analysis to recommend a two-year replacement interval. That means the risk was not invisible. It existed somewhere in the system. The problem was that it did not appear to be visible enough, prioritised enough, or acted on early enough.

That is exactly the gap Mariners Log is trying to close.

The report also highlighted the engineering response onboard. The ship’s master and bridge team were found to have responded appropriately and in a structured way. But the engineering response was described by one engineer as “organised chaos”, with different crew members attempting different recovery actions. Some were trying to reset tripped breakers while others were trying to start pumps locally, which required those breakers to be on. A lack of communication further hampered recovery efforts.

That should hit home for anyone who has worked in an engine room during a real failure.

When things go wrong, the machinery space gets noisy, hot, urgent and confusing very quickly. Alarms are going off. People are moving. The bridge wants answers. Passengers may be involved. The vessel may be drifting. Nobody has time to search through old procedures, remember which PDF contains the cooling water failure response, or guess who has already tried what.

This is where decision support becomes critical.

The report noted that a more structured and well-exercised engineering response would likely have resolved the mechanical failure and restored propulsion sooner. It also recommended implementing decision-support systems for vessel engineering departments.

That recommendation matters. Once a fault becomes an emergency, the system should not just tell you something has failed. It should help the engineering team understand what failed, what systems are affected, what actions have been attempted, what procedure applies, and what the safest recovery path is.
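To make the idea concrete, here is a small sketch of a shared recovery log that records who has tried what and blocks an action whose prerequisite is not yet met, such as starting a pump locally while its breaker is still tripped. The class, state names and actions are assumptions made for illustration. They do not describe the vessel's procedures or any existing decision-support product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class RecoveryLog:
    """Shared record of plant states and attempted actions during a casualty."""
    states: Dict[str, bool] = field(default_factory=dict)
    attempts: List[Tuple[str, str, str]] = field(default_factory=list)

    def attempt(self, who: str, action: str, requires: List[str],
                sets: Optional[str] = None) -> bool:
        """Log an attempt; refuse it if any required state is not yet true."""
        missing = [r for r in requires if not self.states.get(r, False)]
        if missing:
            self.attempts.append((who, action, "blocked: needs " + ", ".join(missing)))
            return False
        if sets is not None:
            self.states[sets] = True
        self.attempts.append((who, action, "done"))
        return True


log = RecoveryLog()
# Pump start is attempted before the breaker has been reset, so it is blocked.
first = log.attempt("engineer A", "start cooling pump locally",
                    requires=["pump_breaker_closed"], sets="cooling_pump_running")
second = log.attempt("engineer B", "reset pump breaker",
                     requires=[], sets="pump_breaker_closed")
third = log.attempt("engineer A", "start cooling pump locally",
                    requires=["pump_breaker_closed"], sets="cooling_pump_running")

for who, action, outcome in log.attempts:
    print(f"{who}: {action} -> {outcome}")
```

The design choice here is deliberately simple: one shared list that everyone can see, plus explicit prerequisites, so that two people working on the same fault are not unknowingly working against each other.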

In other industries, this is normal. Aircraft engine monitoring, industrial plants, Formula 1, power generation and high-value machinery operations all use data, modelling, alerts and decision support to reduce risk. The maritime industry has the same engineering complexity, asset value and consequences when things go wrong. Yet too often, the tools onboard still look like they belong to another decade.

The final report also identified wider system issues, including deterioration of rubber expansion joints, weaknesses in safety management processes, ageing fleet risk, lack of sufficient towage and salvage capability, mass rescue preparedness issues, and gaps in specialist maritime expertise during incident response.

That matters because major incidents rarely come from one thing.

A rubber joint fails. Cooling water is lost. Engines cannot restart. Communication breaks down. Procedures are unclear or not exercised enough. The vessel is near shore. Weather is poor. Towage options are limited. Rescue planning becomes critical.

It becomes a chain. Predictive maintenance should break that chain early.

For operators, especially in New Zealand, this should be more than another report to read and file away. We operate in challenging waters, with critical ferry routes, ageing assets, tight schedules, limited redundancy and increasing pressure on maintenance budgets. The Cook Strait is not forgiving. A disabled vessel there is not just an engineering problem. It becomes a passenger safety issue, a national transport issue and potentially a major rescue operation.

Older vessels can absolutely be operated safely. But only if the maintenance strategy, risk visibility, decision support and engineering systems evolve with the age and condition of the asset.

You cannot keep adding operational pressure to ageing machinery and expect old systems to manage modern risk.

That is the uncomfortable truth.

The industry does not need more paperwork, more disconnected systems, or another dashboard that looks good in a boardroom but does nothing for the engineer trying to keep the plant online.

It needs systems that understand the vessel. Systems that know which components matter. Systems that identify when risk is building. Systems that escalate safety-critical issues before they become emergencies. Systems that support engineers when time is short and the wrong decision could cost lives.

The Kaitaki incident ended safely, but the margin was thin. The anchors held. Power was eventually restored. Evacuation was avoided. But the report is clear that time was critical, and a very serious marine casualty was narrowly avoided.

That is not a success story. That is a warning.

For the maritime sector, and especially for New Zealand, the question is no longer whether data-driven maintenance and decision support are nice-to-have tools.

The question is whether we are willing to keep managing critical vessels with systems, processes and mindsets that belong in the past.

Is it time for the maritime sector, especially in New Zealand, to embrace practical new technologies?

Author - Ken Sinclair

Mariners Log Engineering
