Before the incident response team convened at around 22:00 CEST, the systems had begun to experience steadily increasing latency, meaning users could face delays when trying to control their charging stations. By the time the incident response team convened, latency had reached 5 minutes or more for most, if not all, requests.
In the hours that followed, the team worked through numerous hypotheses and made several attempts to reduce the load, primarily by limiting retries and restarting applications. Despite these efforts, the root cause has not yet been fully identified, and we are still actively investigating it. Several code changes were prepared for deployment, aimed at improving calls believed to behave like multicast, fanning out duplicate work such as repeated balancing operations.
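The report does not detail how retries were limited; as an illustration only, a bounded retry with exponential backoff and jitter is one common way to stop failing calls from amplifying load during an incident like this (the function and parameter names below are hypothetical, not taken from our codebase):

```python
import random
import time

def call_with_bounded_retries(operation, max_attempts=3, base_delay=0.5):
    """Run `operation`, retrying at most `max_attempts` times in total.

    Exponential backoff with jitter spreads retries out in time, so a
    degraded dependency is not hammered by synchronized retry storms.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # budget exhausted: surface the failure instead of retrying forever
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            time.sleep(delay)
```

Capping attempts bounds the extra load each client can add, which matters most exactly when the system is already saturated.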
Unfortunately, these attempts did not reduce or eliminate the issue. The problem resolved itself at 00:00 CEST, when a natural dip in traffic pulled the overall load below an unknown threshold, stabilizing the system. At around 01:00 CEST, the response team deemed the system recovered and paused further improvement attempts. The situation was monitored for another half hour, and the response team's efforts ended at 01:30 CEST.
In the coming days, the team will prioritize improving observability, redesigning traffic flows, and addressing how our systems and the surrounding consumer services handle traffic. While the situation has stabilized, we remain committed to identifying the root cause and ensuring long-term stability.