We experienced a service disruption that affected our API and portal services. The incident was caused by database performance issues during routine maintenance, resulting in degraded service for our users.
Timeline
- 9:16 AM - Routine database maintenance initiated
- 9:25 AM - Performance degradation detected; investigation began
- 9:29 AM - API services temporarily suspended to allow systems to recover
- 9:50 AM - Partial service restoration began
- 10:15 AM - Rate limiting implemented to ensure stable recovery
- 10:23 AM - Full service restored; all systems operational
Impact
During this incident:
- API responses experienced significant delays
- Some API endpoints were temporarily unavailable
- Portal functionality was degraded or unavailable
- The charging system was not affected: OCPP and the chargepoint manager continued to operate normally.
Root Cause
The incident occurred during routine database maintenance when a schema update on a frequently accessed data table caused unexpected performance impacts. This led to cascading effects throughout our database infrastructure.
Preventive Measures
We are implementing the following improvements to prevent similar incidents:
- Enhanced Monitoring - Implementing more granular performance monitoring to better predict the impact of maintenance operations
- Improved Maintenance Procedures - Establishing stricter review processes for database changes, particularly for high-traffic components
- Application Resilience - Improving our application's ability to handle traffic surges during recovery scenarios
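
For readers curious how rate limiting can keep a recovering system stable under a traffic surge, here is a minimal sketch of a token-bucket limiter. It is an illustration only, not the mechanism we deployed during this incident; the class name `TokenBucket`, its parameters, and the injectable `clock` are all hypothetical.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter (illustrative, not our production code)."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = float(rate_per_sec)   # tokens refilled per second
        self.capacity = float(capacity)   # maximum burst size
        self.clock = clock                # injectable so the limiter can be tested
        self.tokens = float(capacity)     # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True if a request costing `cost` tokens may proceed now."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A request handler would call `allow()` before doing any work and reject the request (for example with HTTP 429) when it returns `False`. Raising `rate_per_sec` in steps is one way to let traffic ramp back up gradually while the database recovers.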