AMQP service degradation

Incident Report for Easee

Resolved

All service upgrades are complete. We are now on the latest version of our message broker, and all metrics have been normal for the last 1.5 hours.

Posted Jul 17, 2024 - 13:54 CEST

Monitoring

Service restart completed and messages are flowing through the network.

Posted Jul 17, 2024 - 10:44 CEST

Identified

We have identified two issues:
1. Our message broker is not releasing allocated memory in a predictable way. We will allocate more resources for now while we work out an overall strategy/policy to mitigate this.
2. We have a lot of internal connections pushing messages to our message broker. Reducing the number of connections could provide memory relief on the AMQP cluster, but could potentially increase latency. We will monitor the effect of any adjustments we make here.

We are adjusting the service limits and increasing the compute resources. While we apply these mitigations, the performance of the service will be impacted, but we expect full recovery within the hour.

Apologies again for any inconvenience.

Posted Jul 17, 2024 - 10:22 CEST

Investigating

The issues from the last incident appear to be persisting.
This is resulting in a reduced flow of messages through the network, which is impacting the performance of our operators' applications and services.
We are working to address this issue and will likely perform a service restart.
We apologize for any inconvenience this may cause and appreciate your patience.

Thank you for your understanding.

Posted Jul 17, 2024 - 09:38 CEST

This incident affected: AMQP.