ZenML Pro Tenant Outage

Incident Report for ZenML Public Services

Postmortem

We experienced a temporary problem with some EC2 nodes on our EKS cluster, causing pod failures. This lasted around 50 minutes starting at 04:44 UTC. We added new nodes and migrated all our ZenML Pro tenants, after which they resumed normal operations.

Posted Oct 18, 2024 - 07:17 UTC

Resolved

Many ZenML Pro tenants were non-accessible today for a period of 50 minutes (04:44 to 05:34 UTC).
Posted Oct 18, 2024 - 04:30 UTC