Single-tenant Artifactory servers returning 500 errors
Incident Report for Artifactory Cloud
Postmortem

This issue was caused by timeouts between Artifactory and Access, triggered by intermittent load on the Access server. It was compounded by a separate issue: the server did not recover from those timeouts as it should have.

During the incident, we made configuration changes that helped mitigate the impact of those problems. We also tuned our alerting so that our teams could respond immediately when early warning signs of this issue appeared.

As a long-term solution, we have created a patch that allows Artifactory to recover automatically from those timeouts with no impact to the service. Alongside the patch, we've added logging to assist in future debugging should similar errors arise.
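
This report does not describe how the patch works internally. For readers unfamiliar with the general idea of recovering automatically from timeouts, the following is a minimal, generic sketch of one common approach: retrying with exponential backoff. It is not the actual patch. The names `AccessTimeout` and `call_with_recovery`, and the default attempt count and delay, are hypothetical and chosen only for illustration.

```python
import time


class AccessTimeout(Exception):
    """Hypothetical error raised when a call to a backing service times out."""


def call_with_recovery(request, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call request(); on timeout, wait with exponential backoff and retry.

    Returns the first successful result, or re-raises the last timeout
    once max_attempts calls have failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except AccessTimeout:
            if attempt == max_attempts:
                raise
            # Delay doubles each time: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

In a sketch like this, a brief burst of timeouts is absorbed by the retries and the caller never sees an error, while a sustained outage still surfaces as an exception after the final attempt.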

Posted 10 months ago. Nov 22, 2018 - 22:21 UTC

Resolved
This incident has been resolved. We have continued to monitor and have not seen the issue recur.
Posted 10 months ago. Nov 21, 2018 - 00:55 UTC
Update
We are continuing to monitor for any further issues.
Posted 10 months ago. Nov 08, 2018 - 16:19 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted 10 months ago. Nov 08, 2018 - 13:43 UTC
Update
We are continuing to work on a fix for this issue.
Posted 10 months ago. Nov 08, 2018 - 06:00 UTC
Identified
We have identified the issue.
Posted 10 months ago. Nov 07, 2018 - 22:58 UTC
Update
We are still investigating this issue.
Posted 10 months ago. Nov 07, 2018 - 22:18 UTC
Investigating
We have identified an issue with single-tenant Artifactory servers failing with 500 errors.
Posted 10 months ago. Nov 07, 2018 - 20:52 UTC
This incident affected: AWS US East 1 (N. Virginia) and AWS Europe West (Ireland).