Data Acquisition, Rhea and Internal tool outage
Resolved
May 07 at 05:54pm UTC
The issues with Forums Data Acquisition, Rhea and internal systems have now been resolved.
A post-mortem will be conducted; however, the outage was caused by a failed disk in an older server that lacked redundancy. A project is already under way to stabilise the DAL Datacenter by moving to a new rack with enterprise-grade hardware and configurations.
Forums Data Acquisition is scraping at a reduced rate, and we are monitoring the situation because some dependencies continue to have issues. However, this is no longer an outage.
Affected services
Updated
May 07 at 11:46am UTC
Progress is being made: backups have been restored for various services.
Data Acquisition for Forums is partially restored, with limited data acquired since ~06:00 UTC. However, it is not yet stable and runs only for short periods at a time. We are continuing to investigate and resolve these issues.
Created
May 06 at 09:27am UTC
We are aware of issues affecting Data Acquisition, Rhea and various internal tools, including the DAL Docker Registry, Resfleet and Vault.
Existing data in Titan is unaffected; however, no new data will be ingested for Forums/Blogs. Rhea is also affected and is producing errors.
The root cause appears to be a server failure, and we need to pull backups. The ETA for this is 1 business day.