We wanted to follow up with more detailed information on the incidents that occurred on 8/7 and 8/8. These incidents were caused by separate issues. See below for a description of each issue and the steps taken to remedy it.
8/7
Incident Cause
A change that introduced an inefficient database usage pattern was submitted and deployed on 8/2. Because load stayed at standard levels in the days after deployment, no regression was detected. On 8/7 we encountered a new peak load, which, combined with this inefficiency, led to a large increase in latency (turnaround times). That latency increase was the incident experienced that day.
Resolution
We identified and reverted the database usage change committed on 8/2 that led to this slowdown.
We upgraded our database instance size.
8/8
Incident Cause
A full table query was run against our write replica database while a team was transferring data to BigQuery for business intelligence tooling. This caused database contention and slowed down our production service.
Resolution
We implemented more fine-grained controls and roles for database access, along with an approval process to verify that production database queries are run against the correct replica and will not impact customers.
If you have any questions about this information, feel free to reach out to support@assemblyai.com.