1/28 Update on Email Outage
ECE Community,
At approximately 1:30 p.m. on Saturday, January 26th, the ECE mail server began experiencing difficulties processing incoming mail. The root cause appears to have been a failure of amavis, the spam and email processing subsystem. As a result of this failure, incoming email was not delivered to user accounts, and the incoming message queue became congested with unprocessed mail.
Systems staff are normally alerted to failures by email and pager. Unfortunately, the nature of this failure partly impaired those notifications.
Systems staff became aware of the issue at approximately 11 a.m. on Sunday, January 27th. At that time, the amavis subsystem was repaired and restarted.
When the amavis subsystem was restarted, the incoming message backlog exceeded 25,000 messages. This unusually large backlog was more than the mail server, as then tuned, could deliver. Systems staff therefore re-tuned the mail server to permit processing of the backlog and to help mitigate delays in normal operation.
As of this writing, the incoming mail backlog has been reduced to 800 messages.
No mail was lost as a result of this incident.
Pratt IT systems staff would like to apologize for the inconvenience stemming from this incident. Staff are working to improve the systems for off-hours notification, and to improve the reliability, performance and accuracy of our anti-virus and anti-spam solutions.
Best,
Victor J. Orlikowski
Systems Programmer, Pratt Information Technology
