Update to the April 27 Downtime

The downtime that was scheduled for April 27 had some problems, as most users are probably already aware. What follows is a summary of the problems and what is being done to prevent them in the future:

o The program "lynx" has stopped working because of several security patches applied to the IBM machines. We are in the process of evaluating this and trying to find a workaround.

o The system incoming mail spool filled up, causing problems with the delivery of mail. The problem started sometime Saturday morning and was caused by an error in a system script, which has since been corrected. Some mail may have been lost or returned to the sender between Saturday morning and Sunday morning as a result.

o HP delayed the startup of their tasks (see related system news article "970427_Down" for a list of the tasks) until 2:00 PM on Sunday, which should have allowed enough time if everything had gone as planned; unfortunately it did not. At least one of the scripts that needed to be applied failed and caused problems in bringing the machines back up. The vendor is being asked to allocate more time for their tasks, allowing for any unforeseen problems, and to keep them within the originally scheduled times.

o The systems were finally brought up by 10:30 PM, but sendmail either did not come up correctly or failed soon after our systems staff went home for the night. It was brought back up at about 7:00 AM on April 28, 1997. Procedures for handling any further prolonged outages are being developed.

o Some of the problems encountered are being attributed to a lack of proper communication between the various groups involved. This is being worked on before any more downtimes are scheduled.

The procedures used to plan and execute these downtimes are under review to find ways of improving them and lessening the impact on the users.
We are already finding that the use of redundant hardware is meaningless when the primary system undergoing maintenance is the one item that is not redundant (the RAID disk controller, for example).