This is a summary of the events that have occurred on the mail server since its migration to a temporary site while the production machine is being upgraded. Some of the events that occurred are still under investigation while others are being attributed to outside influences.
Mail services are moved to a temporary site. The temporary site is running HP-UX 10.20 with Sendmail 8. The old site was running HP-UX 10.10 with Sendmail 5. Both machines have the same CPU and memory configuration. The temporary machine has two 100BaseT cards, which require HP-UX 10.20 for support. This is one of several support issues requiring HP-UX 10.20.
ITS personnel notice that the loads on the temporary machine seem higher than they should be and start looking into the problem to determine if it is a local configuration error or the new software itself. The service is exhibiting symptoms similar to those seen when the old mail site, oboe, was delivering mail.
Mail delivery continues to generate a much greater load than on the prior machine, which usually ran a load of 3-5 during peaks. The new machine is running loads of 5-20 during peaks.
At 4:00 PM the Chancellor's Office holds a video conference over the Frame Relay Network that makes up the backbone of CSUNet. During the video conference, bandwidth is taken from normal virtual circuits to provide the audio/video signals. When the video conference is over, the network is left improperly configured and Cal Poly loses all contact with the outside Internet world.
With no contact with outside nameservers, the new version of Sendmail shut itself down. During this time no mail was delivered, regardless of whether it was addressed on or off campus.
Mail has backed up so much that effectively no mail is being processed. CSUNet is contacted about the network problem and our connection is re-established at 9:00 AM. For the rest of the day the temporary machine makes a valiant effort to keep up with the day's load and barely starts to catch up with the backlog.
ITS personnel suspect that the delivery problems may be related to two things: a communications process that seems to be getting a large share of the CPU, and Sendmail 8, which appears to generate many more processes than the older Sendmail 5 did. A patch for the communications process is obtained from HP, and it is decided that the patch will be installed during an emergency downtime the next morning.
The patch is installed with little or no observable effect. The feeling is growing stronger that the problem is more related to Sendmail 8. Late in the day it is decided to take the two CPUs that are being pulled out of the production mail machine and install them in the temporary machine, bringing it up to four CPUs. The production machine will be receiving two new PA8000 CPUs, which are supposed to be at least 2.4 times faster than the older PA7200 CPUs moved to the temporary machine.
It was later determined by HP that the increase in processing time by the communications process was a normal result of the 100BaseT cards having ten times the rate of the old cards and potentially ten times the number of interrupts.
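HP's explanation can be checked with a rough back-of-the-envelope calculation. The sketch below is illustrative only: it assumes the old cards ran at 10 Mbit/s (implied by "ten times the rate" but not stated outright), that each full-size Ethernet frame occupies 1538 bytes on the wire, and that the card raises one interrupt per frame. The function name is hypothetical, and real drivers may batch interrupts.

```python
# Rough upper bound on frames (and, under the one-interrupt-per-frame
# assumption, interrupts) per second for a given link speed.
# 1538 bytes = 1518-byte maximum frame + 8-byte preamble + 12-byte gap.

def max_frames_per_second(link_bits_per_second, frame_bytes_on_wire=1538):
    """Frames per second if the link is saturated with full-size frames."""
    return link_bits_per_second / (frame_bytes_on_wire * 8)

old_rate = max_frames_per_second(10_000_000)    # assumed 10 Mbit/s old cards
new_rate = max_frames_per_second(100_000_000)   # 100BaseT cards
interrupt_ratio = new_rate / old_rate

print(f"old: {old_rate:.0f} frames/s, new: {new_rate:.0f} frames/s, "
      f"ratio: {interrupt_ratio:.1f}x")
```

Under these assumptions the frame size cancels out of the ratio, so the potential interrupt load scales directly with the link speed.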
The CPUs are installed in the early morning. With a special mailing, it is observed that the throughput is about the same as it used to be on the production machine under HP-UX 10.10 with Sendmail 5. This is progress, since that level of throughput was able to keep up with the mail demands.
Four more software patches are installed on the system as a result of the ongoing investigation.
Two messages of more than 70 megabytes each are received simultaneously, which fills the mail spool, and mail processing stops sometime during the evening.
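The failure mode here is simple arithmetic: two large messages arriving together can exceed the free space left in the spool. The following is a minimal sketch of that check, not anything that ran on the mail server. The function names, the reserve threshold, and the free-space figures in the usage lines are all hypothetical, since the report does not give the spool size.

```python
import shutil

MB = 1024 * 1024

def free_bytes_on(path):
    """Free space, in bytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free

def spool_can_accept(free_bytes, incoming_sizes, reserve_bytes=0):
    """True if every incoming message fits and `reserve_bytes` stays free."""
    return free_bytes - sum(incoming_sizes) >= reserve_bytes

# Two messages of 70 MB each, as in the incident (sizes rounded down).
two_large = [70 * MB, 70 * MB]

# Hypothetical free-space figures, chosen only to show both outcomes.
print(spool_can_accept(200 * MB, two_large, reserve_bytes=50 * MB))
print(spool_can_accept(150 * MB, two_large, reserve_bytes=50 * MB))
```

A check of this kind, run against the spool directory with `free_bytes_on`, would flag the condition before the spool filled rather than the next morning.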
The problem with the mail spool is detected by ITS staff as they arrive at work and dealt with. The backlog of mail is processed quickly and the machine resumes normal processing of mail.
ITS personnel are still working with HP to find a resolution to the drop in mail processing performance seen in going from Sendmail 5 to Sendmail 8. They will continue to pursue this course until some kind of resolution is reached.
Revised by: George Westlund (gwestlun@calpoly.edu)