The importance of regular server status reporting for unexpected events

The following scenario highlights the importance of having regular server status reporting in place for unexpected events.

For a couple of months, at irregular intervals, users complained that some (not all) of the e-mails they’d been expecting from external senders were arriving days late or not arriving at all.

I’d been examining the headers of delayed e-mails when they did arrive and had found that the external mail servers delivering e-mails to ours were taking hours or days to do so. Most of the delayed e-mails “seemed” to originate from Eircom (an ISP based in Ireland), so I chalked the problem up to an issue on Eircom’s side. Our Security team agreed. This was an oversight on my part, however, which I’ll explain later…
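For anyone who wants to do the same kind of header check, here is a rough Python sketch that reads the Received headers of a message and works out how long it sat between hops. The sample message, hostnames and times are invented for illustration, and the sketch assumes each Received header ends with a semicolon followed by a standard date. Keep in mind it can only see deliveries that were eventually accepted.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical message: hostnames and times are invented for illustration.
SAMPLE = (
    "Received: from mx.example.net by mail.example.ie;"
    " Wed, 14 Jul 2010 12:09:05 +0100\n"
    "Received: from client.example.net by mx.example.net;"
    " Tue, 13 Jul 2010 17:22:10 +0100\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def hop_times(raw_message):
    """Return the timestamp of each Received header, oldest hop first."""
    msg = message_from_string(raw_message)
    times = []
    for header in msg.get_all("Received", []):
        # By convention the timestamp follows the last ';' in the header.
        date_part = header.rpartition(";")[2].strip()
        times.append(parsedate_to_datetime(date_part))
    # Each server prepends its own header, so reverse for chronological order.
    return times[::-1]


def hop_delays(raw_message):
    """Return the time that elapsed between consecutive hops."""
    times = hop_times(raw_message)
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for delay in hop_delays(SAMPLE):
        print(delay)
```

A large gap between two hops tells you where the message waited, but not why, and it says nothing about attempts that were turned away.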

As I examined more closely, I was able to find more delayed e-mails from external users whose e-mails were not routed through Eircom, so logically this meant the problem wasn’t specific to Eircom.

As a test, I had some of the users who reported delays e-mail my employer’s e-mail address whilst CCing my personal Gmail address. Eventually the issue reproduced itself.

Mails arrived in my Gmail inbox within a couple of minutes, whilst the copies destined for my employer’s e-mail address arrived a number of hours later. In the end I was able to reproduce the issue with the help of four separate senders.

Convincing the Security team was a problem, however, as everything looked fine on our end. They had a hard time accepting that the issue was on our side, even though I could reproduce the problem with four separate senders. Four separate external mail servers behaving in the same way at the same time, and the problem is not on our end?

They needed more evidence…

So the next port of call was Eircom. They were able to provide the following logs (IP addresses and e-mail addresses removed):

2010-07-13 17:22:19.744009500 starting delivery 2146375: msg 3317173 to remote myname@mycompany.ie
2010-07-13 17:28:59.581941500 starting delivery 2147602: msg 3317173 to remote myname@mycompany.ie
2010-07-13 17:49:00.077173500 starting delivery 2150674: msg 3317173 to remote myname@mycompany.ie
2010-07-13 18:22:19.009197500 starting delivery 2158998: msg 3317173 to remote myname@mycompany.ie
2010-07-13 19:08:59.078851500 starting delivery 2171856: msg 3317173 to remote myname@mycompany.ie
2010-07-13 20:08:59.270208500 starting delivery 2188502: msg 3317173 to remote myname@mycompany.ie
2010-07-13 21:22:20.122931500 starting delivery 2203875: msg 3317173 to remote myname@mycompany.ie
2010-07-13 22:48:59.138197500 starting delivery 2224308: msg 3317173 to remote myname@mycompany.ie
2010-07-14 00:28:59.064738500 starting delivery 2244997: msg 3317173 to remote myname@mycompany.ie
2010-07-14 02:22:19.239298500 starting delivery 2261581: msg 3317173 to remote myname@mycompany.ie
2010-07-14 04:28:59.989719500 starting delivery 2279211: msg 3317173 to remote myname@mycompany.ie
2010-07-14 06:48:59.004682500 starting delivery 2299239: msg 3317173 to remote myname@mycompany.ie
2010-07-14 09:22:19.021063500 starting delivery 2342483: msg 3317173 to remote myname@mycompany.ie
2010-07-14 12:08:59.033052500 starting delivery 2428140: msg 3317173 to remote myname@mycompany.ie
2010-07-13 17:22:19.744009500 starting delivery 2146375: msg 3317173 to remote myname@mycompany.ie

2010-07-13 17:24:09.861329500 delivery 2146375: deferral: Connected_to_<IPAddress>_but_sender_was_rejected./Remote_host_said:_451_#4.1.8_Domain_of_sender_address_<3rdparty@theircompany.ie>_does_not_resolve/

From the above you can see that Eircom’s gateway had attempted to deliver the mail in question numerous times across two days, but each attempt was deferred because our server reported that the domain of the sender’s e-mail address could not be resolved.
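If you are handed a log like this, a few lines of Python can pull out the retry attempts for one message and show the gaps between them. This is only a sketch: it assumes the line layout seen in the excerpt (date, time, "starting delivery", delivery number, "msg", message number), and the function names are my own. The sample uses the first three attempts from the log.

```python
from datetime import datetime

# First three attempts from the log excerpt.
SAMPLE_LOG = """\
2010-07-13 17:22:19.744009500 starting delivery 2146375: msg 3317173 to remote myname@mycompany.ie
2010-07-13 17:28:59.581941500 starting delivery 2147602: msg 3317173 to remote myname@mycompany.ie
2010-07-13 17:49:00.077173500 starting delivery 2150674: msg 3317173 to remote myname@mycompany.ie
"""


def parse_attempts(log_text, msg_id):
    """Return the time of every 'starting delivery' line for msg_id."""
    attempts = []
    for line in log_text.splitlines():
        parts = line.split()
        # Assumed layout: DATE TIME starting delivery N: msg ID to remote ADDR
        if len(parts) < 7 or parts[2:4] != ["starting", "delivery"]:
            continue
        if parts[5] != "msg" or parts[6] != msg_id:
            continue
        # Drop the fractional seconds; whole seconds are precise enough here.
        stamp = f"{parts[0]} {parts[1].split('.')[0]}"
        attempts.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
    # De-duplicate repeated lines and put the attempts in time order.
    return sorted(set(attempts))


def retry_gaps_minutes(attempts):
    """Return the gap between consecutive attempts, rounded to whole minutes."""
    return [
        round((later - earlier).total_seconds() / 60)
        for earlier, later in zip(attempts, attempts[1:])
    ]


if __name__ == "__main__":
    attempts = parse_attempts(SAMPLE_LOG, "3317173")
    print(len(attempts), "attempts, gaps in minutes:", retry_gaps_minutes(attempts))
```

Steadily growing gaps are a sign that the sending server is backing off and retrying, which is exactly the information the headers of the delivered mail never showed.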

So I asked the Security team to investigate whether any work had been carried out on our DNS infrastructure (or whether there had been any failures) on the 13th that would prevent the Ironport from performing a DNS lookup.

They found that one of the DNS servers our Ironport server was actively using for sender verification was restarting intermittently. While it was unavailable, our Ironport could not resolve the sender’s domain and so turned away the external mail server’s delivery attempts with a temporary error.

They didn’t have status reporting in place for the DNS servers to report any unexpected events.
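As a rough idea of what a basic check could look like, the Python sketch below sends a plain A-record query straight to one DNS server over UDP and reports whether it answered without an error. It is not what our team ended up using. The query ID, the function names and the addresses in the usage comment are all made up for illustration, and a real setup would run this on a schedule from a proper monitoring tool and raise an alert on failure.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary ID used to match the reply to our query


def build_query(domain, query_id=QUERY_ID):
    """Build a minimal DNS query packet asking for the A record of domain."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)


def reply_is_ok(reply, query_id=QUERY_ID):
    """Return True if reply matches our query ID and carries RCODE 0."""
    if len(reply) < 12:
        return False
    reply_id, flags = struct.unpack("!HH", reply[:4])
    return reply_id == query_id and (flags & 0x000F) == 0


def check_dns_server(server_ip, domain, timeout=3.0):
    """Query one DNS server directly; False on timeout, error or bad reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(domain), (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return False
    return reply_is_ok(reply)


# Example (hypothetical addresses from the documentation range):
# for ip in ["192.0.2.53", "192.0.2.54"]:
#     print(ip, "ok" if check_dns_server(ip, "example.com") else "FAILED")
```

Checking each server individually matters here, because a resolver that fails over to a healthy server would hide an intermittent fault like the one we had.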

Moral of the story – don’t rely on e-mail headers to judge mail delivery attempts (they only indicate successful connections) and make sure you have status reporting in place.

~ by Martin on June 14, 2011.

Security, Troubleshooting

