Sunday, February 12, 2012

Troubleshooting The Windows SMTP Server

Why Isn’t My E-mail Getting There?

Email is a very common feature, but a hard one to troubleshoot.
No one gets a high-five for implementing an email feature. I’ve tried it before; it’s impressively devoid of gratitude.
Email is one of those features that’s just supposed to work, and very little thought is given to how difficult it really is to get right. Even if users didn’t expect it to be instantaneous and scale to legendary sizes, they definitely wouldn’t hand out accolades for all the troubleshooting that email brings to your door.
Email is as complicated as ever, and requires a ton of different systems to be in sync just for a single message to get through. Components like:
  • Application Settings
  • SMTP Server auth/relay settings
  • DNS (appropriate MX and A records, and potentially PTR and SPF), both internally and externally
  • Email client and server spam filtering (text processing, attachment blocking, blacklisting, etc…)
…just to name a few, often present issues, and ensuring they’re correct often crosses multiple disciplines and sometimes organizations. The broad domain of knowledge required to diagnose email issues can cause troubleshooting to drag on.


Fortunately, most troubleshoots fall into one of three categories:
  1. Application/SMTP Server Issues – These are problems between the application and the SMTP server. Usually there’s a stack trace or some other very visible error code thrown by the application or recorded in the Event Viewer/SMTP log files. These are for things like:
    • Unable to authenticate
    • Unable to relay
    • Email is invalid (one of its fields breaks the RFC)
    • etc… There’s a bunch of them, but they’re usually pretty easy to troubleshoot.
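If the application happens to send mail through Python's standard `smtplib`, the errors in this first category surface as distinct exception types. Here's a minimal sketch that maps them onto the rough buckets above. The function name `classify_smtp_error` and the label strings are made up for this illustration; only the `smtplib` exception classes are real.

```python
import smtplib


def classify_smtp_error(exc):
    """Map an exception raised while sending mail to a rough category.

    The labels are illustrative only; adjust them to your own triage notes.
    """
    if isinstance(exc, smtplib.SMTPAuthenticationError):
        return "unable to authenticate"
    if isinstance(exc, smtplib.SMTPRecipientsRefused):
        # exc.recipients maps each refused address to a (code, message) pair.
        # A relay restriction is one common cause, but not the only one.
        return "recipient refused (check relay settings)"
    if isinstance(exc, smtplib.SMTPSenderRefused):
        return "sender refused"
    if isinstance(exc, smtplib.SMTPDataError):
        return "message rejected (possibly malformed)"
    if isinstance(exc, (smtplib.SMTPConnectError, smtplib.SMTPServerDisconnected)):
        return "cannot reach SMTP server"
    if isinstance(exc, smtplib.SMTPException):
        return "other SMTP error"
    if isinstance(exc, OSError):
        # Plain socket-level failures: refused connections, timeouts, DNS errors.
        return "cannot reach SMTP server"
    return "not an SMTP error"
```

In practice you would wrap the `sendmail` call in a `try`/`except` block and log the returned label next to the original exception, so the stack trace is still available when you need the details.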
  2. Sending SMTP/The Internet/Destination SMTP – This is where things get a little bananas. A lot of SMTP servers aren’t great at logging (ahem, all versions of IIS to date), and it’s hard to get feedback on where things are going sideways.
This is also where a lot of developers start to feel out of their element. Checking DNS records and relating them to email behavior has traditionally been a core IT task. This is where tools like SMTPDiag save the day. Weighing in at 79 KB, this tool can be a lifesaver that stops you from needing to memorize telnet “ehlo” commands and other awkward SMTP command syntax.
SMTPDiag runs a variety of tests, like DNS lookups, and then goes through the motions of sending an email (without actually sending one) to every server that has an MX record. This usually tests:
  • DNS – Both the sending and receiving domains have MX and A records.
  • Reachability – Both the sending and receiving domains’ email servers are reachable.
  • Accessibility – It connects to the SMTP servers and goes through the motions of sending an email from/to the given addresses.
Here’s an example of it running in verbose mode, with the sender address first and the recipient address second:
SmtpDiag.exe "" "" /v
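If SMTPDiag isn't at hand, the "go through the motions" part can be roughly approximated with Python's standard `smtplib`. This is a sketch of the general idea, not a description of how SMTPDiag works internally. The function name `dry_run_smtp` and its parameters are hypothetical, and it assumes you already know which mail host to probe (the standard library has no MX lookup, so that step is left out).

```python
import smtplib


def dry_run_smtp(host, sender, recipient, port=25, timeout=10.0, helo_name=None):
    """Walk an SMTP conversation up to RCPT TO, then reset without sending DATA.

    Returns a list of (step, ok, detail) tuples, one per step attempted.
    """
    results = []
    try:
        with smtplib.SMTP(host, port, local_hostname=helo_name, timeout=timeout) as server:
            results.append(("connect", True, "connected"))
            steps = (
                ("ehlo", server.ehlo),
                ("mail from", lambda: server.mail(sender)),
                ("rcpt to", lambda: server.rcpt(recipient)),
            )
            for name, command in steps:
                code, reply = command()
                ok = 200 <= code < 300
                results.append((name, ok, f"{code} {reply.decode(errors='replace')}"))
                if not ok:
                    break  # later steps are meaningless once one is refused
            server.rset()  # abandon the transaction; no message body is ever sent
    except (smtplib.SMTPException, OSError) as exc:
        results.append(("error", False, repr(exc)))
    return results
```

The first failing step tells you roughly where to look: a failure before `connect` points at the network or DNS, while a refused `rcpt to` points at relay or recipient policy on the server you reached.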
Just today the utility helped me realize that some emails were stuck in my Windows 2008 SMTP server queue not because my SMTP server was set up incorrectly, but because my ISP was blocking outgoing port 25 traffic. My only regret is not running the thing earlier in my troubleshoot (I started at my application and SMTP server).
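A blocked outbound port is cheap to rule out before digging into application or server settings. Here's a minimal sketch using Python's standard `socket` module; `check_tcp_port` is a name invented for this example, and the host you pass in (typically the destination domain's mail server) is yours to supply. A silent block often shows up as a timeout rather than a refusal, though that depends on how the ISP filters traffic.

```python
import socket


def check_tcp_port(host, port=25, timeout=5.0):
    """Try a plain TCP connection and report how the attempt ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # Nothing answered at all; consistent with traffic being dropped en route.
        return "timeout"
    except ConnectionRefusedError:
        # Something answered and actively rejected the connection.
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar.
        return f"error: {exc}"
```

Running the same check from a machine on a different network is a quick way to tell whether the problem is the destination or your own connection.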
  3. Destination Email Client/Server – Assuming both of the above check out, you most likely have something going on in the user’s email client or on the destination email server (read: spam filtering), which also means it’s officially not your fault.
I hope the above helps someone. These troubleshoots can often be a nightmare, and there’s nothing like a good tool to help you wake up.
Good Luck,
