Friday, October 13, 2017

Exchange post-reboot emailed status confirmation script

10/13/2017: If you've got a lot of servers to keep track of - whether doing maintenance (patching), or experiencing the inevitable random reboots/unscheduled-virtualization-burps/what-have-you - you want to know the full status of the system as soon as possible after it comes back online. To that end, this script is the current rev of a series I've been using for that niche since 2007 or before.

In this case, it's the Exchange-specific version. I generally configure it as a Scheduled Task triggered by the OnBoot (at-startup) event (with a fire-delay for Win2012R2+). The script logs a transcript file & emails an html-format report, covering the following tests & status:
  1. Runs an automatic pause/delay when run on OSes older than Win2012R2 (to give services time to start up and come online naturally).
  2. Locates a suitable local HubTransport role box to handle smtp report submissions. 
  3. Collects & reports on the state of key Exchange (2010) services by role.
  4. If it finds a key service in a non-Running state, it then makes up to 20 attempts (configurable) to start each failed service.
  5. If it does attempt a manual service start, it then re-confirms the post-start status of the services.
  6. It then runs a stock Test-ServiceHealth (redundant with the above, but more descriptive of role-specific status, with its handy divided status layout).
  7. For mailbox servers it then:
    1. Runs & reports on Test-ReplicationHealth
    2. Runs & reports on Get-MailboxDatabaseCopyStatus
    3. Collects a list of all suitable HubTransport servers and checks them for mail queued to the mailbox databases. In my case, the servers are named in a matchable pattern, so I just go after the names.
      It's also simple to collect them via something like...
      get-exchangeserver |?{$_.IsHubTransportServer -and $_.site -like '*Sitename'}
      ...as well.
    4. Runs & reports on a pass of Test-MAPIConnectivity against a random mailbox in each database in the DAG.
  8. And for Transport servers, it runs & reports on the status of all queues (get-queue) on the box. 
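
Steps 4 & 5 above (retry a failed service start, then re-confirm) boil down to a bounded retry loop. Here's a minimal sketch of that idea, not the code from the actual script: the function name Invoke-WithRetry, its parameters and the 5-second default delay are all made up for illustration. Only the default of 20 attempts comes from the description above.

```powershell
# Hypothetical helper (not from the real script): keep running $Action until
# $Test reports healthy, or until $MaxAttempts has been used up.
function Invoke-WithRetry {
    param(
        [scriptblock]$Test,         # returns $true once the thing is healthy
        [scriptblock]$Action,       # one attempt at fixing it
        [int]$MaxAttempts = 20,     # matches the 'up to 20 attempts' default
        [int]$DelaySeconds = 5      # assumed pause between attempts
    )
    $attempt = 0
    while ((-not (& $Test)) -and ($attempt -lt $MaxAttempts)) {
        $attempt++
        try {
            & $Action | Out-Null
        }
        catch {
            Write-Warning "Attempt $attempt failed: $_"
        }
        if ($DelaySeconds -gt 0) { Start-Sleep -Seconds $DelaySeconds }
    }
    # Re-confirm the final state, and report how many attempts it took
    New-Object PSObject -Property @{
        Attempts  = $attempt
        Succeeded = [bool](& $Test)
    }
}

# Example use against a service (the service name is only an example):
#   $name = 'MSExchangeTransport'
#   Invoke-WithRetry -Test   { (Get-Service -Name $name).Status -eq 'Running' } `
#                    -Action { Start-Service -Name $name -ErrorAction Stop }
```

Passing the test & the fix in as scriptblocks keeps the loop itself free of anything Exchange-specific, so the same helper could wrap any of the checks in the list.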
The script is posted pretty much exactly the way I use it (with servernames etc subbed out), so it's not particularly 'genericized' for ready public use. But it does run through a full testing protocol and assemble html-formatted reports. Even if you can't use it as it sits, it provides some good examples of the process.
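
For a rough idea of what the html report assembly can look like, here's a small sketch built on the stock ConvertTo-Html cmdlet. It is not lifted from the script itself: the function name ConvertTo-ReportSection and its parameters are hypothetical, and the real script's layout may differ.

```powershell
# Hypothetical helper (not from the real script): render a set of objects
# as an html table fragment, with a heading above it.
function ConvertTo-ReportSection {
    param(
        [string]$Heading,    # text shown above the table
        [object[]]$Rows      # objects to render, one table row each
    )
    # -Fragment emits just the <table>, so several sections can be joined
    # into one message body; Out-String collapses the lines into one string.
    $Rows | ConvertTo-Html -Fragment -PreContent "<h3>$Heading</h3>" | Out-String
}
```

Each test's output (services, queues, database copy status, etc) would go through the helper in turn, and the joined string could then be handed to Send-MailMessage with -BodyAsHtml, pointed at whichever HubTransport box was located earlier.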

Also, I probably should add a block for CAS server reboots, to report on user-utilization levels for the various CAS IIS AppPools etc. But that's something for another day.  :^D

This is a 900+ line script, so I'm not going to bore you by displaying the full source. :P  But here's a Gist with the data-gathering bits demo'd...
The full code & current revision can always be found at Github: check-ExchSvcStatusRpt.ps1
