Office 365, ADFS, and SQL

He’s an issue I’ve just run into that there doesn’t seem to be a good answer to on the Internet. When you are building a highly available ADFS farm to enable Single Sign On for Office 365, should you use the Windows Integrated Database (WID) that comes with Windows Server or store the ADFS Configuration on a SQL server? Put simply, if you are using ADFS *only* for Office 365, there is no need to use SQL server for storing the configuration database. Here’s why.

What Does ADFS with SQL Get You?

I spent a good day or so trying to answer this question because I came in after the initial configuration of a Hybrid Exchange migration to Office 365 that was set up using SQL to store ADFS. When we went to test failover, we ran into a *mess* of problems with this configuration. So I’m going to try to explain things in a way that makes sense, rather than just telling you what Technet says in their article on the subject.

Storing your ADFS configuration in a SQL database gives you 4 things:

  1. The ability to detect and block SAML and WS-Federation Token Replay attempts.
  2. The ability to use SAML artifact resolution
  3. The ability to use SQL fail-over to increase availability of the Configuration database
  4. Scalability!!!

Using WID with ADFS prevents these features from being available to you. Here’s what each of those lines of text mean (Why no, I’m not going to just give you that without definitions like Technet does. Can you tell I’m a little grumpy about Technet’s coverage of this subject?)

SAML and WS-Federation Token Replay Attempts

This is actually a potential security vulnerability that comes about because of how Token authentication mechanisms work. In some cases, an application that accepts a federated token identity for access may allow users to “replay” a previously issued authentication Token to bypass the authentication mechanisms used to issue that Token. You can do this by ending a session with a federated application, then returning to the application with a cached version of the interaction. The easiest way to do this is to just push the Back button in a web browser. If the federated application isn’t properly designed, it will accept the cached version of the token used in the previous session and continue operating just like someone never logged out.

It should be noted that this is a serious security issue. If you have an environment where there are systems exposed to public use by multiple users (Kiosks, for example), you will want to make sure that Token Replay attempts are dropped. But here’s the thing, Token Replay is a vulnerability that affects the application you are using federated login to access. It is not a vulnerability of ADFS itself. Basically, if the applications that you are using ADFS to federate with are designed properly, there is no need to have your ADFS environment provide this kind of protection. Office 365 was designed to prevent Token Replay attempts from succeeding. So if you are only federating with Office 365, you don’t need to have this functionality in your ADFS environment. So, no need to have a SQL based Configuration Database for this reason.

SAML Artifact Resolution

I have tried for a while to find a good definition for what SAML Artifact Resolution actually is. I’m still not 100% on it, because the technical documentation for the SAML 2.0 standard is about as informative as a block of graphite. However, it seems like this is essentially a method through which both sides of a SAML token exchange can check on and authenticate one another using a single session, rather than multiple sessions. This comes in handy for situations where load balancing is in use and the application requests verification of the token provider.

The good news is that you don’t have to understand SAML Artifact Resolution. Office 365 doesn’t use it, so you don’t have to worry about having it. No SQL required for this purpose.

SQL Failover Capabilities

At first glance, having SQL server failover capabilities for your ADFS configuration database sounds like a great idea. I mean, SQL server has great failover capabilities built in. The problem is, so does ADFS. When you use WID for your ADFS Configuration database, the first server in the Farm will store a read/write enabled copy of the configuration database. Each additional ADFS server added to the farm will pull a copy of this database and store it in a read only format. So in a WID based ADFS farm, every ADFS server in the farm has a copy of the configuration database by default. ADFS Proxy servers don’t store or need access to the configuration database, so only the backend ADFS servers will benefit at all from SQL integration.

But here’s the thing, for SQL integration to work properly in a way that allows full SQL failover capabilities, you need to have a valid SQL cluster. With SQL 2008R2 and earlier, this means that you have to have at minimum 2 SQL servers installed on a version of Windows that has Cluster Services available. Cluster services requires that you have shared storage before it will create a failover cluster for SQL versions prior to 2012. If you have 2012, you can manage much simpler failover, but if you’re limited to SQL 2008R2 and earlier, you’re going to end up with a significantly higher infrastructure requirement to get SQL failover working. It you can use SQL 2012 to store the ADFS database, there is some advantage to using SQL clustering for ADFS, but the advantage doesn’t actually justify the costs of doing so.

“What about SQL Mirroring?” You ask. Well, that’s a good idea in theory. You can have your ADFS config database mirrored on two SQL servers, but you will run into some major headaches using this technique, especially if you ever need to run a failover. First off, ADFS configuration with SQL server requires that you use the SFConfig command line tool to create the config database and add servers to the farm. This is a huge pain to do, but it’s necessary. Second, if you ever have to fail over to the other SQL server for any reason, you have to reconfigure every one of the servers in the farm. SQL mirroring doesn’t utilize a clustered VIP, so the configuration with SFConfig requires you to point the server to the active SQL server. When you activate the SQL database on the other server, things may or may not work without reconfiguration. If they do work, there’s a chance that they might *stop* working at any time in the future without notice until you run SFConfig on each member of the farm and use the now active mirror’s address in the command. Then you may have to log in to each of your Proxy servers and run the Proxy Config wizard to reinitialize the proxy trust between it and the ADFS farm. That means that if you want to fail over with SQL mirroring, you have to log in to 4 servers at minimum and reconfigure each one every time the active SQL mirror changes. This is a truly unwieldy and excessive failover process. It isn’t worth the loss of expense from not using a full SQL cluster to do this for ADFS’s configuration database.

The failover abilities of SQL allow you to have constantly available access to a writable copy of the ADFS configuration database. If you use WID and the first server in the farm goes down, you cannot make changes to your ADFS configuration. No adding new servers, no modifications to the ADFS functions, etc. The good news is that the only time you need access to a writable copy of the database is when you are adding new servers to the farm or making modifications to ADFS functions. If you’re using Office 365, you almost never have to make changes to either of these once its set up and running. ADFS will continue operating as long as even one server in the farm is accessible without a writable copy of the database. So realistically, you don’t need SQL failover capabilities for ADFS’s configuration database.

Scalability!!!

Scalability is one of my favorite buzz words. Scalability means “I can add more servers to handle this workload without any problems.” And SQL allows you to do that. However, WID integrated ADFS allows you to do so as well. The limitation, though, is that you can only have up to 5 servers in a WID based ADFS farm (NOTE: Proxy servers don’t count against this maximum number). “Only five servers!?” you may ask in heated exasperation. “Yes!” I answer. “Well, I can’t stand limitations! I will use SQL!” you respond. “You will never need more than 4 ADFS servers in ever!” I retort. ADFS is one of the most light-weight roles available for Windows server. I have deployed ADFS on numerous environments with user counts well past the 10,000 mark and the ADFS servers almost never measure a blip on the performance monitors. This is because ADFS does three things, it provides a website for users to log in to, it queries Active Directory to authenticate users, and it generates a small token that contains less than 1KB of data that it then sends to a remote service. This entire process requires about 10 seconds per user per session, and that includes about 9 seconds of someone typing in their credentials. So there isn’t really even much need to have load balancing capabilities on ADFS. One server can handle monstrous loads. If you have an environment with 100,000 users accessing numerous federated applications, you might need to add 1 or 2 extra servers to your farm to handle the load, but I doubt it. You’re better off just adding RAM and processing power to your ADFS servers to improve performance than adding extra servers to the farm.

If you want multi-site high availability, you may need 5 servers, but probably not more. Let’s say you want to have site resiliency in your ADFS configuration as well as failover capability in your primary and secondary site. This configuration requires a minimum of 4 ADFS servers. Two for the primary site, two for the failover site, and 4 proxy servers that don’t count against the maximum of 5 servers. There you have it, full site-resiliency and in-site failover capability using less than 5 servers. You can even add a third site to the farm without having any issues if you want using the remaining server. But let’s be honest here, if there is an emergency that takes down two geographically dispersed datacenters at the same time, you’re going to be looking for the nearest shotgun to fight off zombies/anarchist bandits, not checking to make sure your ADFS farm is still working, so triple hot-site resiliency for ADFS is probably more than anyone needs. So you don’t need SQL for ADFS if you want scalability. It’s scalable enough without SQL.

EXPLOSIONS! I mean Conclusions

If you are planning to build an ADFS farm for Single Sign On with Office 365, you will ask the question, “Do I need SQL?” The answer to that question, in pretty much every possible case is “Not if you’re just going to use it for Office 365.” SQL integrated ADFS configurations add some features, sure, but there is almost no situation where any of those features must exist for ADFS to work with Office 365. So don’t bother even trying it out. It adds an unnecessary level of complexity, cost, and management difficulties and give no advantages whatsoever.

Advertisements

Should I Switch to Office 365? A Frank Examination

The Cloud – An Explanation

As time moves on, technology moves with it, and times they are a’changin’! There have been many drastic changes in the world of IT over the years, but the most recent change, the move toward cloud computing, is probably one of the most drastic and industry redefining change to occur since the release of HTTP in the early 1990s.

Cloud computing is, put simply, placing your IT infrastructure into the hands of a third party, and it’s becoming big news for companies like Microsoft, Amazon, and Apple, who are working hard to push the IT world into the Cloud so they can take advantage of the recurring income model that their Cloud systems are built around.

A more complex explanation of Cloud Computing almost always requires a metaphor or analogy of some kind, so here’s mine; Cake. I love cake. Everyone loves cake. If you don’t love cake, you’re crazy and you should give your cake to me. But there is a problem with cake. If you live alone, you can never really have cake, because in order to have cake you usually have to buy or make a whole cake, and if you have a whole cake to yourself, you will very quickly regret having purchased a whole cake for yourself as you roll yourself out of bed and out the door each morning on your gigantic rolls of fat. Then you’ll have a heart attack. Cake is meant to be shared. One cake is enough to allow 12 people (or more, or less, depending on how much they love cake) to have a comfortable amount of cake. This is good for everyone because they all get cake and aren’t fat because of it. So, how is the Cloud like cake? Well, now that computing technology has advanced to the point where almost every computer system provides more power than is necessary for most tasks in the corporate environment, buying and building a complex IT infrastructure that meets all the needs of a specific company can be extremely expensive, and it’s highly likely that much of that infrastructure is going to be wasted because it is more than is needed.

Virtualization was the first step to addressing the concerns of excessive computing power. It allowed IT departments to combine multiple server roles, securely segregated, on a single physical server. Prior to the advent of Virtualization, secure segregation of server roles required numerous physical servers, which in turn required a lot of space, power, and resources to maintain. Virtualization started shrinking the corporate Datacenters of the world, and the concept of Cloud computing seeks primarily to not only shrink the corporate datacenter, but to centralize it

Cloud computing, like a giant cake, allows multiple corporate environments to share a single, gigantic infrastructure system. Rather than each company having their own segregated and wholly owned infrastructure that is managed, configured, and maintained separately, Cloud solutions like Office 365 seek to provide all the services of a highly integrated and functional environment without requiring a fully dedicated infrastructure for each company that needs or wants that type of functionality. This is accomplished through the use of highly customized versions of products that are normally available to individual corporations for a one time investment in a packaged, much smaller recurring fee system. The cloud provider builds, maintains, and supports the Infrastructure and the cloud user makes use of the system just like it was owned by them. In the case of Office 365, every individual or company that uses the solution is effectively lumped into the same infrastructure system and uses the portion of that system that they need, rather than using a portion of their own system and wasting the portion they don’t use. This is handy because one of the worst things in the world is stale cake. I mean unused IT resources.

Getting to it – What Does Office 365 Offer

Office 365, being Microsoft’s Cloud solutions environment, provides most of the IT services that companies depend on Microsoft for. Specifically, Email and Calendaring (Through Exchange), Collaboration (Through Sharepoint and Exchange), Instant Messaging (Through Lync), and centralized file management and storage through OneDrive (Formally SkyDrive, formally something else. Microsoft changes the names of stuff every week it seems, so we’ll just call this cloud storage). If you wanted to compare the Infrastructure requirements you would need to meet what Office 365 provides for its monthly per-user fee (or a la cart if you don’t want all the services together) you would need the following things in your environment:

Exchange Online:

  • At minimum 3 Exchange 2013 servers configured with DAG
  • A hardware load balancer
  • A high-speed Fiber-channel SAN with about 55GB of storage per user (For normal Mailboxes) and an additional 5GB for each resource mailbox (Rooms, equipment, shared calendars, etc.).
  • An infinitely expanding low speed SAN (For archive mailboxes, for E3 licenses and above)
  • A secure email delivery solution to provide email stubbing (for E3 licenses and above)
  • Spam filtering services or a spam appliance

Sharepoint Online:

  • At minimum 2 Sharepoint 2013 Servers
  • Additional high-speed Fiber-Channel space, up to 50GB per user (again…this is space for OneDrive/SkyDrive/whatever they call it next)
  • More Load Balancing

Lync Online:

  • At minimum 3 Lync 2013 servers

Generic Software Infrastructure

  • Several Windows 2008/2012 Licenses
  • Active Directory (2 DCs minimum)
  • Multi-Tiered SAN infrastructure (with multi-site geo-replication capabilities)
  • A Load Balancer
  • Highly secured Firewalls

Software

  • Office Professional Plus license for each user

Physical Requirements

  • Multiple Physical locations spread across numerous geographical regions.
  • Each Physical Location should have Concrete security walls, entry barriers, full time security staff, man-traps, and multi-factor authentication before admittance

And that’s just for basic service. You would also need several employees dedicated to maintaining the infrastructure and supporting the environment, since it is very difficult to find individual IT personnel who have the skill set or mental constitution necessary to manage such an environment.

In other words, if you were to build the infrastructure necessary to provide the same functionality and level of service available with Office 365, you would need to spend several hundred thousand dollars in hardware, software, and manpower. This also does not take into account Architecting and development costs for setting up the environment, which is typically done by third party companies or contractors.

Normally, individual companies would need to spend this kind of money every time they upgrade their infrastructure to keep up with new features and changes in technology. But with Office 365, upgrades, patching and new services/features are released regularly, with no need to manage a patching system. All in all, using Office 365 can represent a significant cost savings for companies that need high availability, accessibility, scalability, and security in their IT infrastructure.

Why Wouldn’t I Use Office 365

If it costs a whole lot less and provides a lot of great features, why would you not want to use Office 365? The answer here is that cloud solutions are designed to meet the needs of the average environment, not every environment. As the cloud begins to mature (it’s really just an infant right now, so don’t be surprised if it tosses Cheerios across the room every now and then) it will become much more customizable, but for now, there is very little customization that can happen with Office 365. For instance, any Line of Business application you have in your environment that must me installed on an Exchange Server is not usable. Many software providers that require this type of functionality are moving their focus to cloud based solutions now, but things like Blackberry Enterprise Server will not work with Office 365. As I mentioned, most companies are building systems that integrate with Office 365, for instance Research in Motion has teamed with Microsoft to build in support for BES features into Office 365. This support has to be activated, though. But there are several things that simply can’t be done with Office 365. I’ll try to provide some of the limitations here. For a more detailed explanation of what Office 365 *can* do, check out the official service descriptions available from Microsoft.


  1. Microsoft limits email restorations to 14 days. If an email is deleted from a Mailbox, the Deleted Items folder, and then purged from the hidden recoverable items folder, you will only have the ability to recover that email for 14 days after being fully purged. Within the 14 day window, a support request to Microsoft is required to recover the email. Outside that Window, there is no possible way to recover it. I should mention that this is a technical limitation of the Exchange Online service. Exchange Online utilizes a 14 day lagged, 3+ copy DAG configuration. This configuration allows Microsoft to use Circular Logging on their databases to reduce resource usage, and allows an extremely high level of availability. However, the maximum amount of time that any DAG member can be lagged is 14 days. Once a full email purge is fully committed to the lagged database copy, there is no way to recover it.
  2. Office 365 does not allow unauthenticated email relaying. In order to send email to Office 365, you *must* authenticate with a licensed user account. If you have Line of Business applications or equipment that doesn’t support unauthenticated email, consider upgrading your version to one that supports authenticated relaying. If this is not possible, it is necessary to utilize an SMTP server in your network that supports unauthenticated relay, such as PostFix or the IIS based SMTP server that is included with Windows Server.
  3. Exchange Online has strict limits on the amount of email that each mailbox can send. This is to prevent spamming from Office 365’s mail servers and reduce resource overhead. Each mailbox is allowed to send to at most 10,000 recipients per day (a recipient is considered to be a single email address listed in the To:, CC:, or BCC: fields of an email, so a distribution list can be used to count as a single recipient). In addition, each email can be sent to up to 500 recipients and each mailbox can only send up to 30 messages per minute. Microsoft will not increase these limits even if you ask. If you have a business need to exceed these limitations, consider using a cloud based mass-emailing service like MailChimp.
  4. Many of the administrative capabilities and controls that are available with an On-Premise Exchange, SharePoint, or Lync environment are not exposed to Office 365 tenant administrators. If there is a control that you would like to enable or use and you can’t find it in the Admin Portal, you may be able to do it in the Remote Powershell sessions provided by Microsoft. Even with Powershell, there is only so much you can do. There are 4 different Modules for Managing Office 365 in Powershell, Exchange Online (with some additional things), Lync Online, Windows Azure AD, and SharePoint Online. As a general rule, though, administrative settings that control server level or organizational level functions will not be available to you. If you have a business need to change a high level configuration, you must prepare and submit a Support Request to Microsoft through the Office 365 Admin Portal. This is a fairly simple process, but support requests can take a significant amount of time to complete, so be prepared to wait for your changes to apply.
  5. If it breaks, there’s nothing you can do to fix it in most situations. Unless you are using ADFS and Dirsync in your environment, bringing Office 365 services back online if something fails is completely out of your wheelhouse. If you do use ADFS or Dirsync, the only thing you can really troubleshoot and fix is Active Directory Object syncing or Login issues. Everything else false under Microsoft’s SLAs and is their responsibility to fix. Microsoft guarantees a minimum service up-time of 99.9% (or 43 minutes acceptable downtime per month). The SLA documents provide exact details on service credits granted for falling below that level, but if your IT management has decided that a higher level of up-time is required, Office 365 may not be a good solution for your environment.

There are a number of other limitations to Office 365’s service, so many, in fact, that it would be difficult to outline them all with a small book. Because of that, Microsoft (and I) typically recommend a short Proof of Concept (PoC) period before migrating to the service. A PoC will highlight errors in your system configuration that will negatively impact interoperability with Office 365 and make sure that your environment and business needs can be completely met with Office 365.

I Work in IT – How Will This Impact My Job

One of the greatest causes of push-back from moving to the Cloud comes from IT staff. Most IT people are justifiably twitchy when it comes to keeping their jobs. There are a lot of people competing for IT jobs and one of the major selling points of Cloud Services is decreased employment costs. Add that to the fact that the first group to get targeted for layoffs during a recession is the IT department and questions about how this will impact IT workers becomes a greatly valid question. The truth is that while movements to the cloud will decrease the need for dedicated IT staff in most companies, it also increases the need for IT staff at datacenters and consulting firms. Skilled IT people are in short supply, so keeping up with the technology trends and times is very important. Learning about the cloud and understanding it will keep you from being unemployed for extended periods (I speak from experience on this, I promise).

That said, large environments that maintain IT staff will still need to keep a significant portion of their IT workers even if they move to the cloud. Microsoft and other cloud solutions do not provide End User Support so there will always be a need for Help Desk and on-site support staff. In addition, companies that migrate to the cloud will continue to need IT staff that can interface with Cloud Services Support Staff. A single support request with Microsoft should convince most of the difficulty that exists in managing support requests and maintaining lines of communication during outages and system failures. There will also still be a need for Systems and Network Administrators (In fact, with Cloud services Network Administrators may be in even higher demand as Internet Connection Up-time becomes more important). In all reality, on-site IT staff will still be very necessary, but the nature of the job will begin to change as more cloud services become available. Instead of fighting fires and panicking about system failures and inefficiencies, IT staff can focus on developing processes and making non-cloud based services work better. Cloud services make the typical IT employee’s job easier, not less necessary.

What is Available in the Cloud Besides Office 365

The primary principal of Cloud Computing is that the equipment that runs your infrastructure no longer exists in your physical locations. There are a lot of ways this is accomplished. Terms like Private Cloud are bandied about by Marketing teams with wild abandon without a really concrete definition. Ask three different salesmen what a Private Cloud is and you’ll probably get three completely different answers (Or some really blank stares. Salesmen. *eyeroll* Am I right?). But essentially, you’re in the cloud if you have to have a Public Internet connection to reach your resources. If you have a dedicated MPLS-like connection to a datacenter and someone else manages, maintains, and updates those systems for you, this is not operating in the Cloud. Some people will call this a Private Cloud, but the real term used to define this type of relationship is Managed Services (since that’s what that kind of relationship was called well before the term Private Cloud was coined).

At any rate, cloud services can range from Solutions like Office 365 and Google Apps for Business to things like Drop-Box and Imgur. For IT purposes, some additional services that may be useful include Microsoft’s Infrastructure as a Service (IaaS) Azure solution. Azure allows you to create entirely cloud based Virtual Machines in Microsoft’s datacenters that can be acessed from anywhere. Amazon’s AWS provides a number of services for Web Based businesses to perform necessary functions. Google’s Apps for Business provides similar functionality to Office 365 but with (in my opinion) less polish.

Other Considerations

Migrating to the cloud is a difficult decision to make. There are pros and cons just like any other business decision. Take some time to understand what is involved in moving to the cloud and make sure to plan for life after moving if you decide to do so. One of the best recommendations I can make for people considering a move like this is to contact a company that specializes in cloud services (Like Business and Decision North America, the company I work for. How’s that for shameless plugs?). Aside from being able to explain things in greater detail and help plan for moving to the cloud, using a Microsoft Partner to assist in your move will open up options for quick escalations and better communication with Microsoft’s Support teams in the event of problems. Companies that do not have a partner to assist them must deal with Microsoft’s Office 365 support teams themselves, and this can take up to 2 days to receive the first response , depending on their workload and day of the week (never make a support case at 4PM on Friday. Just a friendly tip). If you work with a Microsoft Partner, this response time can be decreased to the amount of time it takes for someone in Redmond to get their butt kicked (metaphorically speaking). All in all, the world is just now beginning to move into the cloud, so now is the time to begin preparing for the inevitable.

Removing Addresses from an Exchange 2007/2010/2013 Server

This is probably a rare issue, but something I’ve come across in my work. Occasionally an Exchange Administrator may need to remove an Email address domain (The part of the email address that comes after the @ sign). For instance, you may be in a situation where a portion of the users in an Exchange environment are migrated to a Cloud based email solution. This can be a little tricky because even if you remove the email address domain from your list of Accepted Domains in Exchange, the addresses may remain on users’ mailboxes. In this post, I’ll explain the process of removing email domains from an Exchange Server in the proper order.

Step 1 – Remove Address Policies that Use the Domain

Before you can actually remove an accepted domain from Exchange, you have to make sure there are no Address Policies that assign email addresses to users that utilize that accepted domain. In Exchange 2007 and 2010, you can do this by opening EMC (Exchange Management Console) and navigating to Organization Configuration>Hub Transport. Clicking the Address Policies tab will allow you to view the address policies in place. You should then remove any policies that define addresses based on the Email Address Domain you want to remove.

In Exchange 2013, you would open the Exchange Admin Center and navigate to Mail Flow>Email address policies, then modify or remove any policies that include the offending Email Address Domain.

Step 2 – Remove the Domain from the list of Accepted Domains

This step is pretty self-explanatory. In this situation we just remove the domain from the list of accepted domains on the Exchange server. This will tell the Exchange server not to accept emails destined for that domain. This can be done from the same location in EMC for Exchange 2007/2010, and from the Mail Flow system in Exchange 2013 by clicking on Accepted Domains, and then right clicking on the domain you want to remove. Selecting delete will remove that domain.

Step 3 – Remove Email Addresses

This part can be a little tricky. Removing the email address policies won’t necessarily remove the email addresses that users have from their accounts, and if those addresses remain you could still end up having mail go places you don’t want it to. Resolving this issue requires some work with PowerShell in the Exchange Management Shell (EMS).

After the Email Domain is removed, open EMS and run the following command:

get-mailbox | where {$_.emailaddresses -like “*domain.com”}

Replace domain.com with whatever domain you’ve removed. This will give you a list of all the users that have one or more email addresses attached to their domain that match the domain you’ve removed. If there are none, you’re done. If there are some mailboxes with the domain attached, you’ll want to run the following script to remove them:

$users = get-mailbox | where{$_.emailaddresses -like “*domain.com”}
foreach ($user in $users)
{
$addresses = (get-mailbox $user.alias).emailaddresses
$fixedaddresses = $addresses | where {$_.proxyaddressstring -notlike “*domain.com”}
set-mailbox $user.alias -emailaddresses $fixedaddresses
}

This will reset the email addresses on the account.

Internal DNS and Exchange Autodiscover

The Issue

By now, anyone who has managed, deployed, or worked with an Exchange 2007 or later environment should be familiar with Autodiscover. If you aren’t yet, I’ll give a short Explanation of what it is and how it works.

Autodiscover is a feature that allows any Mail Client that supports Autodiscover to configure the appropriate server settings for communication so you don’t have to input everything manually. It’s very handy. Unfortunately, you can end up with a lot of headaches related to Autodiscover when you start having to deal with Certificates. The issues you may run into are specifically limited to Exchange Organizations that have a Domain Name that uses a non-public TLD like domain.local, or a public domain name that they don’t actually own and can’t use externally as well. On an unrelated note, this is one of the reasons that Microsoft has started recommending the use of Public domain names for Active Directory domains.

If you have a domain that isn’t publicly useable on your Exchange AD environment, you will run into certificate errors when mail clients use Autodiscover. This becomes particularly problematic when you use Exchange 2013 and try to use HTTPS for Outlook Anywhere. This is because Microsoft is now enforcing certificate validity with Exchange 2013’s Autodiscover features (Note, though, that Outlook Anywhere will be configured to use HTTP only when your Exchange Server certificate is determined to be invalid in Exchange 2013). With Exchange 2007 and 2010, you will get a Certificate error every time you open Outlook. Generally, this error will state that the name on the certificate is not valid.

The Cause

To solve the issue with certificates, you need to configure your environment so it enforces the appropriate action with Autodiscover. By default, Autodiscover will attempt to communicate with a number of URLs based on the Client’s email address (for external users) or domain name (for internal users). It will take the following pattern when checking for Autodiscover services:

1. Autodiscover will attempt to find the Autodiscover configuration XML file at the domain name of the SMTP address used in configuration (because internal domain computers configure themselves automatically by default, this matches the Internal Domain. For example, the first place autodiscover looks is https://domain.com/Autodiscover/autodiscover.xml for external addresses. Change domain.com with domain.local for what Exchange looks for on Internal clients.

2. If the autodiscover record is not found at domain.com/domain.local, the server will attempt to connect to https://autodiscover.domain.com/Autodiscover/Autodiscover.xml (replace domain.com with domain.local for internal). This is why the typical recommendation for having an A Record for Autodiscover in your DNS that points to the mail server exists. In addition, you would need to have autodiscover.domain.com as a SAN on the SSL certificate installed on the Exchange server for it to be valid when attempting to connect to autodiscover using this step.

3. If autodiscover information cannot be found in either of the first two steps, Exchange will attempt to use a Service Locator record in DNS to determine the appropriate location of the configuration files. This record points the Autodiscover service to a specific location for getting the configuration it needs.

Because of the way this works, there is some configuration necessary to get Autodiscover working correctly. This usually involves adding Subject Alternate Names to the SSL certificate you use for your Exchange Server to allow the many host names used to be authenticated with the certificate.

The problem lately, though, is that many Third Party Certificate Authorities that provide SSL certificates are beginning to deny requests for Subject Alternate Names that aren’t publicly available (There are valid security reasons for this that I won’t go in to in this post, but maybe later). As a result, you won’t be able to get a valid SSL certificate that allows domain.local as a SAN. This means that the automated steps Exchange uses for Autodiscover configuration will always fail on an Internal domain with a name that is not publicly accessible or not owned.

The Solution

IMPORTANT NOTE: This particular solution only applies to computers on your network that are *not* added to the domain. Domain-joined computers have a different solution to work with. Please read my article on resetting the Active Directory SCP for resolving Autodiscover issues like this on domain-joined computers.

There are actually two ways to solve the certificate issues, here. The first would be to prevent Outlook from automatically entering a user’s information when they create their profile. This will result in more work for you and your users, so I don’t recommend it. The other solution is to leverage that last step of the Autodiscover configuration search to force it to look at a host name that is listed on the certificate. This is actually fairly simple to do. Follow these steps to configure the Service Locator record in your internal domain.

  1. Open the DNS manager on one of your Domain Controllers.
  2. Expand out the management tree until you can see your Internal Domain’s Forward Lookup Zone. Click on it, and make sure there are no A records for autodiscover.domain.local in the zone.
  3. Once no autodiscover A records exist, right click the Zone name and select Other New Records.
  4. Select Service Location (SRV) from the list.
  5. Enter the settings as shown below:Image
  6. Hit OK to finish adding the record.

Once the SRV record is added to the internal DNS zone, Outlook and other autodiscover clients that attempt to configure themselves with a domain.local SMTP address will work properly without the Certificate errors on all versions of Exchange.

Other Nifty Stuff

There are some additional benefits to utilizing the Service Locator record for Autodiscover rather than an Autodiscover A record, even in your public domain. When you use a SRV record, you can also point public clients to communicate with mail.domain.com or outlook.domain.com, or whatever you have configured your external server name to be. This means you can get away with having a single host name on your SSL certificate, since you wouldn’t need autodiscover.domain.com to get autodiscover working. Since most Third Party CAs charge a bit more for SANs than they do for Single Name SSL certs, you can save a bit of money (for this to work, though, you may need to change your Internal and External Web Services URLs in Exchange to match the name you have configured).

Another Problem the SRV record Fixes

There are also some other issues you may run into that are easily fixed by adding a SRV record. One of the most common is the use of multiple Email Domains in a single Exchange Environment. If you have users that are not assigned a Primary or secondary SMTP address that matches the domain name listed on your SSL certificate, you’ll discover that those users and the rest of your users will not be able to share calendar data between their mailboxes. You can fix this by adding an Autodiscover SRV record to the DNS zone that manages the additional mail domains. For example, you have domain1.com and domain2.com on the same Exchange Server. user@domain1.com can’t see user@domain2.com’s calendar. The fix for this is to add the SRV record to the domain2.com DNS zone and point it to the public host name for domain1.com’s mail server. Once that’s done the services that operate the calendar sharing functions will be properly configured and both users will be able to share calendars.

Exchange 2010 Relaying – How to use it, how to turn it off

Email Relay is one of the more annoying features of email servers. However, there are times it can be pretty useful. It’s annoying because Spammers love to exploit it, and it’s useful because it can allow you to centralize a lot of email services.

What is Email Relay?

Email relay is, quite simple, a feature that allows one email server to use another email server for sending mail. In a relay setup, one SMTP server is configured to relay all the mail it’s trying to send through another email server when the sending email address is not a part of the second server’s organization. In a relay situation, Server 1 will connect to Server 2 and attempt to send an email using SMTP. However, unlike a normal SMTP session where Server 1 sets the recipient as an email address that “belongs” to Server 2, Server 1 tries to send an email to a recipient in a completely different organization. A successful relay basically means that Server 1 can use Server 2, which accepts email for company1.com, to send email to Company2.com.

How is Relay Useful?

Usually, there’s very little, if any, need to use email relay. But there may be situations where you have an application or device that has its own email server solution built in that needs to be able to send email to various recipients. Without the ability to relay, that application or device would need to have wide open access to the Internet in order to send email. This is not always an optimal solution, especially if you already have an email solution in place. It’s simply more secure to have that application or device relay mail through the central email solution.

How is Relay a Pain?

Allowing relay on an email server can cause some major problems, though. The biggest problem is with spammers. Spammers have software that will go to as many public IP addresses as possible, looking for IPs that respond on port 25. If a server responds, the software will attempt to send an email to a recipient by creating a relay session. If the relay session succeeds, that server is tagged as an “Open Relay” and the software will attempt to use that server as a source for loads and loads of Spam. This often results in massive mail queues and the server that is being used to relay mail will often be blacklisted and legitimate mail from that server ends up getting blocked by email systems that use blacklists as a form of spam filtering. In other words, having an open relay can cripple your Email infrastructure in any number of ways.

Relaying with Exchange 2010

By default, Exchange 2010 does not allow relaying. In fact, the last Email server developed by Microsoft that allowed relay by default was Exchange 2003. However, it is possible to configure Exchange 2010 to work as a relay, but you have to be careful with it because you don’t want to turn your Exchange server into an open relay for spammers to use and abuse.

Relaying in Exchange 2010 (and 2007 if you haven’t made the jump to 2010) is accomplished through the use of a simple setting that exists on the Receive Connectors. It’s called Externally Secured Authentication. Unfortunately, MS didn’t do a very good job at explaining what that setting actually does. The setting exists on the Authentication tab of the Receive Connector properties screen in the Exchange Management Console. The image below shows this setting:

The Externally Secured Authentication setting on Receive Connectors basically removes all security blocks and allows complete and total access to that receive connector for any server with an IP address that matches the address range configured in the Network tab on the connector. This means that once that little box is checked, any computer that matches the connector’s IP range can use the Exchange Hub Transport Server as a Relay.

The Externally Secured Authentication setting shuts everything off because it assumes that you are using some other security technique to allow other mail servers to communicate with the connector. So if you have a Unix mail server that has to send mail to your Exchange server, you can secure the communication between the two servers using IPSec or some other technique to prevent snooping and unauthorized access. Exchange assumes that the connection between hosts is secure and therefore allows complete access to all servers that communicate through the connector.

Shutting Down Open Relay in Exchange 2010

If you have Exchange 2010 and discover that your server is an open relay, the cause is usually due to someone having configured Externally Secured Authentication on your Default Receive Connector. The Default Receive Connector in Exchange 2010 is set up to allow communication with all IP addresses. If the Externally Secured Authentication setting is checked on the Default Receive Connector, every IP address is therefore allowed to use your Hub Transport Server to send mail to any recipient. This is not a good thing. Now, it’s possible to have that setting checked on the Default Receive Connector and still close open relay (Alan Hardisty has instructions on his blog here: http://alanhardisty.wordpress.com/2010/07/12/how-to-close-an-open-relay-in-exchange-2007-2010/ ) but in general, there is no reason to enable Externally Secured Authentication on a publicly accessible Receive Connector. So just don’t use that setting on Default Receive Connectors

Configuring a Relay Connector in Exchange 2010

So let’s take a for-instance. Let say you have an application that has to send emails to people who aren’t in your organization. You really want to use a Relay connector to do this. Here’s how you set it up in the EMC:

1. Navigate to Server Configuration -> Hub Transport in the EMC.

2. Click New Receive Connector in the Right side pane.

3. When the Wizard pops up, enter a name for the connector (Relay Connector works fine) and click Next

4. The Local Network Settings screen doesn’t have anything you have to worry about for this, so you can click next to go on or change the settings there to meet your needs.

5. The Remote Network Settings section of the wizard is where we actually secure this relay connector. The image below shows the default screen:

By default, you can see that the entire IPv4 range shows up (I have IPv6 disabled on my email server, on yours it may show the whole IPv6 range here as well). Select all entries that show here and click the red X to remove them. Click Add and enter the IP address of the server you want to allow relay to. Click Next, then New to finish the wizard and create the connector.

6. Once the wizard is done and the connector is created, you should see it in the EMC. Right click the new connector and go to the Permissions tab first. Select Anonymous and Exchange servers (You have to do this to allow Externally Secured Authentication to be a valid selection). You can also check whatever other groups you want as well.

7. Click the Authentication tab and select the Externally Secured Authentication box. Remove all other check marks and click Apply. Click apply and the connector will be set up properly to allow Relay.

8. Click the Network tab and make sure that only the servers you want to relay are listed under “Receive mail from remote servers that have these IP addresses.” If you still have the full IP range listed the server will be an Open Relay at this point.

Note that when you do this, all communications between the server that is sending mail to the Relay server will be in clear text. This means that anyone sniffing traffic between the two mail servers can read the emails with ease. This is usually not a big deal on an Internal network, but you’ll want to make sure there actually *is* an external encryption system going between the two servers to secure the transmission of data.