How Does Exchange Autodiscover Work?

Autodiscover is one of the more annoying features of Exchange since Microsoft reworked the way their Email solution worked in Exchange 2007. All versions since have implemented it and Microsoft may eventually require its use in versions following Exchange 2016. So what is Autodiscover and how does it work?

Some Background

Prior to Exchange 2007, Outlook clients had to be configured manually. In order to do that, you had to know the name of the Exchange server and use it to configure Outlook. Further, if you wanted to use some of the features introduced in Exchange 2003 SP2 and Outlook 2003 (and newer), you had to manually configure a lot of settings that didn’t really make sense. In particular, Outlook Anywhere requires configuration settings that might be a little confusing to the uninitiated. This got even more complicated in larger environments that had numerous Exchange servers but could not yet afford the expense of a load balancer.

The need to manually configure email clients resulted in a lot of administrative overhead, since Exchange admins and Help Desk staff were often required to configure Outlook for users or provide a detailed list of instructions for people to do it themselves. As most IT people are well aware, even the best set of instructions can be broken by some people, and an IT guy was almost always required to spend a lot of time configuring Outlook to talk to Exchange.

Microsoft was not deaf to the cries of the overworked IT people out there, and with Exchange 2007 and Outlook 2007 introduced Autodiscover.

Automation Salvation!

Autodiscover greatly simplifies the process of configuring Outlook to communicate with an Exchange server by automatically determining which Exchange server the user’s Mailbox is on and configuring Outlook to communicate with that server. This makes it much easier for end users to configure Outlook, since the only things they need to know are their email address, AD user name, and password.

Not Complete Salvation, Though

Unfortunately, Autodiscover didn’t completely dispense with the need to get things configured properly. It really only shifted the configuration burden from Users over to the Exchange administrator, since the Exchange environment has to be properly configured to work with Autodiscover. If things aren’t set up properly, Autodiscover will fail annoyingly.

How it Works

In order to make Autodiscover work without user interaction, Microsoft developed a method for telling Outlook where it needed to look for the configuration info it needed. They decided this was most easily accomplished with a few DNS lookups based on the one piece of information that everyone had to put in regardless of their technical know how, the email address. Since they could only rely on getting an email address from users, they knew they’d need to have a default pattern for the lookups, otherwise the client machines would need at least a little configuration before working right. Here’s the pattern they decided on:

  1. Look in Active Directory to see if there is information about Exchange
  2. Look at the root domain of the user’s email Address for configuration info
  3. Look at autodiscover.emaildomain.com for configuration info
  4. Look at the domain’s root DNS to see if any SRV records exist that point to a host that holds configuration info.

Note here that Outlook will only move from one step to the next if it doesn’t find configuration information.

For each step above, Outlook is looking for a specific file or a URL that points it to that file. The file in question is autodiscover.xml. By default, this is kept at https://<exchangeservername>/autodiscover/autodiscover.xml. Each step in the check process will try to find that file and if it’s not there, it moves on. If, by the end of step 4, Outlook finds nothing, you’ll get an error saying that an Encrypted Connection was unavailable, and you’ll probably start tearing your hair out in frustration.

What’s in the File?

Autodiscover.xml is a dynamically generated file written in XML that contains the information Outlook needs to access the mailbox that was entered in the configuration wizard. When Outlook makes a request to Exchange Autodiscover, the following things will happen:

  1. Exchange requests credentials to access the mailbox.
  2. If the credentials are valid, Exchange checks the AD attributes on the mailbox that has the requested Email address.
  3. Exchange determines which server the Mailbox is located on. This information is usually stored in the msExchangeHomeServer attribute on the associated AD account.
  4. Exchange examines its Topology data to determine the best Client Access Server (CAS) to use for access to the mailbox. The Best CAS is determined using the following checks:
    1. Determine AD Site the Mailbox’s Server is located in
    2. Determine if there is a CAS assigned to that AD site
    3. If no CAS is in the site, use Site Topology to determine next closest AD Site.
    4. Step 3 is repeated until a CAS is found.
  5. Exchange returns all necessary configuration data stored in AD for the specific server. The configuration data returned is:
    1. Autodiscover Service Universal Resource Identifier (URI) – This is assigned to each CAS server and is stored by Outlook for future use.
    2. CAS server name
    3. Exchange Web Services URL
    4. Outlook Anywhere Configuration Data, if enabled.
  6. Outlook will take the returned information and punch it into the necessary spots in the user’s profile information.

Necessary Configuration

Because all of this is done automatically, it is imperative that the Exchange server is configured to return the right information. If the information returned to Autodiscover is incorrect, either the mailbox connection will fail or you’ll get a certificate error. To get Autodiscover configured right, parts 5.1, 5.3, and 5.4 of the above process must be set. This can be done with a script, in the Exchange Management Shell, and in the Exchange Management UI (EMC for 2007 and 2010, ECP/EAP for 2013/2016).

Importance of Autodiscover

In the future, you may be forced to use Autodiscover for Outlook configuration. For now, you can absolutely avoid using it by manually configuring each outlook client. However, before doing that, consider the cost of having to touch each and every computer to properly configure Outlook. It can take 5 minutes or more to configure Outlook on one computer using the manual method, and with Exchange 2013 it can take longer as you also are required to input Outlook Anywhere configuration settings, which are more complex than just entering a server name, username, and password. If you multiply that by the number of computers you might have in your environment and add in the time it takes to actually get to the computers, boot them up, and get to the Outlook settings, the time spent configuring Outlook manually starts to add up very quickly. Imagine how much work you’d be stuck with configuring 100 systems!

In contrast, it usually only takes 10 to 20 minutes to configure Autodiscover. When Autodiscover is working properly, all you have to do is tell your users what their email address is and Outlook will do all the work for you. With a little more configuration or some GPO work, you don’t even have to tell them that!

When you start to look at the vast differences in the amount of time you have to spend configuring Outlook, whether or not to use Autodiscover stops being a question of preference and starts being an absolutely necessary part of any efficient Exchange-based IT environment. Learning to configure it properly is, therefore, one of the most important jobs of an Exchange administrator.

Advertisements

Configuring Exchange Autodiscover

As of the release of Outlook 2016, Microsoft has chosen to begin requiring the use of Autodiscover for setting up Outlook clients to communicate with the server. This means that, moving forward, Autodiscover will need to be properly configured.

This page contains some information and some links to other posts I’ve written on the subject of Autodiscover. This page is currently under construction as I write additional posts to assist in configuring and troubleshooting Autodiscover.

Initial Configuration

The initial configuration of Autodiscover requires that you have a Digital Certificate properly installed on your Exchange Server. If you use a Multi-Role configuration (No longer recommended by MS for Exchange versions after 2010), the Certificate should be installed on the CAS server.

Certificate Requirements

The certificate should have a Common Name that matches the name your users will be using to access Exchange. If you want users to use mail.domain.com to access the Exchange server, make sure that is the Common Name when creating the certificate.

The optimal configuration for Exchange also requires that you include autodiscover.domain.com as a Subject Alternate Name (SAN). You should also make sure that there is also an A or CNAME record in DNS to point users to autodiscover.domain.com. SAN certificates can cost significantly more money than a normal certificate, but there are ways to bypass the need for a SAN certificate (See the next section below for more info).

A Wildcard certificate is usable with Exchange, and can serve as a less expensive way to provide support for a large number of URLs. A Wildcard can also be used on other servers that use the same DNS domain as the Exchange server. However, wildcards are technically not as secure as a SAN cert, since they can be used with any URL in the domain. In addition, they do not support Sub-domains.

The certificate you install on Exchange should also be obtained from a reputable Third Party Certificate Authority. The following Certificate Authorities can generate Certificates that are trusted by the majority of web browsers and operating systems:

Comodo PositiveSSL
DigiCert
Entrust
Godaddy
Network Solutions

Also note, when generating your Certificate Signing Request (CSR), you should generate the CSR with a sufficient bit length. Currently, the recommended minimum for CSR generation is 2048 bits. 1024 and lower bit lengths may not be supported by Certificate Authorities.

Exchange Server Configuration

Autodiscover will determine the settings to apply to client machines by reading the Exchange Server configuration. This means the Exchange Service URLs must be properly configured. If they are not configured to use a name that exists on the Certificate in use, Outlook will generate a Certificate Error.

I will write a post on this subject in the future. For now, you can get this information easily from a Google Search.

DNS configuration

There are 2 different URLs Autodiscover will use when searching for configuration information. These URLs are based on the user’s Email Domain (The portion of the email address after the @). For bob@acbrownit.com, the Email Domain is acbrownit.com. The URLs checked automatically are:

domain.com
autodiscover.domain.com

As long as one of the above URLs exists on the Certificate and has an A record or CNAME record in DNS pointing to a CAS server, Autodiscover will work properly. The instructions for this can vary depending on the DNS provider you use.

Other Configurations

There are some situations that may cause autodiscover to fail if the above requirements are all met. The following situations require additional setup and configuration.

Domain Joined Computers

Computers that are part of the same Active Directory Domain as the Exchange server will attempt to reach the Active Directory Service Connection Point (SCP) for Autodiscover before attempting to find autodiscover at the normal URLs listed above. In this situation, you will typically need to configure the SCP to point to one of the URLs on your certificate.

Go to this post to find instructions for configuring the SCP:

Exchange Autodiscover Part 2 – The Active Directory SCP

Single Name Certificates

If you do not want to spend the additional money required to obtain a SAN or Wildcard certificate for Exchange, you can use a Service Locator (SRV) Record in DNS to define the location of autodiscover. A Service Locator Record allows you to define any URL you want for the Autodiscover service, so you can create one to bypass the need for having a SAN or Wildcard certificate.

Go to this post to find instructions for configuring a SRV record:

Internal DNS and Exchange Autodiscover

 

Office 365, ADFS, and SQL

He’s an issue I’ve just run into that there doesn’t seem to be a good answer to on the Internet. When you are building a highly available ADFS farm to enable Single Sign On for Office 365, should you use the Windows Integrated Database (WID) that comes with Windows Server or store the ADFS Configuration on a SQL server? Put simply, if you are using ADFS *only* for Office 365, there is no need to use SQL server for storing the configuration database. Here’s why.

What Does ADFS with SQL Get You?

I spent a good day or so trying to answer this question because I came in after the initial configuration of a Hybrid Exchange migration to Office 365 that was set up using SQL to store ADFS. When we went to test failover, we ran into a *mess* of problems with this configuration. So I’m going to try to explain things in a way that makes sense, rather than just telling you what Technet says in their article on the subject.

Storing your ADFS configuration in a SQL database gives you 4 things:

  1. The ability to detect and block SAML and WS-Federation Token Replay attempts.
  2. The ability to use SAML artifact resolution
  3. The ability to use SQL fail-over to increase availability of the Configuration database
  4. Scalability!!!

Using WID with ADFS prevents these features from being available to you. Here’s what each of those lines of text mean (Why no, I’m not going to just give you that without definitions like Technet does. Can you tell I’m a little grumpy about Technet’s coverage of this subject?)

SAML and WS-Federation Token Replay Attempts

This is actually a potential security vulnerability that comes about because of how Token authentication mechanisms work. In some cases, an application that accepts a federated token identity for access may allow users to “replay” a previously issued authentication Token to bypass the authentication mechanisms used to issue that Token. You can do this by ending a session with a federated application, then returning to the application with a cached version of the interaction. The easiest way to do this is to just push the Back button in a web browser. If the federated application isn’t properly designed, it will accept the cached version of the token used in the previous session and continue operating just like someone never logged out.

It should be noted that this is a serious security issue. If you have an environment where there are systems exposed to public use by multiple users (Kiosks, for example), you will want to make sure that Token Replay attempts are dropped. But here’s the thing, Token Replay is a vulnerability that affects the application you are using federated login to access. It is not a vulnerability of ADFS itself. Basically, if the applications that you are using ADFS to federate with are designed properly, there is no need to have your ADFS environment provide this kind of protection. Office 365 was designed to prevent Token Replay attempts from succeeding. So if you are only federating with Office 365, you don’t need to have this functionality in your ADFS environment. So, no need to have a SQL based Configuration Database for this reason.

SAML Artifact Resolution

I have tried for a while to find a good definition for what SAML Artifact Resolution actually is. I’m still not 100% on it, because the technical documentation for the SAML 2.0 standard is about as informative as a block of graphite. However, it seems like this is essentially a method through which both sides of a SAML token exchange can check on and authenticate one another using a single session, rather than multiple sessions. This comes in handy for situations where load balancing is in use and the application requests verification of the token provider.

The good news is that you don’t have to understand SAML Artifact Resolution. Office 365 doesn’t use it, so you don’t have to worry about having it. No SQL required for this purpose.

SQL Failover Capabilities

At first glance, having SQL server failover capabilities for your ADFS configuration database sounds like a great idea. I mean, SQL server has great failover capabilities built in. The problem is, so does ADFS. When you use WID for your ADFS Configuration database, the first server in the Farm will store a read/write enabled copy of the configuration database. Each additional ADFS server added to the farm will pull a copy of this database and store it in a read only format. So in a WID based ADFS farm, every ADFS server in the farm has a copy of the configuration database by default. ADFS Proxy servers don’t store or need access to the configuration database, so only the backend ADFS servers will benefit at all from SQL integration.

But here’s the thing, for SQL integration to work properly in a way that allows full SQL failover capabilities, you need to have a valid SQL cluster. With SQL 2008R2 and earlier, this means that you have to have at minimum 2 SQL servers installed on a version of Windows that has Cluster Services available. Cluster services requires that you have shared storage before it will create a failover cluster for SQL versions prior to 2012. If you have 2012, you can manage much simpler failover, but if you’re limited to SQL 2008R2 and earlier, you’re going to end up with a significantly higher infrastructure requirement to get SQL failover working. It you can use SQL 2012 to store the ADFS database, there is some advantage to using SQL clustering for ADFS, but the advantage doesn’t actually justify the costs of doing so.

“What about SQL Mirroring?” You ask. Well, that’s a good idea in theory. You can have your ADFS config database mirrored on two SQL servers, but you will run into some major headaches using this technique, especially if you ever need to run a failover. First off, ADFS configuration with SQL server requires that you use the SFConfig command line tool to create the config database and add servers to the farm. This is a huge pain to do, but it’s necessary. Second, if you ever have to fail over to the other SQL server for any reason, you have to reconfigure every one of the servers in the farm. SQL mirroring doesn’t utilize a clustered VIP, so the configuration with SFConfig requires you to point the server to the active SQL server. When you activate the SQL database on the other server, things may or may not work without reconfiguration. If they do work, there’s a chance that they might *stop* working at any time in the future without notice until you run SFConfig on each member of the farm and use the now active mirror’s address in the command. Then you may have to log in to each of your Proxy servers and run the Proxy Config wizard to reinitialize the proxy trust between it and the ADFS farm. That means that if you want to fail over with SQL mirroring, you have to log in to 4 servers at minimum and reconfigure each one every time the active SQL mirror changes. This is a truly unwieldy and excessive failover process. It isn’t worth the loss of expense from not using a full SQL cluster to do this for ADFS’s configuration database.

The failover abilities of SQL allow you to have constantly available access to a writable copy of the ADFS configuration database. If you use WID and the first server in the farm goes down, you cannot make changes to your ADFS configuration. No adding new servers, no modifications to the ADFS functions, etc. The good news is that the only time you need access to a writable copy of the database is when you are adding new servers to the farm or making modifications to ADFS functions. If you’re using Office 365, you almost never have to make changes to either of these once its set up and running. ADFS will continue operating as long as even one server in the farm is accessible without a writable copy of the database. So realistically, you don’t need SQL failover capabilities for ADFS’s configuration database.

Scalability!!!

Scalability is one of my favorite buzz words. Scalability means “I can add more servers to handle this workload without any problems.” And SQL allows you to do that. However, WID integrated ADFS allows you to do so as well. The limitation, though, is that you can only have up to 5 servers in a WID based ADFS farm (NOTE: Proxy servers don’t count against this maximum number). “Only five servers!?” you may ask in heated exasperation. “Yes!” I answer. “Well, I can’t stand limitations! I will use SQL!” you respond. “You will never need more than 4 ADFS servers in ever!” I retort. ADFS is one of the most light-weight roles available for Windows server. I have deployed ADFS on numerous environments with user counts well past the 10,000 mark and the ADFS servers almost never measure a blip on the performance monitors. This is because ADFS does three things, it provides a website for users to log in to, it queries Active Directory to authenticate users, and it generates a small token that contains less than 1KB of data that it then sends to a remote service. This entire process requires about 10 seconds per user per session, and that includes about 9 seconds of someone typing in their credentials. So there isn’t really even much need to have load balancing capabilities on ADFS. One server can handle monstrous loads. If you have an environment with 100,000 users accessing numerous federated applications, you might need to add 1 or 2 extra servers to your farm to handle the load, but I doubt it. You’re better off just adding RAM and processing power to your ADFS servers to improve performance than adding extra servers to the farm.

If you want multi-site high availability, you may need 5 servers, but probably not more. Let’s say you want to have site resiliency in your ADFS configuration as well as failover capability in your primary and secondary site. This configuration requires a minimum of 4 ADFS servers. Two for the primary site, two for the failover site, and 4 proxy servers that don’t count against the maximum of 5 servers. There you have it, full site-resiliency and in-site failover capability using less than 5 servers. You can even add a third site to the farm without having any issues if you want using the remaining server. But let’s be honest here, if there is an emergency that takes down two geographically dispersed datacenters at the same time, you’re going to be looking for the nearest shotgun to fight off zombies/anarchist bandits, not checking to make sure your ADFS farm is still working, so triple hot-site resiliency for ADFS is probably more than anyone needs. So you don’t need SQL for ADFS if you want scalability. It’s scalable enough without SQL.

EXPLOSIONS! I mean Conclusions

If you are planning to build an ADFS farm for Single Sign On with Office 365, you will ask the question, “Do I need SQL?” The answer to that question, in pretty much every possible case is “Not if you’re just going to use it for Office 365.” SQL integrated ADFS configurations add some features, sure, but there is almost no situation where any of those features must exist for ADFS to work with Office 365. So don’t bother even trying it out. It adds an unnecessary level of complexity, cost, and management difficulties and give no advantages whatsoever.