ADFS or Password Sync: Which one do you use?

I’ve run into a number of people who get confused about this subject when trying to determine how to get their On-Prem accounts and Office 365 synced and working properly. Most often, people are making a comment somewhere that says, “Just use Password sync, it’s just as good and doesn’t require a server,” or something similar. While I wish this were true, it most absolutely is not. While both options fulfill a similar requirement (“I want my AD usernames and Passwords to work with Office 365”), they both do so in a completely different manner that can have a major impact on security, workflow, and administration of services.

Single Sign-On vs Same Sign-On

To see the difference here, you have to understand the terminology involved. The primary goal for synchronizing user accounts between Office 365 and Active Directory is to give users the ability to use the same username and password to use O365 that they use when logging in to their computer. There are two terms used to describe this relationship. Single Sign-On refers to technology that allows users to access numerous applications while only logging in once. You’ve probably used Facebook or Google’s version of this to access applications, games, or other software. Same Sign-On, however, allows a user to access multiple applications with the same username and password. If you have two bank accounts and use the same username and password to access them, you’re using a simplified version of Same Sign-on. Most Same Sign-on solutions in IT involve an application that reads username and password data used by one system and copies it to another system.

The biggest difference between the two technologies is that Single Sign-On allows you to authenticate one time and access all the applications that are tied to that sign-on system. Same Sign-On requires you to log in to all applications regardless of which or how many applications you’ve already logged into using that username and password.

Single Sign-on and Same Sign-on have a lot of similarities as well. They both allow you to use the same username and password and both simplify account management (theoretically). Most importantly, for Office 365 at least, they allow you to manage usernames and passwords in a single environment, rather than having to change passwords in multiple locations every time something needs to change. The way changes are accomplished is where the decision to use ADFS or Password Sync faces its biggest test.

ADFS is Single Sign-On, Password Sync is Same Sign-On

For the purposes of Office 365, which is what this article focuses on, ADFS is considered a Single Sign-On solution, while Password Sync is Same Sign-On. What does this mean for you, the IT administrator, when you are deciding how to set up your environment? It means you need to consider the following realities of each solution:

ADFS Issues

  1. ADFS requires more administrative overhead to function:
    1. ADFS is not a perfect solution and it does fail sometimes.
    2. Troubleshooting ADFS can be a daunting task. The error messages provided by ADFS are really poorly worded and generic, so a lot of digging in logs is required to really figure out where a problem is coming from.
    3. ADFS requires a trust between your environment and Office 365. Maintaining the trust takes some effort. ADFS relies on Digital Certificates that have expiration dates, so you have to make sure the certificates are updated before they expire or ADFS won’t work.
  2. ADFS is tricky to configure sometimes. The Office 365 setup for it has been streamlined, but there are occasional setup issues that can be difficult to resolve or confusing.
  3. If your ADFS server goes down for any reason, Office 365 can’t be accessed. This means that a High Availability ADFS cluster is very beneficial. It’s also expensive.
  4. In short, ADFS has a significantly higher cost to use than password sync, but it is also more secure.

Password Sync

  1. Password sync copies the “hash” for the AD password to Office 365. This means that if Office 365 gets taken over by hackers (very very unlikely, but still a potential concern), they also get to take over your network because they have all your password hashes. This doesn’t happen with ADFS.
  2. The Synchronization between Office 365 and AD occurs on a scheduled basis. This occurs every 30 minutes at a minimum, so if you change someone’s password in AD, you have to wait up to 30 minutes for the password to change in Office 365. This can be very confusing for users and result in a lot of time consuming support calls, particularly if you enable account lockout in Office 365. You can force syncs to occur, but this does add a good bit of administrative time to the password change process.

Issue Mitigation

There are some ways to get around the issues involved with each solution. For instance, Microsoft is currently working on a cloud-based version of ADFS that will allow you to have ADFS level security without the added infrastructure and administrative costs of an ADFS server/cluster. They also provide an “upgraded” version of Azure AD (which is the back-end system for account management in Office 365) called Azure AD Premium. AAD Premium costs about 4 dollars a month, but allows you to provide your users with self-service password reset features. It also adds attribute “write-back” capabilities that allow you to manage users in the cloud when using ADConnect, which isn’t possible otherwise. That means you can change distribution group membership, user passwords, and other attributes in Office 365 and those changes will be written to your AD environment.

Decisions

In the end, the decision between ADFS and Password Sync is entirely up to you. If you have major regulatory governance requirements or are very concerned about security, ADFS is a very capable system that will greatly improve system security for Office 365. However, if you work for a small organization with little to no major security concerns, Password sync will provide you with a lot of benefit.

Update – 10/30/2017

It’s been a while since I wrote this post, but a number of changes to ADFS and the addition of Passthrough authentication using AD Connect mean that I need to update some of the conclusions here. These changes may well affect which solution you choose.

  1. Password Sync has a specific limitation for environments that restrict logon hours in Active Directory. Because the attributes for logon hours are not properly synced through Azure AD Connect, logon hour restrictions will not function in Office 365 when using Password Sync. ADFS authenticates against AD directly, so it will not allow users to log in if AD says they are outside of their logon hours window(s).
  2. Passthrough Authentication in Azure AD Connect *greatly* improves authentication in Office 365 by creating an authentication that passes credentials to AD through Azure ADConnect, rather than storing password hashes in the cloud. This significantly reduces the security risks associated with using password sync.
  3. ADFS in Server 2012 R2 and later allows a pretty awesome feature that I wasn’t aware of til just now, a self-service password reset portal tied to the ADFS portal. https://blogs.msdn.microsoft.com/samueld/2015/05/13/adfs-2012-r2-now-supports-password-change-not-reset-across-all-devices/ covers this in greater detail.

 


Do I need Anonymous Relay?

Problems

If you have managed an Exchange server in the past, you’ve probably been required to set things up to allow printers, applications, and other devices the ability to send email through the Exchange server. Most often, the solution to this request is to configure an Anonymous Open Relay connector. The first article I ever wrote on this blog was on that very subject: http://wp.me/pUCB5-b .  If you need to know what a Relay is, go read that blog.

What people don’t always do, though, is consider the question of whether or not they need an anonymous relay in Exchange. I didn’t really cover that subject in my first article, so I’ll cover it here.

When you Need an Open Relay

There are three factors that determine whether an organization needs an Open Relay. Anonymous relay is only required if you meet all three of the factors. Any other combination can be worked around without using anonymous relaying. I’ll explain how later, but for now, here are the three factors you need to meet:

  1. Printers, Scanners, and Applications don’t support changes to the SMTP port used.
  2. Printers, Scanners, and Applications don’t support SMTP Authentication.
  3. Your system needs to send mail to email addresses that don’t exist in your mail environment (That is to say, your system sends mail to email addresses that you don’t manage with your own mail server).

At this point, I feel it important to point out that Anonymous relays are inherently insecure. You can make them more secure by limiting access, but using an anonymous relay will always place a technical solution in the environment that is designed specifically to circumvent normal security measures. In other words, do so at your own informed risk, and only when it’s absolutely required.

The First Factor

If the system you want to send SMTP messages doesn’t allow you to send email over a port other than 25, you will need to have an open relay if the messages the system sends are addressed to email addresses outside your environment. That last condition is an important distinction. The SMTP protocol defines port 25 as the “default” port for mail exchange, and that’s the port that every email server uses to receive email from all other systems. Based on modern security concerns, this means a server will only accept mail on port 25 if the recipient of the email exists on that mail server. So if you are using the abc.com mail server to send messages to bob@xyz.com, you will need to use a relay server to do it, or the mail will be rejected because relay is (hopefully) not allowed.

The Second Factor

If your system doesn’t allow you to specify a username and password in the SMTP configuration it has, then you will have to send messages Anonymously. For our purposes, an “anonymous” user is a user that hasn’t logged in with a username and password. SMTP servers usually talk to one another Anonymously, so anonymous SMTP access is common and is actually necessary for mail exchange to function, but SMTP servers will, by default, only accept messages that are destined for email addresses that they manage. So if abc.com receives a message destined for bob@abc.com, it will accept it. However, abc.com will reject messages to jim@xyz.com, *unless* the SMTP session is Authenticated. In other words, if bob@abc.com wants to send jim@xyz.com a message, he can open an SMTP session with the abc.com mail server, enter his username and password, and send the message. If he does that, the SMTP server will accept the message, then contact the xyz.com mail server and deliver it. The abc.com mail server doesn’t need to have a username and password to do this, because the xyz.com mail server knows who jim@xyz.com is, so it just accepts the message and delivers it to the correct mailbox. So if you are able to set a username and password with the system you need to send mail with, you don’t need anonymous relay.

The Third Factor

Most of the time, applications and devices will only need to send messages to people who have mailboxes in your environment, but there are plenty of occasions where applications or devices that send email out need to be able to send mail to people *outside* the environment. If you don’t need to send to “external recipients” as these users are called, you can use the Direct Send method outlined in the solutions below.

Solutions

As promised, here are the solutions you can use *other* than anonymous relay to meet the needs of your application if it doesn’t meet *all three* of the deciding factors.

Authenticated Relay (Factor #3 applies)

In Exchange server, there is a default “Receive Connector” that accepts all messages sent by Authenticated users on port 587, so if your system allows you to set a username and password and change the port, you don’t need anonymous relaying. Just configure the system to use your Exchange Hub Transport server (or CAS in 2013) on port 587, and it should work fine, even if your requirements meet the last deciding factor of sending mail to external recipients.

Direct Send (Factor #2 applies and/or #3 doesn’t apply)

If your system needs to send messages to abc.com users using the abc.com mail server, you don’t need to relay or authenticate. Just configure your system to send mail directly to the mail server. The “direct send” method uses SMTP as if it were a mail server talking to another mail server, so it works without additional work. Just note that if you have a spam filter that enforces SPF or blocks messages from addresses in your environment to addresses in your environment, it’s likely these messages will get blocked, so make allowances as needed.

Authenticated Mail on Port 25 (Only factor #1 applies)

If the system doesn’t allow you to change the port number your system uses, but does allow you to authenticate, you can make a small change to Exchange to allow the system to work. This is done by opening the Default Receive connector (AKA – the Default Front End receive connector on Exchange 2013 and later) and adding Exchange Users to the Permission settings on the Security tab as shown with the red X below:

[Image: default-front-end-enabled]

Once this setting is changed, restart the Transport service on the server and you can then perform authenticated relaying on port 25.
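
The three factors and the solutions above can be condensed into a small decision helper. This is only a sketch of the reasoning in this article, not an Exchange tool: the function name, the enum, and its labels are all made up for illustration. The one combination the solutions above don’t explicitly address (no authentication, external recipients, but a changeable port) is flagged rather than guessed at.

```python
from enum import Enum


class MailPath(Enum):
    # Labels are illustrative names for the options discussed in this article.
    AUTHENTICATED_587 = "authenticated relay on port 587"
    AUTHENTICATED_25 = "authenticated mail on port 25"
    DIRECT_SEND = "direct send"
    ANONYMOUS_RELAY = "anonymous relay"
    NOT_COVERED = "not covered by the solutions above"


def choose_mail_path(can_change_port: bool,
                     can_authenticate: bool,
                     needs_external_recipients: bool) -> MailPath:
    """Map the three deciding factors to one of the options described above."""
    if can_authenticate:
        # Authentication removes the need for anonymous relay entirely.
        if can_change_port:
            return MailPath.AUTHENTICATED_587
        return MailPath.AUTHENTICATED_25
    if not needs_external_recipients:
        # Internal-only mail can be handed straight to the mail server.
        return MailPath.DIRECT_SEND
    if not can_change_port:
        # All three factors apply: no port change, no auth, external mail.
        return MailPath.ANONYMOUS_RELAY
    return MailPath.NOT_COVERED


# A scanner that is stuck on port 25, cannot log in, and emails outside parties.
print(choose_mail_path(False, False, True).value)
```

Running it for a device that meets all three factors prints “anonymous relay”; any device that can authenticate never lands there.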

Conclusion

If you do find you need to use an anonymous relay, by all means, do so with careful consideration, but always be conscious of the fact that it isn’t always necessary. As always, comments and questions on this article and others are welcome and I’ll do my best to answer as soon as possible.

What is a DNS SRV record?

If you’ve had to work with Active Directory or Exchange, there’s a good chance you’ve come across a feature of DNS called a SRV record. SRV records are an extremely important part of Active Directory (They are, in fact, the foundation of AD) and an optional part of Exchange Autodiscover. There are a lot of other applications that use SRV records to some degree or another (Lync/Skype for Business relies heavily on them, for instance). The question, though, is why SRV records are so important and what exactly they do.

What does a SRV record do?

The purpose of a SRV record is found in its longer, more jargon filled name: Service Locator Record. It’s basically a DNS record that is meant to allow applications to find a Server that is providing a Service the application needs to function. SRV records provide a centralized method of configuring and controlling applications, which results in less work configuring the client side of a client/server based application.

For example, let’s say you’re an application designer and you are creating an application that needs to talk to a server for some reason. Prior to the existence of SRV records in DNS, you had two choices:

  1. Program the application so it only ever talked to a server if it had a specific name or IP address
  2. Include some configuration settings in the application that would let end users put in the DNS name of the server.

Neither of these options is great for usability. Hard-coding IP addresses or host names for the server makes setup difficult and very strict in its requirements. Making end users enter the server information usually causes a lot more work for IT staff, as they would usually be required to do this for all the users.

SRV records were first added to the DNS protocol’s specifications around the year 2000 to give programmers another option for designing Client/Server based software. With SRV records, the application can be designed to look for a SRV record and get server information without having to be directly configured by end users or IT staff. This is similar to the first option above, but allows greater flexibility because the server can have any name or IP address you want and the application can still find it. Some of the advanced features of SRV records also allow failover capabilities and a lot of other cool stuff.

How do SRV Records Work?

Since Active Directory relies so heavily on SRV records, let’s use it as an example to explain how they work. First, let’s take a look at a typical AD DNS zone. Below, you can see a picture that shows the fully expanded _MSDCS zone for my test lab:

[Image: srv-records-for-sysinteg]

This shows the _Kerberos and _ldap SRV records created by a Domain Controller (Megaserver). Here’s basically what those records are for:

  1. Windows Login requires a Domain-Joined client to connect to a Domain Controller
  2. The login system is programmed to find a Domain Controller by looking for a SRV record at _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.sysinteg.ad
  3. The SRV record listed above has a value that returns megaserver.sysinteg.ad as the location of the server providing the _ldap service.
  4. The computer’s programming fills in a blank left for whatever value the _ldap service returns with the value that is returned (megaserver.sysinteg.ad).
  5. The computer then talks to megaserver.sysinteg.ad exclusively for all functions that require it to use LDAP (Which is the underlying Protocol used by AD for what it does).

If SRV records didn’t exist, we would be required to manually configure every computer on the domain to use megaserver.sysinteg.ad for anything related to AD. Now, that’s certainly not an unfeasible solution, but it does give us a lot more work to do.
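
To make the lookup flow above a little more concrete, here is a minimal sketch in Python. It does not talk to a real DNS server: FAKE_ZONE is a hypothetical dictionary standing in for the zone, the function names are invented for this example, and it uses the simplified domain-wide name _ldap._tcp.sysinteg.ad rather than the site-specific record. Port 389 is the standard LDAP port.

```python
# Hypothetical stand-in for a DNS zone: SRV name -> (target host, port).
FAKE_ZONE = {
    "_ldap._tcp.sysinteg.ad": ("megaserver.sysinteg.ad", 389),
}


def srv_query_name(service, protocol, domain):
    """Build the name a client asks DNS for, in RFC 2782 form."""
    # Names take the form _service._proto.domain; accept input with or
    # without the leading underscore and normalize it.
    def label(value):
        return "_" + value.lstrip("_").lower()
    return f"{label(service)}.{label(protocol)}.{domain.strip('.')}"


def locate(service, protocol, domain, zone=FAKE_ZONE):
    """Return (host, port) for a service, or None if no SRV record exists."""
    return zone.get(srv_query_name(service, protocol, domain))


# The client only knows the service and its own domain, never the server name.
print(locate("ldap", "tcp", "sysinteg.ad"))
```

The point of the sketch is that the server name appears only in the zone data, so renaming or replacing the server means changing one record instead of touching every client.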

What Makes up a SRV record?

A SRV record has a number of settings that are required for it to function. To see all the settings, look at the image below:

[Image: autodiscover]

That shows an Exchange Autodiscover SRV record. I’ll explain what each setting here does:

Domain: This is an un-changeable value. It shows the DNS Domain the SRV record belongs to.
Service: This is the “service” the SRV record will be used to define. In the image, that service is Autodiscover. Note that all SRV records should have an Underscore at the start, so the service value is _autodiscover. The underscore prevents issues where there might be a regular A record with the same name as a SRV record.
Protocol: This is the Protocol used by the service. This can functionally be anything, since the protocol in a SRV record is usually only meant to organize SRV records, but it’s best to use the protocols allowed by RFC 2782 to ensure compatibility (_tcp and _udp are universally accepted). Unless you are designing software that uses SRV records, you’ll never be in a situation where you’ll have to make a decision about what to put as the Protocol. If you’re trying to configure a SRV record for some application that you are setting up, just follow the instructions when creating a SRV record.
Priority: In a situation where multiple servers are providing the same service, the Priority value determines which server should be contacted first. The server chosen will always be the one with the lowest number value here.
Weight: In a situation where you have multiple SRV records with the same Service value and Priority value, the Weight is used to determine which server should be used. When the application is designed according to RFC 2782, the Weight value of all SRV records is added together to determine the full Weight. Whatever portion of that weight a single SRV record is assigned determines how often a server will be used by the application. For instance, if you have 2 SRV records with the same Service and Priority where Server 1 has a weight of 50 and Server 2 has a weight of 25, Server 1 will be chosen by the application as its service provider 2/3 of the time because its weight of 50 is 2/3 of the total weight assigned, or 75. Server 2 will be chosen the remaining 1/3 of the time. If there’s only one server to host the service, set this value to 0 to avoid confusion.
Port Number: This setting provides Port data for the application to use when contacting the server. If, for instance, your server is providing this service on port 5000, you would put 5000 in as the Port number. The setting here is defined by how the server is configured. For Autodiscover, as shown above, the value is 443, which is the default port designated by the HTTPS protocol. The Autodiscover Website in my environment is being hosted on the default HTTPS port, so I put in port 443. If I wanted to change my server to use port 5000, I could do so, but I would need to update my SRV record to match (As an aside, if I wanted to change the port Autodiscover was published on, I would be required to use a SRV record for Autodiscover to work, as opposed to any other method).
Host Offering this Service: This is, put simply, the host name of the server we want our clients to communicate with. You can use an IP address or a Host name here, but it’s generally best to use the Host name, since IPs can and do change over time.
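
The settings above map neatly onto a small data structure, and the Weight arithmetic is easy to check in code. The sketch below is an illustration only: the SrvRecord class, the weight_shares function, and the example.com host names are all invented for this example, and the equal split for all-zero weights is an assumption of the sketch.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class SrvRecord:
    service: str    # e.g. "_autodiscover"
    protocol: str   # e.g. "_tcp"
    priority: int   # lower values are contacted first
    weight: int     # relative share among records with the same priority
    port: int       # port the service listens on
    target: str     # host offering the service


def weight_shares(records):
    """Return each target's expected share among same-priority records."""
    total = sum(r.weight for r in records)
    if total == 0:
        # No weighting configured: this sketch treats every record equally.
        return {r.target: Fraction(1, len(records)) for r in records}
    return {r.target: Fraction(r.weight, total) for r in records}


# The 50/25 example from the Weight description, with hypothetical hosts.
records = [
    SrvRecord("_autodiscover", "_tcp", 0, 50, 443, "server1.example.com"),
    SrvRecord("_autodiscover", "_tcp", 0, 25, 443, "server2.example.com"),
]
shares = weight_shares(records)
print(shares["server1.example.com"], shares["server2.example.com"])
```

With weights of 50 and 25, the shares come out to 2/3 and 1/3, matching the description of the Weight setting.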

Using SRV Records to Enable High Availability

If you managed to read through all the descriptions of those settings up there, you may have noticed my explanation of the Priority and Weight settings. Well, those two settings allow for one of the best features of SRV records: High Availability.

Prior to the existence of SRV records, the only way you could use DNS to enable high availability was to use a feature called Round Robin. Round Robin DNS is where you have multiple IP addresses assigned to one host name (or A record). When this is set up, the DNS server will alternate between all the IPs assigned to that A record, giving the first IP out to the first client, the second IP to the second client, the third IP to the third client, and the first IP again to the fourth client (assuming 3 IPs for one A record).
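
As a quick illustration of that rotation, here is a tiny Python sketch. It models only the hand-out order described above, not a real DNS server, and the 192.0.2.x addresses are hypothetical documentation addresses.

```python
from itertools import cycle


def round_robin(addresses):
    """Yield addresses in rotation, the way a round-robin A record is handed out."""
    return cycle(addresses)


ips = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]  # hypothetical addresses
answers = round_robin(ips)

# Four clients ask in turn; the fourth wraps around to the first address.
handed_out = [next(answers) for _ in range(4)]
print(handed_out)
```

Notice that nothing here knows whether a server is healthy or how much load it can take, which is the gap Priority and Weight fill.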

With a SRV record, though, we can configure much more advanced and capable High Availability features by having multiple SRV records that have the same Service Name, but different combinations of Priority and Weight.

When we use SRV records, we have two options for high availability: Failover and Load Balancing. We can also combine the two if we wish. To do this, we manipulate the values of Priority and Weight.

If we want failover capabilities for our application, we would have two servers hosting the service and configure one server with a lower Priority value than the second. When the application performs a SRV record lookup, it will retrieve all the SRV records and attempt to contact all servers until it gets a response, using the Priority value to determine the order. A lower Priority value will be contacted first.

If we want to have load balancing for the application (all servers can be used at any time), we have multiple SRV records with the same service name, like with the Failover solution, and the same Priority value. We then determine how much of the load we want each server to take. If we have two servers providing the same service and want them to share the load equally, we pick any even number between 2 and 65534 (65535 is the highest possible Weight value) then divide that number by 2. The resulting value is entered for the Weight on both servers. When a client queries the SRV record, it will receive all values that match the SRV record, calculate the total weight, and then pick a random number between 1 and whatever the total weight value of all SRV records is to determine which server to talk to.

For instance, if you had Server 1 and Server 2 both with a Weight of 50 in their SRV record, the client would assign half of the total weight value, 100, to Server 1 and half to Server 2. Let’s say it assigns 1-50 to Server 1 and 51-100 to Server 2. The client would then pick a number between 1 and 100. If it picked a number between 1 and 50, the client would communicate with Server 1. Otherwise, it would talk to Server 2. Note: Because this functions using a random number, you will not always end up with results that match the calculated expectations. Also note: The system used to determine which server is used, based on the Weight value, is determined by the application’s developer. This is just a simple example of how it can work. Some developers may choose a scheme that always results in an exact load distribution.
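
Here is one way the number-picking scheme in this example could look in Python. Treat it as a sketch of the simple approach described here, not as the selection algorithm any particular application uses. The function name and the rng parameter (which lets you swap in a predictable number source for testing) are inventions of this example.

```python
import random


def pick_by_weight(weighted_targets, rng=random):
    """Pick one target using the running-total scheme from the example above.

    weighted_targets is a list of (target, weight) pairs sharing one Priority.
    """
    total = sum(weight for _, weight in weighted_targets)
    if total <= 0:
        raise ValueError("at least one record needs a weight above 0")
    roll = rng.randint(1, total)   # e.g. a number between 1 and 100
    upper = 0
    for target, weight in weighted_targets:
        upper += weight            # Server 1 owns 1-50, Server 2 owns 51-100
        if roll <= upper:
            return target


servers = [("Server 1", 50), ("Server 2", 50)]
counts = {"Server 1": 0, "Server 2": 0}
for _ in range(1000):
    counts[pick_by_weight(servers)] += 1
print(counts)  # roughly even, but rarely an exact 500/500 split
```

Because each pick is random, the totals after many runs only approximate the configured weights, as noted above.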

The Weight value can be used with as many servers as you want (up to 65534 servers), and with any proportions you want to define your load balancing scheme. For example, you can have 4 servers where three each provide service roughly a third of the time and the fourth acts as a last resort, by setting the weight for three SRV records to 33 and the fourth to 0. Note that RFC 2782 says a record with a weight of 0 should have only a very small chance of being selected while records with higher weights are present, so in practice that server is rarely chosen unless the others are unavailable. You should not set multiple copies of the same SRV record with weights of 0.

Lastly, you can combine Priority and Weight to have multiple load balanced groups of servers. This isn’t a very common solution, but it is possible to have Server 1 and 2 using priority 1 and weight 50, with Server 3 and 4 using priority 2 with weight 50. In this situation, Servers 1 and 2 would each handle about 50 percent of the system load, but if both Server 1 and 2 stopped working, Servers 3 and 4 would then be used, distributing the load between themselves.
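
The combined setup can be sketched as tiers of servers grouped by Priority. The Python below is a simplified illustration under a few assumptions: records are plain (target, priority, weight) tuples, the function names are made up, and is_up is a hypothetical health check supplied by the caller. Weighted selection within a tier is left out to keep the failover logic visible.

```python
from itertools import groupby


def failover_tiers(records):
    """Group (target, priority, weight) tuples into tiers, lowest Priority first."""
    ordered = sorted(records, key=lambda r: r[1])
    return [[r[0] for r in tier]
            for _, tier in groupby(ordered, key=lambda r: r[1])]


def first_available_tier(records, is_up):
    """Return the responding servers from the first tier that has any."""
    for tier in failover_tiers(records):
        alive = [target for target in tier if is_up(target)]
        if alive:
            return alive
    return []


records = [
    ("Server 1", 1, 50), ("Server 2", 1, 50),   # primary pair
    ("Server 3", 2, 50), ("Server 4", 2, 50),   # standby pair
]
down = {"Server 1", "Server 2"}                 # pretend the primary pair failed
print(first_available_tier(records, lambda target: target not in down))
```

With both primary servers marked as down, the sketch falls through to Servers 3 and 4, which would then share the load between themselves.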

Tinkering with AD

If you want to see how SRV records can be used to handle high availability and get a good example of a system that uses SRV records to their fullest capabilities, try tinkering with some of your AD SRV records. By manipulating Priority and Weight, you can force clients to always use a specific DC, or configure them to use one DC more often than others.

Try modifying the Weight and Priority of the various SRV records to see what happens. For instance, if you want one specific DC in your environment to handle Kerberos authentication and another one to handle LDAP lookups, change the priorities of those records so one server has a 0 in Kerberos and 100 in LDAP, while the other has 100 in Kerberos and 0 in LDAP. You can also tinker with the Weight to give a DC with more resources priority over smaller, backup DCs. Give your monster DC a weight of 90 and a tiny, possibly older DC a weight of 10. By default, Clients in AD will pick a DC at random.

The easiest way to see this in action is to set one DC with a Priority of 10 and another with a priority of 20 on all SRV records in the _msdcs zone. Then make sure the DNS data is replicated between the DCs (either wait or do a manual replication). Run ipconfig /flushdns on a client machine and log out, then back in. Run SET LOGONSERVER in CMD to see which DC the computer is using. Now, switch the priorities of the SRV records in DNS, wait for replication, run ipconfig /flushdns, then log out and back in again. Run SET LOGONSERVER again and you should see that the second DC is now chosen.

Final Thoughts

As I mentioned, much of a SRV record’s configuration is determined by Software Developers, since they define how their application functions. To be specific, as an IT administrator or engineer, you’ll never be able to decide what the Service Name and Protocol will be. Those are always determined by software developers. You’ll also never be in control of whether or not an application will use SRV records. Software Developers have to design their applications to make use of SRV records. But if you take some time to understand how a SRV record works, you can greatly improve functionality and security for any and all applications that support configuration using SRV records.

If you’re a Software Developer, I have to point out the incredible usefulness of SRV records and the power they give to you. Instead of having to hard-code server configurations or develop UIs that allow your end users to put in server information, you can utilize SRV records to partially automate your applications and make life easier for the IT people who make your software work. SRV records have been available for almost 2 decades now. It’s about time we started using them more and cut down the workload of the world’s IT guys.

 

 

A Treatise on Information Security

One famous misquote of American Founding Father Ben Franklin goes like this, “Anyone who would sacrifice freedom for security deserves neither.” At first glance, this statement speaks to the heart of people who have spent hours waiting in line at the airport, waiting for a TSA agent to finish groping a 90-year-old lady in a wheelchair so they can take off their shoes and be guided into a glass tube to be bombarded with the emissions of a full body scanner. But the reality of any kind of security, and Information Security in particular, is that any increase of security requires sacrificing freedom. The question we all have to ask, as IT professionals tasked with improving or developing proper security controls and practices, is whether or not the cost of lost freedom is worth the amount of increased security.

The Balancing Act

If you were to dig a little, like I have, you would find that Mr. Franklin actually said, “Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety.” This version of the quote demonstrates very eloquently one of the principal struggles of developing security policies in IT. After all, there is a famous axiom in the Industry (it’s quote day here at ACBrown’s IT World), “The most secure computer is unplugged.” Or something like that. I’m probably misquoting.

In a humorous demonstration of that axiom, I present a short story. When I was a contractor performing DIACAP (Go look it up) audits on US military bases, we were instructed to use a tool called the “Gold Disc.” The Gold Disc was developed by personnel in the military to scan through a workstation or server and check for configuration settings that violated the DISA (That’s the Defense Information Systems Agency) STIG (That’s Security Technical Implementation Guide. Not the guy that drives cars for that one TV show). The Gold Disc was a handy tool, but the final screen that gave you the results of the scan had a little button on it that we were expressly forbidden from ever pushing. That button said, simply, “Remediate All.” Anyone who pushed that button would find that they were instantly locked out of the network, unable to communicate with anything. Pushing the button on an important server would result in mass hysteria, panic, and sudden loss of employment for the person who pushed the button. You see, the Remediate All button caused the tool to change every configuration setting to comply exactly with the DISA STIG recommendations. If you’re not laughing yet, here’s the punchline… Perfectly implementing the DISA STIG puts computers in a state that makes it impossible for them to communicate with one another properly. <Insert follow up joke regarding Government and the problems it causes here>.

On the other hand, computers that blatantly failed to comply with the DISA STIG recommendations would (theoretically) be removed from the network (after 6 or 7 months of bureaucratic nonsense). In the end, there was a point in the middle where we wanted the systems to be. That balancing point was the point where computers were secure enough to prevent the majority of attacks from succeeding, but not so secure that they significantly inhibited the ability of people to do their jobs effectively and in a timely manner. As IT Security professionals, we have a duty to find the right balance of security and freedom for the environments we are responsible for.

The Costs of Security

Everything in IT has a cost. The cost can’t always be easily quantified, but there is always a cost associated. For instance, something as simple as password expiration in Active Directory has a very noticeable cost. How much time do system administrators spend unlocking accounts for people who forgot their password after it just reset? Multiply the number of hours spent unlocking accounts and helping people reset their passwords by the amount of money the average system administrator makes and you get the cost of that level of security in dollars. But that is only the direct cost.

Implementing password expiration and account lockout policies also reduces the level of freedom your employees have in controlling their user accounts. That lost freedom also translates into lost revenue as employees are forced to spend their time calling tech support to get their password reset. Then you also have to consider lost productivity due to people wasting time trying to remember the password they set earlier that morning.

With some estimates showing that nearly 30 percent of all help-desk work hours are devoted to password resets, the cost of enabling password expiration climbs pretty high.
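To put the direct cost into numbers, here is a minimal Python sketch of the multiplication described above. The function name and the sample figures (200 resets a month, 15 minutes each, $30 an hour) are made up for illustration, so plug in your own.

```python
def annual_reset_cost(resets_per_month: int, minutes_per_reset: float, hourly_wage: float) -> float:
    """Direct yearly cost of staff time spent on password resets and unlocks."""
    hours_per_year = resets_per_month * 12 * minutes_per_reset / 60
    return hours_per_year * hourly_wage


# Hypothetical numbers: 200 resets a month, 15 minutes each, $30/hour.
cost = annual_reset_cost(200, 15, 30.0)
print(f"${cost:,.2f} per year")  # $18,000.00 per year
```

Keep in mind this only covers the direct cost. The lost productivity on the employee side is not included.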

The Cost of Freedom

On the other hand, every day an individual goes without resetting their password increases the likelihood of that password being discovered. Furthermore, every day a discovered password is left unchanged increases the likelihood of that password being used by an unauthorized individual. If the individual who lost the password is highly privileged (a CEO for example), the cost to the business that employs that individual can be astronomical. There are numerous cases of companies going bankrupt after major intrusions linked to exposed passwords.

So while it may cost a lot to implement a password expiration policy, it can cost infinitely more not to. In comparison, the cost of implementing a password expiration policy is almost always justified. This is particularly true when working for organizations that fall under the purview of Regulatory Compliance laws (Cue the dramatic music).

Regulatory Compliance

One of the unfortunate realities of the IT world is that some organizations have outright failed to consider the costs of *not* having a good security policy and just plain failed to have good security. Those organizations got hit hard and either lost data that cost the business huge amounts of money, or worse, data that put their customers at risk of identity theft. So, because the kids couldn’t play safe without supervision, most Governments around the world have developed laws that tell businesses in key industries things that they must do when developing their IT infrastructure.

For instance, the Healthcare industry in the US must follow the HITECH addition to HIPAA (so many acronyms) which mandates the need for utilizing IT infrastructure that prevents the unauthorized disclosure of certain types of patient information. Publicly owned corporations in the US are required to follow the rules outlined in the Sarbanes-Oxley Act, which requires companies to maintain adequate records of business dealings for a significant period of time. The aforementioned DIACAP audits are performed to verify whether military installations are complying with the long list of instructions and requirements developed by the DoD (if you ever have trouble sleeping…).

Organizations that fall under the umbrella of one or more Regulatory Compliance laws are compelled to ensure their IT infrastructure meets the defined requirements. Failing to do so is often punishable with significant fines. Failing to do so and getting attacked in a way that makes use of security holes meant to be plugged by regulations is a huge problem (not just for the organization itself). For organizations subject to regulatory compliance, the costs associated with violating regulations must always be considered when developing a security policy. This is mostly a good thing, since the costs of actually meeting the regulations are occasionally extremely high.

Mitigating Costs – Not Always Worth It

There are actually a lot of technical solutions in the IT industry that exist entirely to reduce the costs associated with implementing security technologies. For instance, utilizing a Self-Service Password Reset (SSPR, ’cause that’s a lot of typing) solution can significantly reduce the number of man-hours required by help-desk staff to reset passwords and unlock accounts. But such solutions also have costs associated with them. Aside from the purchase cost, many of these solutions significantly reduce security in an organization. SSPRs, again, increase user freedom and control of their user account, which makes things less secure again. How much security is reduced, however, depends on the SSPR in use and how users interact with the software. An SSPR that only requires someone to enter their username and current password is likely to reduce security significantly more than an SSPR that requires users to answer 3 “security questions,” which will, in turn, reduce security much more than an SSPR that requires people to provide their Social Security Number, submit a urine sample, and authenticate with a retina scan while sacrificing a chicken from Uruguay with a special ceremonial dagger. But, again, the time spent by employees resetting their own password (not to mention the cost of importing chickens from Uruguay) increases the cost of such solutions. The key to determining which solutions and technologies to use is a matter of finding the right balance of freedom and security in the environment.

When Security Costs Too Much Freedom

There are times when the financial costs and the cost of freedom associated with a security measure are obviously too high (I’m looking at you, TSA). Implementing longer passwords may have many technical security advantages, but doing so includes a risk that the loss of freedom is too great for people to handle. For instance, implementing a 20 character minimum password policy that includes password complexity requirements might cause some employees with bad memories to write their password down and put it in a place that is easy for them to remember. Like on a post-it note stuck to their monitor. Suddenly, that very secure password policy is defeated by a low-tech solution. Now you have a password accessible to anyone walking around in the office (like Janitor Bob) that can be used to access critical information and sell it to the highest bidder (AKA, your competitor). This is a prime example of the unconsidered costs of security being too high. Specifically, the security requirement costs so much freedom and negatively impacts employees so much that they end up bypassing security entirely.

Balancing Act

In the end, IT security is a massive balancing act. To properly balance security and freedom in IT, it is necessary to ask questions and obtain as much knowledge about the environment as possible. The investigative part is among the most important phases in any security policy. Organizations looking to increase security need to have balance in their security implementations. Decisions on IT security must always be thoughtful ones.

Disabling Direct Access Forced Tunneling

So you’re trying to get Direct Access (DA) running in your environment and you suddenly realized that your test machine can no longer access…anything. Well, this may be due to the “accidental” enabling of “Forced Tunneling” in your DA configuration. How do you fix it? You can pretty easily reconfigure your DA configuration to disable Forced Tunneling, but unless your test machine is directly connected to your AD environment, you’ll never be able to get the Group Policy updates on your test machine. Now, you *should* be connected if you’re doing this, but there are some situations where that’s less than possible (Remote workers unite!).

Disabling Forced Tunneling on Client Machines

I’ll give the way to fix this first, and explain why this happens second:

  1. Open Regedit
  2. Navigate to HKLM:\Software\Policies\Microsoft\Windows\TCPIP\v6Transition
  3. Set all visible entries to Disabled
  4. Delete all subkeys.
  5. Reboot
  6. Rejoice
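If you would rather look before you leap, the steps above can be sketched as a read-only dry run in Python. This is only a sketch under a few assumptions: the function names are mine, the value and subkey names in the example are illustrative, and nothing here writes to the registry. The read_snapshot function uses the standard winreg module, so it only works on Windows, while plan_cleanup just works on whatever snapshot you hand it.

```python
from __future__ import annotations

KEY_PATH = r"Software\Policies\Microsoft\Windows\TCPIP\v6Transition"


def read_snapshot(path: str = KEY_PATH) -> tuple[dict[str, str], list[str]]:
    """Windows only: read the values and subkey names under HKLM for the given path."""
    import winreg  # standard library, but only present on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        n_subkeys, n_values, _ = winreg.QueryInfoKey(key)
        values = {}
        for i in range(n_values):
            name, data, _type = winreg.EnumValue(key, i)
            values[name] = data
        subkeys = [winreg.EnumKey(key, i) for i in range(n_subkeys)]
    return values, subkeys


def plan_cleanup(values: dict[str, str], subkeys: list[str]) -> list[str]:
    """List the changes steps 3 and 4 would make, without making them."""
    actions = [
        f'set "{name}" = "Disabled"'
        for name, data in values.items()
        if data != "Disabled"
    ]
    actions += [f'delete subkey "{name}"' for name in subkeys]
    return actions


# Illustrative snapshot; on a real machine use: values, subkeys = read_snapshot()
example = plan_cleanup({"Teredo_State": "Enabled", "6to4_State": "Disabled"}, ["IPHTTPS"])
for line in example:
    print(line)
```

The printed plan is just a checklist. You still make the actual changes in Regedit and reboot afterwards.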

Why Does This Happen?

Well, if your DA configuration is not configured perfectly, you can’t initialize a DA session. So if, for instance, your Computer Client Certificate fails to enroll properly before you disconnect, and your machine has obtained the DA settings from Group Policy, you’re stuck dealing with all the settings required to connect to DA, but can’t actually do so. With Forced Tunneling enabled, you are forcing all DA client systems to go through DA for *any* internet connectivity. So if your DA DNS settings also configure things to point to an Internal IP for DNS lookups when connected, congratulations…you can’t reach a dang thing. Disabling Forced Tunneling in the registry is about your only option here. Just make sure you’ve also disabled Forced Tunneling in your DA config before you disconnect from the VPN again or you’ll have to do this stuff all over again. (Oops)

Final Note

Don’t use Forced Tunneling with Direct Access. It provides no additional security and is a huge pain in the butt if DA doesn’t connect properly for *any* reason.

Anatomy of a Certificate Error

The most important step in diagnosing a specific security error involves determining what the error is telling you. There are a few things that can cause certificate errors, and what you do depends entirely on what is causing the error to begin with. Once you know what the error is telling you, it becomes much easier to figure out what you need to do next.

Getting the Message

One of the more concise and effective Certificate Errors is the one delivered by Outlook. An image of it is below.

 

[Image: ssl-name (Outlook certificate error dialog, with the three checks numbered 1, 2, and 3)]

Note the numbers 1, 2, and 3. These don’t normally show up on the error; I put them there for reference, in case you are comparing it against your own error. At any rate, the numbers are sitting next to three possible kinds of errors you can get with a certificate.

For this particular error, you’ll note that there is a red X next to number 3. That X points out that one of the Validity checks run against the certificate failed. Specifically, the name I used to access the server doesn’t match either the Common Name on the certificate or any of the Subject Alternative Names. This is probably the most common certificate error you’ll see.

The Four Checks

Every time you access a website that is secured with SSL, there are four checks the computer you use runs to verify that the certificate is valid. The reason for these checks is explained in my article on Digital Certificates. The four checks are as follows, and match the numbering in the image above.

  1. Was the Certificate issued by a known and trusted Certificate Authority?
  2. Is the current date within the period of time the Certificate is valid?
  3. Does the host name used to access the server match any of the host names defined by the certificate?
  4. Has the Certificate been Revoked? (Wait, there’s no 4 on the image! Don’t worry, I’ll explain later.)

If any of these checks fails, you’ll get a certificate error. Note that this *does not mean* that the data you’re trying to encrypt isn’t going to be encrypted. Any time you use SSL or TLS, your data will be encrypted whether the certificate is valid or not. However, if any of the checks fail it is much more likely that someone could decrypt the data you encrypt. Here’s why, based on each of the possible certificate errors.
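To make the four checks a little more concrete, here is a toy model of them in Python. This is not how any real TLS library does it (a real client verifies signatures up a chain and fetches revocation data); the Cert class, the field names, and the sample values are all made up for illustration, and the host name check is a simple exact match.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass
class Cert:
    issuer: str            # name of the CA that issued the certificate
    not_before: datetime   # start of the validity period
    not_after: datetime    # end of the validity period
    host_names: list[str]  # Common Name plus any SANs
    serial: str            # serial number, used for revocation lookups


def failed_checks(cert: Cert, host: str, now: datetime,
                  trusted_issuers: set[str], revoked_serials: set[str]) -> list[str]:
    """Run the four checks in order and return the ones that fail."""
    failures = []
    if cert.issuer not in trusted_issuers:                        # check 1
        failures.append("untrusted CA")
    if not (cert.not_before <= now <= cert.not_after):            # check 2
        failures.append("outside validity period")
    if host.lower() not in [n.lower() for n in cert.host_names]:  # check 3
        failures.append("host name mismatch")
    if cert.serial in revoked_serials:                            # check 4
        failures.append("revoked")
    return failures


cert = Cert(
    issuer="Example Trusted CA",
    not_before=datetime(2024, 1, 1),
    not_after=datetime(2025, 1, 1),
    host_names=["acbrownit.com"],
    serial="01",
)
problems = failed_checks(cert, "www.acbrownit.com", datetime(2024, 6, 1),
                         {"Example Trusted CA"}, set())
print(problems)  # ['host name mismatch']
```

An empty list means no certificate error. Anything else is what the warning dialog is trying to tell you about.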

Was the Certificate issued by a known and trusted Certificate Authority?

Certificate Authorities are servers that are designed specifically to generate digital certificates. Anyone on the planet can create a Certificate Authority server if they want to (and know how to). If you have your own Certificate Authority, you can create a certificate that matches any Common Name you want and use that certificate to interject yourself into any secure transmission and read the data without anyone knowing, but only if the client computer *trusts* your Certificate Authority.

Normally, most computer Operating Systems and Web Browsers have a list of CAs that are trusted right out of the box. These include CAs owned by companies like GoDaddy, Entrust, and Network Solutions. So unless you happen to gain control of a Certificate Authority owned by one of these companies and defined by a Root CA Certificate installed on the OSes of every computer in the world, your CA is probably not going to be trusted without a lot of extra work.

If you see a certificate error that warns the Certificate Authority isn’t trusted, it means the Certificate was issued by a *private* CA. You can instruct your computer to trust the CA if you want, but if you are using a site that normally has no certificate error and this error suddenly shows up one day, there’s a good chance your data is being intercepted and redirected.

As an IT Professional, if you see this error when accessing a system under your control, there are two solutions.

  1. Request a new certificate from a trusted, Third Party Root CA provider.
  2. Install the Root CA certificate as a Trusted Third Party Root CA in the OS.

#1 requires significantly less effort to accomplish because it means you don’t have to actually install the certificate on your users’ computers, phones, or other devices.

Is the current date within the period of time the Certificate is valid?

Certificates are only valid for a set period of time. Most certificates are valid for 1 to 3 years from the time they are first generated, depending on options used during certificate generation. Certificate Validity periods are meant to ensure that only a limited amount of time is available for a certificate’s Private Key to be discovered.

The possibility of a brute force attack successfully discovering the Private Key in use is astronomically small, and the time to run a full brute force attack against modern Certificates is in the million year period. But as we progress technologically, the time required reduces exponentially. If a certificate was generated in, say, 1991 using the DES algorithm, it would have taken thousands of years to crack it with normal computing resources. Today, it would take less than an hour.

Having a certificate validity period ensures that technology doesn’t outpace the security of the certificate. Having a validity period between 1 and 3 years is the general recommendation for certificates these days. If you run across a certificate that has an expiration date that is more than 2-3 years in the past, I highly recommend not using the site that uses that Certificate.

If a server you control has this error, you need to generate and install a new certificate on the server. This is the only possible solution to this error.

Does the host name used to access the server match any of the host names defined by the certificate?

This error is always caused by attempts to access a server using a URL that uses a host name not included on the certificate. For instance, let’s say a web server has a certificate that defines the host name as acbrownit.com. If you attempt to reach that server using https://www.acbrownit.com, you’ll get a certificate error.

This check is meant to ensure that the server we are communicating with is the one we *want* to communicate with. If the server we want to talk to is using a valid third-party certificate, we can be significantly more certain that the server we’re talking to is the one we want to talk to and that no one is attempting to spy on the data we send if this check comes back okay. If not, it’s important to check the information listed on the certificate to verify that we’re talking to the right server.

For IT Professionals, there are two definite solutions for this error.

  1. Generate a certificate that matches the Host Name you want people to use to access your server. If there is a need for multiple names, get a SAN cert that includes all host names or a Wildcard cert that is valid for any host name at a specific domain. (Wildcard certificates are generated with a common name of *.domain.com and will be considered valid for any single host name label that you want to replace the * with, such as mail.domain.com. This is slightly less secure since it can be used on any number of servers, but the security difference is minimal. Be certain to verify that the web server you are using fully supports wildcard certificates before obtaining one. IIS supports them, as do the vast majority of Microsoft solutions, though some may require additional setup.)
  2. Create a DNS record for a host name that matches the certificate and point it to the web server.

Note: There are some applications that use HTTPS that may have specific host name requirements, and may require multiple host names to function properly (Exchange Autodiscover for example). Be aware that this type of certificate error will always occur unless you have a certificate that matches *all* the necessary host names or have made sufficient configuration changes to allow things to work properly with a single host name.
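The host name check itself is simple enough to sketch in a few lines of Python. This is a simplified illustration with function names of my own choosing, not the code any particular browser or mail client runs. It assumes the usual rule that a wildcard stands in for exactly one label, so *.acbrownit.com covers www.acbrownit.com but not the bare acbrownit.com.

```python
def name_matches(pattern: str, host: str) -> bool:
    """True if one certificate name (possibly a wildcard) covers the host."""
    p = pattern.lower().split(".")
    h = host.lower().split(".")
    if len(p) != len(h):
        return False  # a wildcard only replaces a single label
    if p[0] == "*":
        return p[1:] == h[1:]
    return p == h


def cert_covers(host: str, cert_names: list) -> bool:
    """True if any name on the certificate (Common Name or SAN) covers the host."""
    return any(name_matches(name, host) for name in cert_names)


print(cert_covers("www.acbrownit.com", ["acbrownit.com"]))    # False
print(cert_covers("www.acbrownit.com", ["*.acbrownit.com"]))  # True
print(cert_covers("acbrownit.com", ["*.acbrownit.com"]))      # False
```

The first line is the error from the example above. The other two show why a wildcard cert fixes the www name but still needs the bare domain added as a SAN if people use it.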

Has the Certificate been Revoked?

This is actually a very unusual error that you will not see often. I don’t have a picture of one of these to show you, since it takes a good bit of effort to force the error to occur. Certificate Revocation is not particularly common, and was developed to combat the possibility of a certificate being compromised. A certificate is considered compromised when an unauthorized entity obtains a copy of the certificate’s Private Key.

If this happens, or if the certificate is reissued for any reason (For instance, if you want to change the common name, modify the list of Subject Alternative Names, or make any other changes), the old certificate is listed in a Certificate Revocation List (CRL) that is published by the Certificate Authority that originally generated the certificate. A CRL is just a signed list, published at a URL, that a web browser or other application that checks certificate validity will go to and check to determine if the certificate is still valid. If the certificate is listed in the CRL, many applications are designed to absolutely refuse further communication with the server using that certificate (Web browsers specifically). Servers using revoked certificates are always considered to be compromised, and it is always a good idea to avoid using servers with revoked certificates. Basically, if you see this error, *DO NOT CONTINUE!*.

Brown’s Anatomy

So that’s it for Certificate errors. The 4 checks are designed primarily to keep your data safe, so make sure you are aware of what you’re walking into when you see these errors. As a regular joe, non-IT person, you’re pretty likely to run into these errors, and knowing what they mean will help you determine if it’s a good idea to keep going or not. For IT people, you are going to see these errors a lot, no matter what, and knowing what they mean will help you fix them.

 

How Does Exchange Autodiscover Work?

Autodiscover is one of the more annoying features of Exchange since Microsoft reworked the way their Email solution worked in Exchange 2007. All versions since have implemented it and Microsoft may eventually require its use in versions following Exchange 2016. So what is Autodiscover and how does it work?

Some Background

Prior to Exchange 2007, Outlook clients had to be configured manually. In order to do that, you had to know the name of the Exchange server and use it to configure Outlook. Further, if you wanted to use some of the features introduced in Exchange 2003 SP2 and Outlook 2003 (and newer), you had to manually configure a lot of settings that didn’t really make sense. In particular, Outlook Anywhere requires configuration settings that might be a little confusing to the uninitiated. This got even more complicated in larger environments that had numerous Exchange servers but could not yet afford the expense of a load balancer.

The need to manually configure email clients resulted in a lot of administrative overhead, since Exchange admins and Help Desk staff were often required to configure Outlook for users or provide a detailed list of instructions for people to do it themselves. As most IT people are well aware, even the best set of instructions can be broken by some people, and an IT guy was almost always required to spend a lot of time configuring Outlook to talk to Exchange.

Microsoft was not deaf to the cries of the overworked IT people out there, and with Exchange 2007 and Outlook 2007 introduced Autodiscover.

Automation Salvation!

Autodiscover greatly simplifies the process of configuring Outlook to communicate with an Exchange server by automatically determining which Exchange server the user’s Mailbox is on and configuring Outlook to communicate with that server. This makes it much easier for end users to configure Outlook, since the only things they need to know are their email address, AD user name, and password.

Not Complete Salvation, Though

Unfortunately, Autodiscover didn’t completely dispense with the need to get things configured properly. It really only shifted the configuration burden from Users over to the Exchange administrator, since the Exchange environment has to be properly configured to work with Autodiscover. If things aren’t set up properly, Autodiscover will fail annoyingly.

How it Works

In order to make Autodiscover work without user interaction, Microsoft developed a method for telling Outlook where it needed to look for the configuration info it needed. They decided this was most easily accomplished with a few DNS lookups based on the one piece of information that everyone had to put in regardless of their technical know-how, the email address. Since they could only rely on getting an email address from users, they knew they’d need to have a default pattern for the lookups, otherwise the client machines would need at least a little configuration before working right. Here’s the pattern they decided on:

  1. Look in Active Directory to see if there is information about Exchange
  2. Look at the root domain of the user’s email Address for configuration info
  3. Look at autodiscover.emaildomain.com for configuration info
  4. Look at the domain’s root DNS to see if any SRV records exist that point to a host that holds configuration info.

Note here that Outlook will only move from one step to the next if it doesn’t find configuration information.

For each step above, Outlook is looking for a specific file or a URL that points it to that file. The file in question is autodiscover.xml. By default, this is kept at https://<exchangeservername>/autodiscover/autodiscover.xml. Each step in the check process will try to find that file and if it’s not there, it moves on. If, by the end of step 4, Outlook finds nothing, you’ll get an error saying that an Encrypted Connection was unavailable, and you’ll probably start tearing your hair out in frustration.
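The lookup order is easier to see when you write it out for a real address. Here is a small Python sketch of the pattern, not Outlook’s actual code. The function names and the sample address are mine, the Active Directory step is represented by a label only, and the SRV record name shown (_autodiscover._tcp) is the commonly documented one, so treat it as an assumption if your environment differs.

```python
from __future__ import annotations

from typing import Callable, Optional


def autodiscover_candidates(email: str) -> list[tuple[str, str]]:
    """Return (step, target) pairs in the order they are tried."""
    domain = email.rsplit("@", 1)[1].lower()
    return [
        ("Active Directory", "lookup in AD (domain-joined clients only)"),
        ("Root domain", f"https://{domain}/autodiscover/autodiscover.xml"),
        ("Autodiscover host", f"https://autodiscover.{domain}/autodiscover/autodiscover.xml"),
        ("SRV record", f"_autodiscover._tcp.{domain}"),
    ]


def first_source(email: str, probe: Callable[[str], bool]) -> Optional[str]:
    """Walk the steps in order; stop at the first one where probe() finds config."""
    for step, target in autodiscover_candidates(email):
        if probe(target):
            return step
    return None  # nothing found: this is where the error shows up


# Pretend only the autodiscover host answers.
found = first_source("user@acbrownit.com", lambda t: t.startswith("https://autodiscover."))
print(found)  # Autodiscover host
```

The probe function is a stand-in for whatever actually performs the lookup. Swapping in different probes is a handy way to reason about which step a given DNS or AD setup will land on.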

What’s in the File?

Autodiscover.xml is a dynamically generated file written in XML that contains the information Outlook needs to access the mailbox that was entered in the configuration wizard. When Outlook makes a request to Exchange Autodiscover, the following things will happen:

  1. Exchange requests credentials to access the mailbox.
  2. If the credentials are valid, Exchange checks the AD attributes on the mailbox that has the requested Email address.
  3. Exchange determines which server the Mailbox is located on. This information is usually stored in the msExchHomeServerName attribute on the associated AD account.
  4. Exchange examines its Topology data to determine the best Client Access Server (CAS) to use for access to the mailbox. The Best CAS is determined using the following checks:
    1. Determine AD Site the Mailbox’s Server is located in
    2. Determine if there is a CAS assigned to that AD site
    3. If no CAS is in the site, use Site Topology to determine next closest AD Site.
    4. Step 3 is repeated until a CAS is found.
  5. Exchange returns all necessary configuration data stored in AD for the specific server. The configuration data returned is:
    1. CAS server name
    2. Exchange Web Services URL
    3. Outlook Anywhere Configuration Data, if enabled.
    4. Unified Communications Server info
    5. MAPI over HTTP proxy server address (if that is enabled)
  6. Outlook will take the returned information and punch it into the necessary spots in the user’s profile information.
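The “best CAS” search in step 4 boils down to walking outward from the mailbox’s AD site until a site with a CAS turns up. Here is a rough Python sketch of that idea. The site names, server names, and data structures are entirely hypothetical, and the real topology logic in Exchange considers more than this.

```python
from __future__ import annotations

from typing import Optional


def pick_cas(mailbox_site: str,
             cas_by_site: dict[str, list[str]],
             sites_by_distance: dict[str, list[str]]) -> Optional[str]:
    """Return a CAS from the mailbox's site, or from the nearest site that has one.

    sites_by_distance maps a site to its other sites, nearest first.
    """
    for site in [mailbox_site] + sites_by_distance.get(mailbox_site, []):
        servers = cas_by_site.get(site)
        if servers:
            return servers[0]
    return None  # no CAS anywhere in the topology


# Hypothetical topology: the mailbox lives in Austin, but only Dallas has a CAS.
cas_by_site = {"Dallas": ["CAS-DAL1"]}
sites_by_distance = {"Austin": ["Houston", "Dallas"]}
chosen = pick_cas("Austin", cas_by_site, sites_by_distance)
print(chosen)  # CAS-DAL1
```

Whatever server comes out of that search is the one whose configuration data gets handed back to Outlook, which is why the URLs set on each CAS matter so much.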

Necessary Configuration

Because all of this is done automatically, it is imperative that the Exchange server is configured to return the right information. If the information returned to Autodiscover is incorrect, either the mailbox connection will fail or you’ll get a certificate error. To get Autodiscover configured right, parts 5.1, 5.2, 5.3, and 5.5 of the above process must be set. This can be done with a script, in the Exchange Management Shell, and in the Exchange Management UI (EMC for 2007 and 2010, ECP/EAC for 2013/2016).

Importance of Autodiscover

With the release of Outlook 2016, it is no longer possible to configure server settings manually in Outlook. You must use Autodiscover. Earlier versions can avoid using it by manually configuring each Outlook client. However, before doing that, consider the cost of having to touch each and every computer to properly configure Outlook. It can take 5 minutes or more to configure Outlook on one computer using the manual method, and with Exchange 2013 it can take longer as you also are required to input Outlook Anywhere configuration settings, which are more complex than just entering a server name, username, and password. If you multiply that by the number of computers you might have in your environment and add in the time it takes to actually get to the computers, boot them up, and get to the Outlook settings, the time spent configuring Outlook manually starts to add up very quickly. Imagine how much work you’d be stuck with configuring 100 systems!

In contrast, it usually only takes 10 to 20 minutes to configure Autodiscover. When Autodiscover is working properly, all you have to do is tell your users what their email address is and Outlook will do all the work for you. With a little more configuration or some GPO work, you don’t even have to tell them that!
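For a quick sanity check on those numbers, here is the comparison as a few lines of Python. The 5 minute and 20 minute figures come from the estimates above. The function name and the overhead parameter (walking to the machine, booting it, and so on) are my own additions and default to zero.

```python
def manual_hours(computers: int, minutes_each: float = 5, overhead_minutes_each: float = 0) -> float:
    """Total hours spent configuring Outlook by hand on every computer."""
    return computers * (minutes_each + overhead_minutes_each) / 60


AUTODISCOVER_MINUTES = 20  # upper end of the 10 to 20 minute estimate

by_hand = manual_hours(100)
print(f"Manual: {by_hand:.1f} hours, Autodiscover: {AUTODISCOVER_MINUTES / 60:.1f} hours")
# Manual: 8.3 hours, Autodiscover: 0.3 hours
```

And that is with zero overhead per machine. Add even a couple of minutes each for getting to the computer and the gap gets wider in a hurry.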

When you start to look at the vast differences in the amount of time you have to spend configuring Outlook, whether or not to use Autodiscover stops being a question of preference and starts being an absolutely necessary part of any efficient Exchange-based IT environment. Learning to configure it properly is, therefore, one of the most important jobs of an Exchange administrator.