Exchange Transaction Logs – Reducing the Confusion

Exchange Transaction Logs are, in my opinion, one of the most horribly documented parts of Exchange server. There’s a lot of misinformation out there as well as a lot of misunderstanding. If you look for an answer to questions that most people have about them, you’ll run across poorly written documentation that barely explains what they are, let alone how they work. In this post, I’ll be going over the basics of Transaction Logs and explaining what they are, how they work, and, more importantly, what they are for.

What are Transaction Logs?

Transaction logs are usually kept for any type of database, so knowing what a database is helps. To put a database in perspective, just think about something we’ve all had to work with at some point in time, a spreadsheet. If you’ve ever had to compile a list of numbers and figures in Excel, you’ve used a spreadsheet. Well, databases are basically collections of spreadsheets that are inter-related, extremely large, extremely complex (in some cases), and accessible to numerous users at the same time.

In order for a database to function with lots of users at the same time who may be making changes to the same data at the same time, database systems will typically write changes to data in a transaction log, and then apply the change to the database. This keeps the data in the database from being corrupted and ensures that changes are applied in the order they are made. In a database that has two people changing the same data at the same time, the database will compare the entries and accept the most recent change if they are different. So that’s essentially what a transaction log is. It’s a record of every single operation performed that changes the state of any data in the database. Adding a new item, deleting an old item, modifying an existing item, all these functions are recorded in a transaction log before being applied to the database itself. At the very least, this is more or less a simplified explanation of how SQL handles transaction logs. For database systems like SQL, transaction logs are *extremely* important.
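To make the "log it, then apply it" idea concrete, here is a minimal Python sketch. It is a toy model, not how SQL or Exchange is implemented: the TinyDatabase class, its method names, and the mailbox-style keys are all made up for illustration.

```python
class TinyDatabase:
    """Toy key-value store that records every change in a log before applying it."""

    def __init__(self):
        self.log = []    # ordered record of every state-changing operation
        self.data = {}   # the "database" itself

    @staticmethod
    def _apply(data, entry):
        _, op, key, value = entry
        if op == "set":
            data[key] = value
        elif op == "delete":
            data.pop(key, None)

    def _record_then_apply(self, op, key, value=None):
        entry = (len(self.log) + 1, op, key, value)  # (sequence, op, key, value)
        self.log.append(entry)          # 1. write the change to the log first
        self._apply(self.data, entry)   # 2. then apply it to the database

    def set(self, key, value):
        self._record_then_apply("set", key, value)

    def delete(self, key):
        self._record_then_apply("delete", key)

    def rebuild_from_log(self):
        """Replay the whole log, in order, against an empty store."""
        rebuilt = {}
        for entry in self.log:
            self._apply(rebuilt, entry)
        return rebuilt


db = TinyDatabase()
db.set("inbox/1", "Hello")
db.set("inbox/2", "Meeting at 3")
db.delete("inbox/1")
print(db.data)
print(db.rebuild_from_log() == db.data)
```

Because every change went through the log first, replaying the log from the beginning reproduces the current state of the data, which is the property the rest of this post relies on.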

Exchange, on the other hand, doesn’t have the same flexibility of a highly customizable database solution like SQL. Exchange Databases are designed to handle a limited set of functions. So, much of the work in Exchange is very simple to manage. Data is automatically segregated in individual Mailboxes and those are not usually accessed by numerous users at the same time, and not much of the data stored in an Exchange database is modified regularly. Once an email is stored on an Exchange server, it doesn’t change. If an item does change in the database, it is usually recreated as a completely new object and the old version removed, rather than there being a direct modification to the stored data for that item. As a result, Exchange is not nearly as dependent on transaction logs as SQL.

How Does Exchange Use Transaction Logs?

Every time an email is delivered, sent, deleted, or forwarded, Exchange will write the information about that transaction directly to the transaction logs, then immediately to the database. The time difference between transaction log and database writes is measurable in milliseconds.

Exchange writes Transaction logs for a single purpose: database recovery. Suppose the database that holds all your mailbox information fails. Let’s say someone drops a giant anvil on your Mailbox server, because you never know when Wile E. Coyote will strike out in anger (this is a major concern for the IT department at ACME Inc.). If your database ever gives up the metaphorical ghost, you will need to go back to your most recent backup to do a restore. The problem in that situation is that when you restore a backup of a database, you will usually end up restoring a copy that isn’t up to date with the most recent transactions. So if the last full backup you ran was on a Sunday and the live database fails on Friday, the database you restore from that full backup will be missing all the email that was sent and received between Sunday and Friday. This is where transaction logs come in. The entire purpose of transaction logs in Exchange is to provide information on the transactions that have occurred since the last time you ran a complete backup of your Exchange environment.

How Transaction Logs Work with the Database

One of the first things you do when configuring Exchange is define where the Database and Log files are stored. This is actually a lot more important than you might think. If you go to the location where your Exchange Transaction Logs are stored, the first thing you will notice is that there are a lot of log files there. Transaction Log files max out at a set size to keep down the risks of Transaction Log corruption. If all the transactions were stored in a single file and that file was corrupted somehow, you’d lose entire days of email. With multiple files, one file can be corrupted and you’d lose the ability to restore maybe an hour or two of email, which isn’t nearly as big a deal.

Each transaction log file has a name that starts with the letter E and then a string of numbers, followed by the .log extension. You will also see a similarly named file with a .chk extension and a bunch of files named Eres<numbers>.jrs. The .jrs files are used by Exchange to make sure things don’t explode if the drive fills up for some reason. The .log files are the actual Transaction Logs that are saved, and the .chk file is used to determine what the most recent transaction log file name is as well as which transaction logs belong to which database.

The names of these files are important because they represent the order in which the logged transactions occurred. Transactions in E00123.log occurred before those in E00124.log, and so on. Each time a log file reaches its size limit, a new file is created with an incremented number and the .chk file is updated. Another thing to remember is that the name of the last transaction log that contains the most recently applied transactions is written as a property of the actual database file that Exchange uses.
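If you ever need to put a folder listing into replay order, a small script can do it. The Python sketch below assumes the simplified naming used in this post (the letter E, then digits, then .log) and treats the digits as a hexadecimal sequence number. The function names and the sample file names are hypothetical, and real log folders may contain names this pattern does not cover.

```python
import re

# Assumed pattern: "E" + hex digits + ".log" (case-insensitive).
LOG_NAME = re.compile(r"^E([0-9A-F]+)\.log$", re.IGNORECASE)


def log_sequence(filename):
    """Return the sequence number encoded in a log file name, or None."""
    match = LOG_NAME.match(filename)
    if match is None:
        return None          # .chk, .jrs, and anything else is ignored
    return int(match.group(1), 16)


def ordered_logs(filenames):
    """Return only the log files, sorted from oldest to newest."""
    numbered = [(log_sequence(name), name) for name in filenames]
    return [name for seq, name in sorted(p for p in numbered if p[0] is not None)]


folder = ["E00124.log", "E00.chk", "Eres00001.jrs", "E00123.Log", "E0012A.log"]
print(ordered_logs(folder))
```

Sorting on the parsed number rather than on the raw file name keeps the order right even if the names differ in letter case.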

Now we get to the part where the transaction logs are important. When you mount any Exchange database, the Exchange server will do the following:

  1. Read the last transaction log property on the database (Assuming the database was properly shut down).
  2. Examine the .chk file in the Log Files directory to determine what the last log file that *should* be applied to the database is named.
  3. Examine the names of the Transaction Log files in the transaction log directory assigned to the database in Exchange.
  4. If the .chk file says that the last transaction log has a higher number than what is recorded by the database, the Exchange system will begin “replaying” the log files in the directory, applying every single transaction that occurred between what the Database you mount last saw and what the .chk file says should be the last log file. This is the step that completes the restore process.

When all of the available logs finish being replayed to the database, your database will have returned to the exact state it was in when that last log file was written. The end result is a restored database that is in the exact state the original database was in before failing. Note that this process can only occur if the database is mounted in a Recovery Storage Group (For Exchange 2003/2007), or as a Recovery Database (Exchange 2010/2013), or if the active database is flagged as allowing over-write.
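The replay steps above can be sketched in a few lines of Python. This is only a model of the logic as described in this post, not Exchange's actual code: the dictionary layout, the replay_logs function, and the sample messages are all assumptions made for illustration.

```python
def replay_logs(database, logs, checkpoint_seq):
    """Apply logged transactions that are newer than the database's last applied log.

    database:       {"last_log": int, "items": dict}
    logs:           {sequence_number: [(op, key, value), ...]}
    checkpoint_seq: the last log number that *should* be applied
    """
    seq = database["last_log"] + 1
    # Stop at the checkpoint, or earlier if the next log file is missing.
    while seq <= checkpoint_seq and seq in logs:
        for op, key, value in logs[seq]:
            if op == "add":
                database["items"][key] = value
            elif op == "delete":
                database["items"].pop(key, None)
        database["last_log"] = seq
        seq += 1
    return database


# A database restored from backup that last saw log 2.
restored = {"last_log": 2, "items": {"msg1": "Sunday status report"}}
logs = {
    3: [("add", "msg2", "Monday minutes")],
    4: [("add", "msg3", "Tuesday invoice"), ("delete", "msg1", None)],
    5: [("add", "msg4", "Friday reminder")],
}
replay_logs(restored, logs, checkpoint_seq=5)
print(restored)
```

After the loop finishes, the restored copy holds everything the logs recorded up to log 5, which is the "exact state" the paragraph above describes.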

So basically, the only real reason the transaction logs exist is to perform database restoration. This is why the Microsoft Best Practices state that the Transaction Logs should be on a completely different physical drive than the Database files they are associated with. If the drive that holds the database fails for some reason, you can always use the transaction log files to bring a restored database to a state that has the most recent data. And because all transactions are written to the logs *and* the database files as soon as they happen, losing your log file drive will not cause you to lose any data either. If your logs drive fails, though, you may need to run a little bit of maintenance on the database files with ESEUTIL to put them into a clean state before they will mount properly. The logs are designed to provide “Point In Time” database recovery.

Point In Time Recovery

Point in Time Recovery is a function that allows you to restore a database to the state it was in at an exact point in time. For instance, let’s say someone requests that you restore a mailbox that was deleted on Wednesday of two weeks ago at 2:14PM. For this situation, let’s assume you run full backups every Sunday and incremental backups every day. If you restore the mailbox from the backup taken before that Wednesday, you may be missing some mail. If you restore the database from the backup Wednesday night, you won’t get the mailbox. So what do you do? Well, you do a Point in Time Recovery. The way you do this is you restore the Database from the last full backup that was run before the point in time you want to restore to, then you restore all the log files between then and Wednesday night’s incremental backup. Once you have all the logs and database in a good location, you would create an RSG or Recovery Database that points to that location, and then look in the folder you saved the logs to. Each of the logs will have a timestamp on it that should carry over from the backup. This timestamp will allow you to pinpoint the log file that was written right before the mailbox was deleted. Once you find that, you delete every log that came after it, then mount the Database in Exchange. Replay will head toward the log that the restored .chk file names as the last one, but it will stop at the last log file that is actually available. So if the last log file available is the one written at 2:13PM on Wednesday, when the database finishes replaying the available logs, it will be in the exact state it was in when that last log file was written. And there you go, you have a database that has as much mail as possible in the deleted mailbox, which you can then restore normally.
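Picking the logs to keep is just a timestamp comparison. The Python sketch below shows the selection, with made-up file names and dates standing in for the timestamps on the restored logs; the logs_before function is hypothetical and nothing here touches a real Exchange server.

```python
from datetime import datetime


def logs_before(log_times, deleted_at):
    """Return the names of logs written before deleted_at, oldest first."""
    earlier = [(written, name) for name, written in log_times.items() if written < deleted_at]
    return [name for written, name in sorted(earlier)]


# Hypothetical timestamps carried over from the backup.
restored_logs = {
    "E00120.log": datetime(2013, 6, 12, 13, 40),
    "E00121.log": datetime(2013, 6, 12, 14, 13),
    "E00122.log": datetime(2013, 6, 12, 14, 20),
    "E00123.log": datetime(2013, 6, 12, 15, 5),
}
deleted_at = datetime(2013, 6, 12, 14, 14)   # the mailbox was deleted at 2:14PM

keep = logs_before(restored_logs, deleted_at)
discard = sorted(set(restored_logs) - set(keep))
print("keep:", keep)
print("discard:", discard)
```

The logs in the discard list are the ones you would remove from the recovery folder before mounting, so replay stops at the 2:13PM log.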

Log Growth

One of the big problems that impacts Exchange servers is out of control Log growth. Logs are written constantly and there are only two ways they can be deleted. The proper way to delete log files is to perform a Full, Exchange Aware backup. If the backup software you use is not designed to perform Exchange Database backups, your logs will never ever get cleared and you will run out of drive space, which will force all databases with log files on the full drive to dismount and the Exchange server to explode (not really. It’ll just stop working). When you run a full backup that is Exchange Aware, the backup software instructs the Exchange system to “truncate” the logs. In older database systems, truncating the logs meant that the changes in the logs were written to the database and the files removed. These days changes to the database are written directly to the database, so when the system Truncates the logs, it basically just deletes them, but it does so in a way that allows the Database to stay operational.
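As a rough model of what truncation does after a full backup, consider the Python sketch below. It assumes, as a simplification, that a log is safe to remove only when it has been captured by the backup and its changes have already been applied to the database. The truncate_logs function and the log numbers are invented for illustration.

```python
def truncate_logs(log_numbers, backed_up, last_applied):
    """Simulate post-backup truncation and return (kept, removed) log numbers.

    log_numbers:  log sequence numbers currently on disk
    backed_up:    set of log numbers captured by the full backup
    last_applied: highest log number already applied to the database
    """
    removed = [n for n in sorted(log_numbers) if n in backed_up and n <= last_applied]
    kept = [n for n in sorted(log_numbers) if n not in removed]
    return kept, removed


on_disk = [101, 102, 103, 104, 105, 106]
kept, removed = truncate_logs(on_disk, backed_up={101, 102, 103, 104, 105}, last_applied=104)
print("removed:", removed)
print("kept:", kept)
```

In this model a backup tool that never reports a completed backup leaves the backed_up set empty, so nothing is ever removed and the log drive keeps filling.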

The other option, deleting the log files manually, doesn’t work if the database the logs belong to is mounted. So you should always try to avoid deleting log files manually unless it’s an extreme emergency. And by Extreme Emergency I mean you haven’t run a full backup in a long time and have a completely full log file drive with about 300GB of logs. If you run into that situation, you pretty much *have* to delete the log files manually. Running a full backup on that many log files can take several days to complete, because the truncation process goes through each log file to make sure its changes were applied to the database. If the Database is dismounted, it is acceptable to delete log files, but you should only do so with the understanding that you will not be able to perform a Point in Time restore from the last backup to the point in time where the logs were deleted. (Point in Time recovery requests are fairly rare, in my experience, but they do happen, especially in larger companies with a lot of legal requirements.)

Circular Logging

Now, if you are okay with not having the ability to do a Point In Time restore, you can configure Exchange to use a feature called Circular Logging. Circular Logging causes the Exchange server to retain only the latest 6 or 7 log files. Log files past that are automatically deleted, so you never have to deal with out of control log growth, and you also never have to run a full Exchange aware backup to clear log files. You would use this option if your backup solution doesn’t include support for Exchange server, if you don’t have a lot of space for logs, or if you just don’t care about dealing with logs for Point in Time restores. Another situation where you would use Circular Logging is if you have a Database Availability Group with at least three copies of each database. If you configure one copy to be Lagged (A lagged database copy waits a certain amount of time before writing transactions to the database), you can run Exchange in a No Backup mode. I’ll go into more detail on this feature in a later post, but for now, just understand that if you have enough database copies and at least one Lagged copy, you already have enough functionality to do Point in Time restores going back at most 14 days, and you are pretty well protected from Database failures.
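The "keep only the latest few files" behavior is easy to picture as a fixed-size queue. Here is a small Python sketch of that idea. The CircularLog class, the file naming, and the retention count of 6 are assumptions taken from the description above rather than from Exchange itself.

```python
from collections import deque


class CircularLog:
    """Toy model of circular logging: only the newest `keep` log files survive."""

    def __init__(self, keep=6):
        self.files = deque(maxlen=keep)   # the oldest entry drops off automatically
        self.next_number = 1

    def roll_over(self):
        """Close the current log and start a new one with the next number."""
        name = f"E{self.next_number:05X}.log"   # hex sequence number, zero padded
        self.files.append(name)
        self.next_number += 1
        return name


log = CircularLog(keep=6)
for _ in range(10):
    log.roll_over()
print(list(log.files))
```

After ten rollovers only the six newest files remain, which is why log growth stops being a problem and also why there is nothing older left to replay for a Point in Time restore.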

Common Misconceptions

So now that I’ve explained how the logs work and what they do, let’s go over some common misconceptions about Transaction Logs:

  1. Transactions are only written to the logs and then the logs are written to the database – This misconception is due in part to how databases functioned in the early days. Nowadays, transactions are written to memory, disk, and logs at almost the exact same time. There is a little bit of lag time between them being written to log files and the database itself, but this lag time is so minuscule that it doesn’t really matter (fractions of a second).
  2. If I do a full backup every night, I can use circular logging – This is one of those sorta kinda maybe close to accurate things, but it’s mostly wrong because it ignores the primary purpose of log files, which is to bring a restored database up to the most recent possible state it was in when the original copy was destroyed. If you run full backups every night, you still need to make sure you’re keeping all the logs from that backup time to the next backup time, otherwise when you restore your backup you will be missing up to 24 hours’ worth of mail. If you’re okay with that limitation, then sure, use circular logging if you run daily full backups. Otherwise, keep circular logging off.
  3. Deleting the logs manually will corrupt the database – No, it won’t. As I mentioned, deleting the logs manually is sometimes necessary, and can be done at any time in more recent versions of Exchange. The danger in doing manual log purges is data loss. You never want to delete logs that haven’t been backed up (either a full backup or an incremental/differential backup). If you’ve cleared all your logs manually and the database dies, there is no way to recover any transactions from the logs that were deleted if the files themselves haven’t been backed up. A Full, Exchange aware backup will “truncate” the logs, which is geek speak for deleting all the log files created after backing them up. This is simply to free up space, because the transaction logs are no longer needed once they have been backed up.

19 responses to “Exchange Transaction Logs – Reducing the Confusion”

  1. great article , one question if my server did blow up on Friday and I restored email from my backups on sat Tuesday , how would having transaction logs help them i.e if the server blows up the trans logs are gone too ? So my backup wouldn’t contain the 3 days log files I’m missing ?

    • That’s correct. If the transaction logs and database files are lost, there is no way to recover mail between the failure and last backup. If you do incremental backups, you’ll usually have transaction logs for each day, and you can usually restore those and get them applied to the database.

  2. sorry typo’s in previous post meant this

    great article , one question if my server did blow up on Friday and I restored email from my backups on Tuesday , how would having transaction logs help then i.e if the server blows up the trans logs are gone too ? So my backup wouldn’t contain the 3 days log files I’m missing ?

  3. You could also compress a chunk of the log files, mount the database to get users back in and then run a full backup. This has got me out of a hole.

  4. If only you told us where these logs were located, I might like this. Told me everything I needed to know except where to find them.

    • The location isn’t always the same, since it is defined at database creation. You can usually get that info by running Get-MailboxDatabase | fl Name,LogFolderPath
      in the Exchange shell and looking at the log folder path there.

  5. Great article. There is one thing I’m wondering about:
    How are the objects (message items and/or attachments) stored?
    Are these also stored in the transaction log or is it just the information about the objects and the processes handling them that’s stored in the log files?
    And if so, does it mean that every message an Exchange server receive, is available through the logs (and is it stored binary or in blob or otherwise)?

    • Transaction logs store information on every change made to the database. Specifically, it includes all information necessary to apply those changes again should the need arise. If an email is sent to the server, the transaction log will contain the full text of the message and any attachments. The logs are written in binary form, and cannot be directly read, but can be read with the proper tools.

  6. Sorry if I am late to the party, but what about a complete mailbox deletion. if I remove a 10GB mailbox, will that immediately generate 10GB of transaction logs?

    • I haven’t had a chance to verify, but theoretically a mailbox delete should only generate a very small transaction, since we’re basically just instructing the database to erase data it already knows about. That shouldn’t really result in a large amount of data that needs to be logged.

      However, copying a new mailbox to a database will generate a significant amount of transactional data on the target DB, since the transaction will include all the data in the mailbox plus the transactional code used to add that data to the database. This is the main reason best practices dictate switching to Circular logging prior to performing large mailbox migrations.

  7. This is a very informative article. However, I have one question about a problem I am facing. I have a 3-node Exchange 2010 DAG (SP3). Two nodes are on the primary site and 1 node is on my DR site. Previously we were operating our DR site within the same city, so the huge number of logs copied to the DR site was not an issue. Recently, we moved our DR site to a different city (connected through a WAN link provided by a service provider). Now our entire bandwidth is being consumed by Exchange replication. If I suspend replication on DR for 2 hours, after 2 hours the Copy Queue Length becomes around 2500 (2.5GB), which I know is impossible as there is no such email send/receive during 2 hours.

    Now my question is, why is the number of transaction logs growing exponentially? Does it also record and increase the size of the transaction logs if:
    1) I move an email from one folder to another folder within my mailbox, either manually or through Outlook rules? For example, if I receive an email of 1MB and then afterward I move it into another folder, would it be recorded as 2 actions and 1MB for each action (email delivered, moved to a different folder)?

    2) I have enabled Outlook archiving? This means
    every email older than 10 days will be archived automatically into my local PST. For example, if 10MB of emails are moved from my mailbox to the local PST, would it also create 10MB of transaction logs as these emails have been removed from my mailbox?

    • Transaction logs store every command that is run against the exchange database, so the amount the logs grow depends on how much mail the environment deals with, both internally and externally. Sending a message from one user to another will cause that message to be stored in the logs, including any attachments. I don’t know the exact underlying code used to actually do this, but there is a good possibility that the logs will have each internal message show up twice or more, depending on the number of recipients. Now, once the messages are stored in the database, there’s a good bit of deduplication that occurs to reduce DB size, but this doesn’t occur in the transaction logs themselves.

      In addition to mail transactions, maintenance windows will generate a lot of logs due to the number of move, repair, delete, and other commands run against the database. All of this has to be logged directly to ensure the database can be restored properly. It’s not uncommon for logs to take up a lot of data over time, and 2.5gb isn’t an insane amount of logs per day, depending on the number of users there are. I’ve seen exchange environments generate 30gb of logs on an average day.

      As to your questions:
      1. I’m not sure on this one, but my guess is that it would write the message to the logs 1 time when received, then the move command would re-write some of the message properties, which would be another transaction, but not near 1MB worth. Probably closer to 2-3KB.
      2. The transaction logs will record everything you do, but deletes theoretically take less data in logs than receiving an email. In this example, the data writes to the PST file wouldn’t cause a transaction log change, since it’s a read (Transactions generally only include writes, changes, and deletions). However, if the data is subsequently deleted, it would generate a good bit of log info for each individual message deleted. For deletes, size doesn’t matter as much as number, so you’d see much more transactional data being written for 1000 1KB emails being deleted than for 10 1MB emails being deleted.

  8. Like you said in the opening, I’ve read many articles about transaction logs and none of them gave me clarity on how it actually works.
    I really really appreciate it. Thank you for this.
