Exchange Transaction Logs are a commonly misunderstood facet of Exchange Server. There’s a lot of misinformation out there as well as a lot of confusing documentation. In this post, I’ll be going over the basics of Transaction Logs and explaining what they are, how they work, and, more importantly, what they are for.
What are Transaction Logs?
Transaction logs are usually kept for any type of database, so knowing what a database is helps. To put a database in perspective, just think about something we’ve all had to work with at some point in time, a spreadsheet. If you’ve ever had to compile a list of numbers and figures in Excel, you’ve used a spreadsheet. Well, databases are basically collections of spreadsheets that are inter-related, extremely large, extremely complex (in some cases), and accessible to numerous users at the same time.
In order for a database to function with lots of users at the same time who may be making changes to the same data at the same time, database systems will typically write changes to data in a transaction log, and then apply the change to the database. This keeps the data in the database from being corrupted and ensures that changes are applied in the order they are made. In a database that has two people changing the same data at the same time, the database will compare the entries and accept the most recent change if they are different. So that’s essentially what a transaction log is. It’s a record of every single operation performed that changes the state of any data in the database. Adding a new item, deleting an old item, modifying an existing item, all these functions are recorded in a transaction log before being applied to the database itself. At the very least, this is more or less a simplified explanation of how SQL handles transaction logs. For database systems like SQL, transaction logs are *extremely* important.
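To make the "log first, then apply" idea concrete, here is a minimal Python sketch. It is a toy model only, not how SQL or Exchange actually implements logging; the class, function, and key names are all made up for illustration.

```python
class TinyDatabase:
    """Toy key-value store that records every change in a log before applying it."""

    def __init__(self):
        self.log = []   # ordered record of every change (the "transaction log")
        self.data = {}  # the "database" itself

    def write(self, key, value):
        self.log.append(("set", key, value))  # 1. record the change
        self.data[key] = value                # 2. apply it to the database

    def delete(self, key):
        self.log.append(("delete", key, None))
        self.data.pop(key, None)


def replay(log):
    """Rebuild the data from nothing but the log, in the order changes were made."""
    data = {}
    for op, key, value in log:
        if op == "set":
            data[key] = value
        elif op == "delete":
            data.pop(key, None)
    return data


db = TinyDatabase()
db.write("inbox/1", "Hello")
db.write("inbox/2", "Meeting at 3")
db.write("inbox/1", "Hello (edited)")
db.delete("inbox/2")

# Replaying the log from scratch ends in the same state as the live data.
rebuilt = replay(db.log)
print(rebuilt == db.data)
```

Because every change is in the log in order, the log alone is enough to reproduce the current state, which is the property the rest of this post relies on.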
Exchange, on the other hand, doesn’t have the same flexibility of a highly customizable database solution like SQL. Exchange Databases are designed to handle a limited set of functions. So, much of the work in Exchange is very simple to manage. Data is automatically segregated in individual Mailboxes and those are not usually accessed by numerous users at the same time, and not much of the data stored in an Exchange database is modified regularly. Once an email is stored on an Exchange server, it doesn’t change. If an item does change in the database, it is usually recreated as a completely new object and the old version removed, rather than there being a direct modification to the stored data for that item. As a result, Exchange is not nearly as dependent on transaction logs as SQL.
How Does Exchange Use Transaction Logs?
Every time an email is delivered, sent, deleted, or forwarded, Exchange will write the information about that transaction directly to the transaction logs, then immediately to the database. The time difference between transaction log and database writes is measurable in milliseconds.
Exchange writes transaction logs for a single purpose: database recovery. Suppose the database that holds all your mailbox information fails. Let’s say someone drops a giant anvil on your Mailbox server, because you never know when Wile E. Coyote will strike out in anger (this is a major concern for the IT department at ACME Inc). If your database ever gives up the metaphorical ghost, you will need to go back to your most recent backup to do a restore. The problem in that situation is that when you restore a backup of a database, you will usually end up restoring a copy that isn’t up to date with the most recent transactions. So if the last full backup you ran was on a Sunday and the live database fails on Friday, the database you restore from that full backup will be missing all the email that was sent and received between Sunday and Friday. This is where transaction logs come in. The entire purpose of transaction logs in Exchange is to provide information on the transactions that have occurred since the last time you ran a complete backup of your Exchange environment.
How Transaction Logs Work with the Database
One of the first things you do when configuring Exchange is define where the Database and Log files are stored. This is actually a lot more important than you might think. If you go to the location where your Exchange Transaction Logs are stored, you will first notice that there are a lot of log files there. Transaction Log files max out at a set size to keep down the risks of Transaction Log corruption. If all the transactions were stored in a single file and that file was corrupted somehow, you’d lose entire days of email. With multiple files, one file can be corrupted and you’d lose the ability to restore maybe an hour or two of email, which isn’t nearly as big a deal. At any rate, each transaction log file has a name that starts with the letter E and then a string of numbers, followed by the .log extension. You will also see a similarly named file with a .chk extension and a bunch of files named Eres<numbers>.jrs. The .jrs files are used by Exchange to make sure things don’t explode if the drive fills up for some reason. The .log files are the actual Transaction Logs that are saved, and the .chk file is used to determine what the most recent transaction log file name is as well as which transaction logs belong to which database. The name on these files is important because it represents the order in which those logged transactions occurred. Transactions in E00123.log occurred before those in E00124.log and so on. Each time a log file reaches its size limit, a new file is created with an incremented number and the .chk file is updated. Another thing to remember is that the name of the last transaction log that contains the most recently applied transactions is written as a property of the actual database file that Exchange uses.
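As a rough illustration of why the file names matter, the Python sketch below picks the .log files out of a folder listing and puts them in sequence order. The file names are invented samples, and the sketch assumes the characters after the leading E form a hexadecimal sequence number; treat that as an assumption of the example and check it against the names in your own log folder.

```python
import re

# Assumption for this sketch: "E", then a hexadecimal sequence number, then ".log".
LOG_NAME = re.compile(r"^E([0-9A-F]+)\.log$", re.IGNORECASE)


def log_sequence(name):
    """Return the sequence number encoded in a log file name, or None if it is not a log."""
    match = LOG_NAME.match(name)
    if match is None:
        return None
    return int(match.group(1), 16)


def ordered_logs(names):
    """Keep only the transaction log files and sort them oldest first."""
    logs = [name for name in names if log_sequence(name) is not None]
    return sorted(logs, key=log_sequence)


# Hypothetical folder listing: three logs, a checkpoint file, and a reserve file.
folder = ["E00124.log", "E00.chk", "E00123.Log", "E00res00001.jrs", "E0012A.log"]
print(ordered_logs(folder))
```

The .chk and .jrs files are ignored because they hold no transactions, and the sort is on the parsed number rather than on the raw text of the name.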
Now we get to the part where the transaction logs are important. When you mount any Exchange database, the Exchange server will do the following:
- Read the last transaction log property on the database (Assuming the database was properly shut down).
- Examine the .chk file in the Log Files directory to determine what the last log file that *should* be applied to the database is named.
- Examine the names of the Transaction Log files in the transaction log directory assigned to the database in Exchange.
- If the .chk file says that the last transaction log has a higher number than what is recorded by the database, the Exchange system will begin “replaying” the log files in the directory, applying every single transaction that occurred between what the Database you mount last saw and what the .chk file says should be the last log file. This is the step that completes the restore process.
When all of the available logs finish being replayed to the database, your database will have returned to the exact state it was in when that last log file was written. The end result is a restored database that is in the exact state the original database was in before failing. Note that this process can only occur if the database is mounted in a Recovery Storage Group (For Exchange 2003/2007), or as a Recovery Database (Exchange 2010/2013), or if the active database is flagged as allowing over-write.
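The mount-time steps above can be sketched in Python as follows. This is a simplified model under stated assumptions: log files are identified by plain integers, each log holds a list of (operation, key, value) tuples, and replay_on_mount is a hypothetical name rather than anything Exchange exposes.

```python
def replay_on_mount(db_state, db_last_log, checkpoint_last_log, logs):
    """
    db_state:            dict holding the restored database contents
    db_last_log:         number of the last log already applied to the database
    checkpoint_last_log: number of the last log the checkpoint says should be applied
    logs:                dict mapping log number -> list of (op, key, value) tuples
    Returns the new state and the number of the last log actually replayed.
    """
    state = dict(db_state)  # work on a copy so the restored data is left untouched
    current = db_last_log
    # Replay strictly in order, stopping at the checkpoint or at the first missing log.
    while current < checkpoint_last_log and (current + 1) in logs:
        current += 1
        for op, key, value in logs[current]:
            if op == "set":
                state[key] = value
            elif op == "delete":
                state.pop(key, None)
    return state, current


# Hypothetical example: the backup was taken after log 123, the checkpoint says 126.
restored = {"msg1": "Sunday mail"}
logs = {
    124: [("set", "msg2", "Monday mail")],
    125: [("set", "msg3", "Tuesday mail"), ("delete", "msg1", None)],
    126: [("set", "msg4", "Thursday mail")],
}
state, last_applied = replay_on_mount(restored, 123, 126, logs)
print(last_applied, state)
```

If a log in the middle of the range were missing, the loop would stop just before it, so the database would only be as current as the last log that could be replayed.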
So basically, the only real reason the transaction logs exist is to perform database restoration. This is why the Microsoft Best Practices state that the Transaction Logs should be on a completely different physical drive than the Database files they are associated with. If the drive that holds the database fails for some reason, you can always use the transaction log files to bring a restored database to a state that has the most recent data. And because all transactions are written to the logs *and* the database files as soon as they happen, losing your log file drive will not cause you to lose any data either. If your logs drive fails, though, you may need to run a little bit of maintenance on the database files with ESEUTIL to put them into a clean state before they will mount properly. The logs are designed to provide “Point In Time” database recovery.
Point In Time Recovery
Point in Time Recovery is a function that allows you to restore a database to the state it was in at an exact point in time. For instance, let’s say someone requests that you restore a mailbox that was deleted on Wednesday of two weeks ago at 2:14PM. For this situation, let’s assume you run full backups every Sunday and incremental backups every day. If you restore the mailbox from the backup taken before that Wednesday, you may be missing some mail. If you restore the database from Wednesday night’s backup, you won’t get the mailbox. So what do you do? Well, you do a Point in Time Recovery. To do this, you restore the Database from the last full backup that was run before the point in time you want to restore to, then you restore all the log files between then and Wednesday night’s incremental backup. Once you have all the logs and database in a good location, you would create an RSG or Recovery Database that points to that location, and then look in the folder you saved the logs to. Each of the logs will have a timestamp on it that should carry over from the backup. This timestamp will allow you to pinpoint the log file that was written right before the mailbox was deleted. Once you find that, you delete every log that came after it, then mount the Database in Exchange. Replay will aim for the log that the restored .chk file names as the last one, but it will stop at the last log file that is actually available below that point. So if the last log file available is the one written at 2:13PM on Wednesday, when the database finishes replaying the available logs, it will be in the exact state it was in when that last log file was written. And there you go, you have a database that has as much mail as possible in the deleted mailbox, which you can then restore normally.
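Picking the logs to keep for a Point in Time Recovery comes down to comparing timestamps against the moment of the deletion. Here is a small Python sketch of that selection step; the file names, the date, and the split_logs_at helper are all hypothetical, and in practice you would read the timestamps from the restored files themselves.

```python
from datetime import datetime


def split_logs_at(log_times, cutoff):
    """Split logs into those written before the cutoff (keep) and the rest (discard)."""
    ordered = sorted(log_times.items(), key=lambda item: item[1])
    keep = [name for name, written in ordered if written < cutoff]
    discard = [name for name, written in ordered if written >= cutoff]
    return keep, discard


# Hypothetical restored logs and their last-written timestamps.
log_times = {
    "E00130.log": datetime(2024, 5, 15, 13, 40),
    "E00131.log": datetime(2024, 5, 15, 14, 13),
    "E00132.log": datetime(2024, 5, 15, 14, 20),
    "E00133.log": datetime(2024, 5, 15, 15, 5),
}
mailbox_deleted_at = datetime(2024, 5, 15, 14, 14)

keep, discard = split_logs_at(log_times, mailbox_deleted_at)
print("replay:", keep)
print("remove before mounting:", discard)
```

The sketch only produces the two lists; removing the discarded files from the recovery folder is left as a manual step so nothing is deleted by accident.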
Log Growth
One of the big problems that impacts Exchange servers is out of control Log growth. Logs are written constantly and there are only two ways they can be deleted. The proper way to delete log files is to perform a Full, Exchange Aware backup. If the backup software you use is not designed to perform Exchange Database backups, your logs will never ever get cleared and you will run out of drive space, which will force all databases with log files on the full drive to dismount and the Exchange server to explode (not really. It’ll just stop working). When you run a full backup that is Exchange Aware, the backup software instructs the Exchange system to “truncate” the logs. In older database systems, truncating the logs meant that the changes in the logs were written to the database and the files removed. These days changes to the database are written directly to the database, so when the system Truncates the logs, it basically just deletes them, but it does so in a way that allows the Database to stay operational.
The other option, deleting the log files manually, doesn’t work if the database the logs belong to is mounted. So you should always try to avoid deleting log files manually unless it’s an extreme emergency. And by Extreme Emergency I mean you haven’t run a full backup in a long time and have a completely full log file drive with about 300GB of logs. If you run into that situation, you pretty much *have* to delete the log files manually, since running a full backup on that many log files can take several days to complete, since the truncation process goes through each log file to make sure its changes were applied to the database. If the Database is dismounted, it is acceptable to delete log files, but you should only do so with the understanding that you will not be able to perform a Point in Time restore from the last backup to the point in time where the logs were deleted. (Point in Time recovery requests are fairly rare, from my experience, but they do happen, especially in larger companies with a lot of legal requirements).
Circular Logging
Now, if you are okay with not having the ability to do a Point In Time restore, you can configure Exchange to use a feature called Circular Logging. Circular Logging causes the Exchange server to retain only the latest 6 or 7 log files. Log files past that are automatically deleted, so you never have to deal with out of control log growth, and you also never have to run a full Exchange aware backup to clear log files. You would use this option if your backup solution doesn’t include support for Exchange server, if you don’t have a lot of space for logs, or if you just don’t care about dealing with logs for Point in Time restores. Another situation where you would use Circular Logging is if you have a Database Availability Group with at least three copies of each database. If you configure one copy to be Lagged (A lagged database copy waits a certain amount of time before writing transactions to the database), you can run Exchange in a No Backup mode. I’ll go into more detail on this feature in a later post, but for now, just understand that if you have enough database copies and at least one Lagged copy, you already have enough functionality to do Point in Time restores going back at most 14 days, and you are pretty well protected from Database failures.
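The difference in disk usage between the two modes is easy to picture with a short Python simulation. This is only a sketch: the retention count of 7 comes from the "6 or 7" figure above, the file names are generated for illustration (with a hexadecimal sequence number, which is an assumption of the example), and simulate_logging is a made-up helper.

```python
from collections import deque


def simulate_logging(total_logs, circular, keep=7):
    """Return the log files left on disk after total_logs files have been written."""
    # A deque with maxlen drops the oldest entry once it is full;
    # maxlen=None means nothing is ever dropped (normal logging between backups).
    retained = deque(maxlen=keep if circular else None)
    for number in range(1, total_logs + 1):
        retained.append(f"E{number:05X}.log")
    return list(retained)


without_circular = simulate_logging(1000, circular=False)
with_circular = simulate_logging(1000, circular=True)

print(len(without_circular), "files kept with normal logging")
print(len(with_circular), "files kept with circular logging:", with_circular)
```

With circular logging only the newest handful of files survive, which is exactly why there is nothing left to replay for a Point in Time restore.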
Common Misconceptions
So now that I’ve explained how the logs work and what they do, let’s go over some common misconceptions about Transaction Logs:
- Transactions are only written to the logs and then the logs are written to the database – This misconception is due in part to how databases functioned in the early days. Nowadays, transactions are written to memory, disk, and logs at almost the exact same time. There is a little bit of lag time between them being written to log files and the database itself, but this lag time is so minuscule that it doesn’t really matter (fractions of a second).
- If I do a full backup every night, I can use circular logging – This is one of those sorta kinda maybe close to accurate things, but it’s mostly wrong because it ignores the primary purpose of log files, which is to bring a restored database up to the most recent possible state it was in when the original copy was destroyed. If you run full backups every night, you still need to make sure you’re keeping all the logs from that backup time to the next backup time, otherwise when you restore your backup you will be missing up to 24 hours’ worth of mail. If you’re okay with that limitation, then sure, use circular logging if you run daily full backups. Otherwise, keep circular logging off.
- Deleting the logs manually will corrupt the database – No, it won’t. As I mentioned, deleting the logs manually is sometimes necessary, and can be done at any time in more recent versions of Exchange. The danger in doing manual log purges is data loss. You never want to delete logs that haven’t been backed up (either a full backup or an incremental/differential backup). If you’ve cleared all your logs manually and the database dies, there is no way to recover any transactions from the logs that were deleted if the files themselves haven’t been backed up. A Full, Exchange aware backup will “truncate” the logs, which is geek speak for deleting all the log files created after backing them up. This is simply to free up space, because the transaction logs are no longer needed once they have been backed up.
Still a great article!
Thanks a million! Thanks for all your efforts!
Very informative article.
Just a query: let’s assume I delete a log file from the log folder and then refresh the files by renaming that same folder. Do the files come back automatically, including the deleted one?
Log files are not regenerated at any time, so the answer would be no. Once log files are deleted, they are gone for good unless restored from backup.
Thanks a lot for your response I appreciate that.
I still have a copy of the old logs. The backup ran the same day, but those logs were not included in it; only the new logs were backed up and truncated.
Can I leave those old logs as they are?
Once a backup is run, the logs are not needed anymore. Old logs that fail to get cleared after a backup can be removed safely.
Thank you for the information.
I have just recently taken over an Exchange 2010 server. The previous IT man didn’t do a VSS backup; he just used a script to back up the DB.
I have four DBs plus the PFdb, and the log files are on a separate partition.
I have installed Windows Server Backup 2012 and started to back up the server, but I am lost as to how large the DBs are, since the users’ mailboxes do not add up. For example, DB1 uses 130GB but its users’ mailboxes total 35GB. I have also noticed that circular logging is enabled, which doesn’t help my DR. I am going to turn it off, but my question is: with the logs stored on another drive, when I run the backup does it clear down the DB as well, or just the logs drive? I just don’t get why so much disk space is used when the users’ mailboxes are so small.
Exchange doesn’t shrink the database when running a backup. The database file size often includes a lot of “White-space.” White-space is areas in the DB where objects were deleted. It’s usually used first, by default, but if new data that gets written to the DB doesn’t “fit” inside the whitespace, it gets appended to the end of the database. Aside from whitespace, there are other things like the deleted items folders, the dumpster, disconnected mailboxes, and integrated system data to consider. There isn’t a whole lot you can do about the DB being larger than the mailbox totals, but you can usually clean up whitespace by doing an “Offline defrag” of the DB. You can either use the automated method for that (using eseutil) or the manual method, which is to create a new mailbox database and move mailboxes to it (Offline defrag does this as well, but the manual method allows you to keep mailboxes available during the process). One thing that may be causing the huge discrepancy is that the database may be configured not to permanently delete items until a backup is run. If the previous IT guy didn’t run a VSS full backup, it wouldn’t have done a backup that meets the requirements of that configuration, so all items that have ever been deleted would still be stored in the DB somewhere. Once you run a VSS Full backup on the server, I expect there will be a very large amount of whitespace to deal with, but the DB would essentially stop growing.
Thank you for your reply.
I would love to make another DB, but it is standard Exchange and all five DBs are in use 🙁 I am running a VSS full backup but not seeing any change in white space. I have cleared down old mailboxes, which has freed up “available new mailboxes” on each DB, but the white space shown in () isn’t much.
I could move the logs on to another drive then extend the DB drive with another 170GB space but that feels like I am just patching up and not fixing the problem.
Logs really should exist on a different drive, and extending the DB drive would probably help as well. The “Available new mailboxes” value doesn’t show all the whitespace in the database, and until a proper backup is completed there won’t be any whitespace. I’d recommend making sure a VSS full/Exchange aware backup is completed before doing anything else at this point. Once that’s done, you can move the mailboxes from one DB over to the other DBs, then delete the DB and create a replacement. Once that’s done, migrate the mailboxes back. It’s a lot more work, but should help you resolve everything.
Hi,
Thanks for the article, it was very informative. I have a question i hope you can answer for me.
We have a large Exchange 2010 setup, with a total of 25 servers, split between 2 datacenters. All 33 mailbox databases are in a single DAG with 3 copies, with 2 copies on 2 servers in one datacenter, and 1 copy on a server in the other datacenter. Transaction logs for a database are placed on the same disk as the database file (a decision taken before my time). Every 2 hours we do a full incremental backup using DPM 2012 R2 that truncates the logs.
We have never had to do a point-in-time restore of a database, and when a database copy has failed (disk error, corruption, etc.) we have simply reseeded the copy from one of the other 2 copies (changing the disk if necessary).
We are in the process of preparing to migrate to Exchange 2016, and will keep the overall design (2 datacenters, 3 copies of all databases split between the datacenters) and during the planning we discussed whether or not to simply use circular logging on all databases?
As long as all 3 copies of a single database are not lost, we have trouble imagining a scenario where keeping (and truncating) the log files will be preferable to circular logging, especially as we have had some trouble with users with mobile devices generating a ton of logs (sometimes more than 5,000 an hour). Our SLA with the end users says that we take daily backups, so the 2 hour backups are 1) just better service but not something known to the end users, and 2) in order to keep the number of log files down to a manageable number.
What we are asking is, given the above scenario, can we just do our normal 2 hour backup (or perhaps reduce the frequency) in order to have true disaster recovery (all 3 copies of a DB fail at the same time) and use circular logging? Would you still recommend a lagged database copy?
Thanks in advance 🙂
Microsoft’s recommendations on DAGs with three or more nodes are to have one copy lagged as much as possible to allow restoration in the event of database corruption that gets replicated to the other database copies. The lag will allow you to do point-in-time recovery if that happens. With that in place, MS also recommends switching to circular logging on the DBs, since there is no additional benefit to having the transaction logs with that setup.
A couple things I would recommend with this setup:
1. Make sure that the lagged copy has plenty of drive space available to it.
2. Configure deleted item retention to much longer than default (up to a year or more). Configuring long deleted item retention will increase database file growth, but makes recovery of emails *much* easier, since recovery can be done by simply running the Search-Mailbox cmdlet in PowerShell on Exchange.
Like you said in the opening, I’ve read many articles about transaction logs and none of them gave me clarity on how it actually works.
I really really appreciate it. Thank you for this.
This article is absolutely amazing! I have bookmarked it on our internal company Wiki for future/training references.
Thanks champ.
Very good article.
This is a very informative article. However, I have one question about a problem I am facing. I have a three-node Exchange 2010 DAG (SP3). Two nodes are at the primary site and one node is at my DR site. Previously our DR site was within the same city, so the huge number of logs copied to the DR site was not an issue. Recently we moved our DR site to a different city (connected through a WAN link provided by a service provider). Now our entire bandwidth is being consumed by Exchange replication. If I suspend replication to DR for 2 hours, the Copy Queue Length grows to around 2500 (2.5GB), which seems impossible to me, as that much email is not sent or received in 2 hours.
Now my question is: why is the number of transaction logs growing so fast? Do the transaction logs also record, and grow in size, in these cases:
1) If I move an email from one folder to another within my mailbox, either manually or through Outlook rules. For example, if I receive a 1MB email and afterward move it into another folder, would it be recorded as 2 actions with 1MB for each action (email delivered, then moved to a different folder)?
2) Is it also recorded in the transaction logs if I have enabled Outlook archiving? That means every email older than 10 days is archived automatically into my local PST. For example, if 10MB of email is moved from my mailbox to the local PST, would it also create 10MB of transaction logs, as those emails have been removed from my mailbox?
Transaction logs store every command that is run against the exchange database, so the amount the logs grow depends on how much mail the environment deals with, both internally and externally. Sending a message from one user to another will cause that message to be stored in the logs, including any attachments. I don’t know the exact underlying code used to actually do this, but there is a good possibility that the logs will have each internal message show up twice or more, depending on the number of recipients. Now, once the messages are stored in the database, there’s a good bit of deduplication that occurs to reduce DB size, but this doesn’t occur in the transaction logs themselves.
In addition to mail transactions, maintenance windows will generate a lot of logs due to the number of move, repair, delete, and other commands run against the database. All of this has to be logged directly to ensure the database can be restored properly. It’s not uncommon for logs to take up a lot of data over time, and 2.5GB isn’t an insane amount of logs per day, depending on the number of users there are. I’ve seen Exchange environments generate 30GB of logs on an average day.
As to your questions:
1. I’m not sure on this one, but my guess is that it would write the message to the logs 1 time when received, then the move command would re-write some of the message properties, which would be another transaction, but not near 1MB worth. Probably closer to 2-3KB.
2. The transaction logs will record everything you do, but deletes theoretically take less data in logs than receiving an email. In this example, the data writes to the PST file wouldn’t cause a transaction log change, since it’s a read (Transactions generally only include writes, changes, and deletions). However, if the data is subsequently deleted, it would generate a good bit of log info for each individual message deleted. For deletes, size doesn’t matter as much as number, so you’d see much more transactional data being written for 1000 1KB emails being deleted than for 10 1MB emails being deleted.
Sorry if I am late to the party, but what about a complete mailbox deletion. if I remove a 10GB mailbox, will that immediately generate 10GB of transaction logs?
I haven’t had a chance to verify, but theoretically a mailbox delete should only generate a very small transaction, since we’re basically just instructing the database to erase data it already knows about. That shouldn’t really result in a large amount of data that needs to be logged.
However, copying a new mailbox to a database will generate a significant amount of transactional data on the target DB, since the transaction will include all the data in the mailbox plus the transactional code used to add that data to the database. This is the main reason best practices dictate switching to Circular logging prior to performing large mailbox migrations.
Great article. There is one thing I’m wondering about:
How are the objects (message items and/or attachments) stored?
Are these also stored in the transaction log or is it just the information about the objects and the processes handling them that’s stored in the log files?
And if so, does it mean that every message an Exchange server receive, is available through the logs (and is it stored binary or in blob or otherwise)?
Transaction logs store information on every change made to the database. Specifically, it includes all information necessary to apply those changes again should the need arise. If an email is sent to the server, the transaction log will contain the full text of the message and any attachments. The logs are written in binary form, and cannot be directly read, but can be read with the proper tools.
Thank you for the explanation.
Great article. This gives a clear idea about transaction logs in Exchange. Thanks, mate.
Great article, gives clear idea about various components behind the scenes.
Thanks.
If only you told us where these logs were located, I might like this. Told me everything I needed to know except where to find them.
The location isn’t always the same, since it is defined at database creation. You can usually get that info by running Get-MailboxDatabase | fl in the Exchange shell and looking for the log file location there.
You could also compress a chunk of the log files, mount the database to get users back in and then run a full backup. This has got me out of a hole.
Great article. One question: if my server blew up on Friday and I restored email from Tuesday’s backup, how would having transaction logs help? If the server blows up, the transaction logs are gone too, so my backup wouldn’t contain the three days of log files I’m missing?
That’s correct. If the transaction logs and database files are lost, there is no way to recover mail between the failure and last backup. If you do incremental backups, you’ll usually have transaction logs for each day, and you can usually restore those and get them applied to the database.
Very good and informative article. Opens up meaning of transaction logs very well. Thank you!