I get this question often from various levels of technical people. They typically have experience with Exchange servers, and they know that after a successful backup Exchange will truncate or cleanup it’s database log files. They mistakenly think that SQL will do the same. Let’s explore why each of these takes place and what is going on here.
Historically Exchange and SQL share some commonality in terms of their database heritage. Both are based on a JET engine database that Microsoft developed way back in 1992. Microsoft further developed that technology into two forks, which became known as JET Red and JET Blue. The JET Red technology was used in products like Microsoft Access and later SQL server, whilst the JET Blue technology was used in Exchange server. It’s worthwhile noting that this same technology is used within other databases on your Windows server including Active Directory, DHCP, and Certificate Services. If you want to know more about the Microsoft JET database engine, check out this article – https://en.wikipedia.org/wiki/Microsoft_Jet_Database_Engine
Key points to note between the two JET engines are that they were designed for different purposes in mind. They were designed to hold and structure data entirely differently and are optimised for those specific tasks. Exchange for instance is designed to hold a range of emails that vary in size and content. Therefore it’s database is always expanding automatically as new data is added to it unless there is free space within it. SQL databases however only expand if requested to do so, or via settings on the SQL database itself.
Exchange Databases are made up of a few specific files. The key ones (for this explanation) being the Exchange Database file itself – i.e. the EDB file and the other the Exchange Database log files, which are a heap of small files.
What happens when Exchange writes data is that it is first written to the log file and then to the database. When the next piece of data comes in, it adds it to that log file until the log file is full, and then starts a new log file. The Exchange log files are many and they are always the same size for a given version of exchange typically 1MB since Exchange 2010 onwards. Below you can see an idea of Exchange Database Log files
Now Exchange Databases can be configured in one of two modes depending on the fault tolerance requirements you have. These are Circular Logging or Full Logging modes.
In Circular logging mode, Exchange writes to sequentially numbered log files with any new data, and then constantly overwrites the oldest log file. It only ever keeps 10 log files in this mode. In this way, you will never fill up the disk with log files as you will only ever have 10 of them at a given point in time.
In full logging mode, Exchange writes to sequentially numbered log files with any new data but NEVER overwrites the old files. If keeps increasing the log file number forever. If you do nothing, you will eventually fill up the disk on which the log files reside and at that point Exchange will come to a stop and dismount the database associated with those log files.
The point of the two logging modes is for recoverability. In Circular logging mode, if you ever have corruption, then your only option is to restore the database back to the last backup and that is the total amount of data you will have – ie whatever is in that last backup is everything.
In Full Logging mode, you can restore the database from the last backup and then replay all the log files since that backup into the database which will ensure that no information is lost, and all will be recovered.
SQL Databases are slightly different story. They are made of normally of two key files, a database file and a log file. The database file is a .mdf file and the log file is a single .ldf file. Now – I say normally as there are ways to split databases across multiple files etc, but I’m not going to go into that for this explanation.
The SQL database file and log file are created when the database is created itself. The sizes of these files are set at initial values on creation. There are database specific options that will allow you to autoincrement the size of the database and the log file as well, but if you do not set these then the database and it’s log file will not grow beyond what you have specified.
Some SQL versions have limits on the size of their database files and they are dependant on the version and license applied to that SQL installation.
SQL Server Databases have three logging modes, again designed for whatever level of recoverability you need. These modes are called Simple, Bulk-logged and Full logging.
Simple Logging mode is just that – very simple indeed. Data to be written to the SQL database is first written to the SQL Log file, and then written to the SQL database. Once SQL confirms it has written the data to the database file, it is then deleted from the SQL log file. This means that the SQL log file in these instances contains relatively little data at a given point in time.
Full Recovery Mode is designed for higher levels of fault tolerance. When data to be written to the SQL database is first written to the SQL Log file, and then written to the SQL database. Once SQL confirms it has written the data to the database file, it is then marked as inactive within the SQL log file. It is never deleted from the SQL log file unless specific SQL commands are given to do so.
Bulk-logged recovery mode is not highly used and not relevant for this post, so I’m going to pretend it does not exist for now.
Now – we can see if a SQL database is in Simple mode, then it will have relatively small SQL log files and they will be kept under control by SQL itself. If we have SQL Full logging enabled on a database then we need to do something to keep the transaction log file under control.
Now we have a brief understanding of the differences between SQL and Exchange databases and their logging, let’s talk about backup.
In a Windows server environment there has been for many years now the ability for it to backup a file that is constantly open and being accessed. This method is due to a framework of providers and drivers called the VSS Framework. In a nutshell, VSS works like this.
- The VSS aware backup program sends a request to the VSS framework to do a backup of specific volumes
- The VSS framework then talks via the VSS writers to the individual applications and tells them to finish existing transactions and then pause new transactions for a moment.
- VSS then takes a snapshot of the drives in question (effectively they are in a frozen state for a very short period of time)
- VSS then responds back to the backup program to say “go for it” and it then backs up that snapshot of the drive in question as there are no files open at that moment in time.
- Once the backup is complete, the backup program tells VSS it is done and VSS removes that snapshot from the system.
This is basically how the magic works under the covers to get a clean backup of files. The time the system is frozen for a snapshot to be taken can be from less than a second to multiple seconds, or in extreme cases for minutes if a system is extremely badly configured. During this moment in time, users tend NOT to notice anything going on, so from their perspective life is awesome.
If you want to know more about how VSS works – check out this article from BackupAssist here https://www.backupassist.com/support/en/references/volume-shadow-copy.htm
Now we have an overview of how the backup process works, how does that apply to my Exchange and SQL databases.
Exchange Database Backups
With Exchange, when you have full logging enabled, and you are using a VSS aware backup program, at the successful completion of the backup, it will automatically delete all the Exchange log files that were there at the time of the VSS snapshot. This will mean that it will effectively keep the Exchange log files under control, provided you have regular, successful VSS aware backups.
With an Exchange server where you have Circular logging enabled, and you do a VSS aware backup, then effectively nothing happens to the log files as Exchange is already handling them being overwritten as part of the circular logging process.
SQL Database Backups
Now with SQL server there are some major differences. You will have a VSS writer for SQL server, but it is designed to handle things differently. The VSS writer for SQL server does absolutely NOTHING to the SQL log file when you have successful VSS backups. It won’t truncate them, or purge them or anything because effectively the data within the log file is within the SINGLE large log file for a given database. You will however have a copy of the SQL MDF and LDF at the point in time of the backups which could be used to restore the system if a failure occurred, however the granularity of that would be limited to the backup time itself.
SQL has within itself it’s own method of producing backups for use in a recovery situation. There are specific SQL commands that are used to do a backup of both the MDF and the LDF to a flat file that would typically reside on another volume of a server. Due to the high transactional nature of a SQL server, you might well want to perform a daily SQL backup of the database itself and then hourly backups of the SQL log files. If you did this then you would be able to recover using the daily backup as well as any number of the hourly log file backups.
In order to do a SQL backup AND ensure the SQL log file is truncated, we need to run some specific commands from within SQL itself. You need to create a scheduled SQL backup job that as part of it’s process will handle the backup of the SQL database, and then truncate the log file. This will effectively delete the log file entries in the log file that have been backed up and will therefore leave space for new logs to be added. Note – that this is only needed if the SQL database is in Full recovery mode. If it’s in Simple mode than it will handle itself automatically.
Hopefully this has helped you understand the differences and common misconceptions that exist when backing up Exchange and SQL databases. With proper understanding you will be able to handle these situations without worrying about running out of disk space etc.
If you’ve found this interesting, please feel free to share it, comment on it etc. Love hearing feedback.