Last Update: 2017 - 07 - 09
Why DropBox, OneDrive and Offline Folders will corrupt your Access database
by Philipp Stiefel, originally published May 9th, 2017
Now and then I read questions in Microsoft Access Forums about how to share an Access database with other users on DropBox, OneDrive or a Windows offline network share. This is actually a pretty easy Question to answer.
The short and simple answer is: Do . not . do . that!
Well, if that answer is sufficient to you, then we’re done here. – You’re welcome to read some of the other articles on this site. ;-)
Oh, ok, …
…you’re still here. Then I won’t just leave you baffled with this simple answer. You are obviously interested in the reasons for this, so you should read on.
The problem boils down to corruption and data loss.
Basic synchronization logic
The synchronization logic of OneDrive- or DropBox-folders tries to sync the most recent version of any file found in its local offline copy location. The simplest indicator to find the most recent version of a file, is by its last modified date. This works well, if you are the only person working on any give file and if you always sync the file to the central (OneDrive-, DropBox, or File)-Server after you modified it in any location. That is quite simple and it will work with any type of file.
Advanced synchronization magic
Now, you might have noticed, that there are some applications that are able to handle even concurrent updates very well. OneNote is one noteworthy example of such an application. I’m always amazed how well OneNote handles this. I sometimes work with other people on one and the same OneNote-Notebook shared in OneDrive. When working online, changes made by someone else appear right in the page I’m just looking at almost in real time. That is so amazing. I can remember only one or two situations in the last 5 years where OneNote displayed an error message telling me it was unable to sync changes due to conflicts or whatever other problem.
However, you should be aware that this is not a secret magic built into file sharing protocol of OneDrive. The capability for this is built into the OneNote file format and into the different OneNote applications. So, this is an exception. It is not what you should expect from other applications as well.
Issues with Access files
Now that we know synchronization capabilities depend on the file format and on the application, let’s look at Access and its files.
The database file format
The Access file format was designed in a time long before anybody was thinking about offline file synchronization of any sort. Even if somebody had actually thought of it back then, he was wise to discard any ambitions to build offline support into the Access database file format. Any database greatly benefits from a dense, binary file format that requires as little disk IO as possible to read as much data as possible.
Also, database files differ from most other files in another aspect. A tiny change to a single record might result in many different changes to the actual database. In addition to updating the actual, modified record, there might be multiple indexes that need updating, and table storage allocations might need to be extended. – Changing a single letter in any record might trigger changes all over the whole database.
As if that was not complex enough already, think one step further. As most databases, Access uses transactions to write to the records in the database. Several changes to multiple records can be grouped into a single transaction. Meaning all those changes need to be applied as a whole or none of them must be applied at all. There is nothing in between. After the transaction completed, it is impossible to reconstruct which changes where done within which transaction. The database itself doesn’t need to know that any more, once it made sure the transaction completed successfully. – However, an external process that synchronizes those changes at a later time, would absolutely need to know the individual transactions to keep the data consistent if it would try to sync individual changes below the level of the whole file.
Replication – Is no more
Access used to have a replication feature to allow for distributed copies of one and the same database. That replication engine would keep track of all disconnected, offline changes to a specifically prepared copy of the database to apply them to a master-copy of the same database later. This feature basically worked, but it had a very high risk of synchronization conflicts and user error accidentally leading to lost data. The feature was finally removed when the database engine was changed from JET to ACE and the file format was changed from .MDB to .ACCDB with Access 2007.
I was looking for an in-depth article on Access Replication to link to here, but unfortunately all the authoritative information on replication at microsoft.com have been removed.
Automatic changes to the database file
Now, it should be clear already that you cannot expect the same level of synchronization from Access databases. That however, is only part of the problem. It gets worse.
With almost any other type of application the data files will be only written to, if you actually change something in the file and save your changes. So, any read-only access to the file will not affect its last modified date. Conflicts can only occur between different data writers, but there will never be any conflict between a data reader and a data writer.
That is different with Access. Even if you open an Access database and just look at the data without changing anything at all, the .accdb- or .mdb-file might be written to. If you execute any query, and that applies to a plain select query already, Access might actually write some temporary internal data, required to execute the query, to the database file.
If you’ve got any newly written VBA code in your Access file, which was saved but not yet compiled, it will get compiled once its execution is triggered by user action in the UI of the application. The compiled state of the code is automatically saved in the Access file and hence changing the file without you consciously modifying it.
The resulting dilemma
Now, combine the two facts elaborated above.
First, the Access file format is densely packed and does not allow any outside logic to figure out which changes to the file need to be grouped together and applied to another version of the same file to get both versions in sync.
Second, the file gets updated all the time with important information as data changes and with unimportant internal temporary data. It is impossible for any logic outside of the Access application to separate these different types of changes.
This creates a scenario in which it is absolutely impossible to reliably sync multiple, concurrent versions of such a database file to a consistent state. Any attempt to merge different concurrent changes is doomed to fail and will pretty likely corrupt the database.
How to work around the issues
It can work to put an Access database on a shared, offline location if you make absolutely sure, that there never will be any concurrent updates to the file. That includes to make sure any changes you made to an offline copy of a database are synced to the central server when you finished our work. All the same it requires you, to always make sure the synchronization process pulled the latest copy of a database to your local OneDrive-/DropBox-Folder before you open the file and start working on it. – In my opinion, this too error prone in a multi user scenario to even try.
I store some demo- and sample databases on a Windows offline network share. I’m the only person working on those files. So, this is indeed the sole scenario I would consider offline use of an Access databases at all. I always try to make sure there will be no competing updates to any one file, as described above. – Still, with those files I had the highest rate of corruption to the databases I ever experienced with Access in my whole career.
Just for comparison; usually, with a solid application and a stable network, I would expect corruption to happen less than once or twice a year to a database in daily concurrent use by a group of up to 20 users. With my private databases on an offline network share corruption in one form or the other happens about once every 30 hours working with them.
Sure, the cause of the corruption can be tracked down to user error in most of those cases. I simply forgot to make sure the file was completely synced before opening the database on another computer. – But, it is extremely hard to prevent this sort of user error and the resulting corruption!
The conclusion leads back to the initial simple answer to the question. Do not share any Access databases on OneDrive or DropBox with other users. Put database files there for your own, personal use only if you are aware of the risk and due diligence required. Even then, the bottom line remains: Do this only, if you absolutely have to.
© 1999 - 2017 by Philipp Stiefel