Last Update: 2018 - 02 - 27
Performance differences counting records in a Microsoft Access Database
by Philipp Stiefel, originally published May 6th, 2016
I just stumbled upon a blog post of Microsoft Access MVP Daniel Pineault about different approaches to count the records in a table/query. In the process of optimizing a database created by another developer, he came across the following code to count records.
Dim rs As DAO.Recordset Dim lRecCount As Long Set rs = db.OpenRecordset("SELECT * FROM TableName;") rs.MoveLast lRecCount = rs.RecordCount rs.Close Set rs = Nothing
This is awful code! It is extremely inefficient because it forces the database engine to retrieve all the records in a table/query even though only the count is needed. The database engine should be able to query the number of records in a table much more efficiently. – I think this is so bad, I will ignore this approach completely.
Daniel than compared the above code to variations of using the DCount-Function and a SELECT-COUNT()-Recordset. Here are his results:
I was surprised by those results for two reasons.
With a clear difference there showing in Daniel’s result, I was curious if there would have been a similar difference between SELECT-COUNT(pk) and SELECT-COUNT(*). Unfortunately this was not included in the results.
So I felt the need to do a performance test myself.
My performance test
I used Microsoft Access 2013 for my test and the Northwind database template included with Access 2013. To get a decent table size I created a Cartesian query including the Customers table 4 times. Then I appended the results of that query to the existing customer data, thus creating 707,310 records in the Customers table.
Then I wrote a couple of simple functions for my performance test. Here they are:
Public Function DCountId() As Long DCountId = DCount("ID", "Customers") End Function Public Function DCountStar() As Long DCountStar = DCount("*", "Customers") End Function Public Function SelectCountId() As Long Dim rs As DAO.Recordset Set rs = CurrentDb.OpenRecordset("SELECT COUNT(ID) FROM Customers") SelectCountId = rs.Fields(0).Value rs.Close Set rs = Nothing End Function Public Function SelectCountStar() As Long Dim rs As DAO.Recordset Set rs = CurrentDb.OpenRecordset("SELECT COUNT(*) FROM Customers") SelectCountStar = rs.Fields(0).Value rs.Close Set rs = Nothing End Function Public Sub TimedExecute(ByVal functionName As String) Const ITERATIONS As Long = 1000 Dim i As Long Dim startTime As Date Dim discardedDummy As Variant functionName = IIf(Right(functionName, 1) = ")", functionName, functionName & "()") startTime = Now For i = 1 To ITERATIONS discardedDummy = Eval(functionName) Next i Debug.Print functionName & ": average time per iteration: "; CDbl(DateDiff("s", startTime, Now())) / CDbl(ITERATIONS) * 1000 & " (ms)" End Sub
With this code in place, I ran all the test functions by using the TimedExecute method like this.
First I ran the test with the database on my local disk (SSD RAID). – The results were useless. Flat 8 milliseconds for each of the functions. – No difference at all. Zero. None.
But with a real world database, you should always consider your backend database being accessed over the network. - That’s where you sift the wheat from the chaff. – So I moved the backend to a network share (accessed via 100Mbit Ethernet) and re-ran my test procs.
Now we’re talking. There are quite noticeable performance differences between the approaches. Here are the results in detail.
While the tabular results are pretty comprehensible already, here is a visual chart representation as well.
These results are absolutely in line with Daniel’s findings. There is only a tiny difference between the DCount(*)-approach and SELECT COUNT(pk). I guess Microsoft has made significant performance improvements to the DCount-Function since the days of Access 97. Hence we are not seeing the massive differences I would have expected.
However, the first interesting comparison is between DCount(*) and SELECT COUNT(*). As you can see, DCount(*) is 52% slower than the Recordset approach. That is a very clear win for the Recordset! J
But let’s keep this in perspective here. While there is a clear performance advantage to the Recordset, it still is only 11 milliseconds faster. If you are using a DCount here and there and not calling DCount hundreds or thousands of times in a loop, I suggest you continue to use whatever you like best.
If you are not simply counting all records in a table, it is much more important to write an efficient WHERE-Condition for your SELECT-Query or the DCount-Function. Then make sure there is a useful index on the columns of the table, which are used in the WHERE-Condition.
My results confirm the performance difference between SELECT COUNT(pk) and SELECT COUNT(*). I still am curious why this difference exists and I still have no explanation for it.
We could find out by using the JetShowPlan Registry Setting to force the Jet/ACE-Engine to expose its execution plans of these queries. – But that will have to wait for another day.
© 1999 - 2017 by Philipp Stiefel