Last Update: 2024 - 05 - 21 |
Performance differences counting records in a Microsoft Access Databaseby Philipp Stiefel, originally published May 6th, 2016 I just stumbled upon a blog post of Microsoft Access MVP Daniel Pineault about different approaches to count the records in a table/query. In the process of optimizing a database created by another developer, he came across the following code to count records.
Dim rs As DAO.Recordset
Dim lRecCount As Long
Set rs = db.OpenRecordset("SELECT * FROM TableName;")
rs.MoveLast
lRecCount = rs.RecordCount
rs.Close
Set rs = Nothing
This is awful code! It is extremely inefficient because it forces the database engine to retrieve all the records in a table/query even though only the count is needed. The database engine should be able to query the number of records in a table much more efficiently. – I think this is so bad, I will ignore this approach completely. Daniel than compared the above code to variations of using the DCount-Function and a SELECT-COUNT()-Recordset. Here are his results: Surprising detailsI was surprised by those results for two reasons.
With a clear difference there showing in Daniel’s result, I was curious if there would have been a similar difference between SELECT-COUNT(pk) and SELECT-COUNT(*). Unfortunately this was not included in the results. So I felt the need to do a performance test myself. My performance testI used Microsoft Access 2013 for my test and the Northwind database template included with Access 2013. To get a decent table size I created a Cartesian query including the Customers table 4 times. Then I appended the results of that query to the existing customer data, thus creating 707,310 records in the Customers table. Then I wrote a couple of simple functions for my performance test. Here they are:
Public Function DCountId() As Long
DCountId = DCount("ID", "Customers")
End Function
Public Function DCountStar() As Long
DCountStar = DCount("*", "Customers")
End Function
Public Function SelectCountId() As Long
Dim rs As DAO.Recordset
Set rs = CurrentDb.OpenRecordset("SELECT COUNT(ID) FROM Customers")
SelectCountId = rs.Fields(0).Value
rs.Close
Set rs = Nothing
End Function
Public Function SelectCountStar() As Long
Dim rs As DAO.Recordset
Set rs = CurrentDb.OpenRecordset("SELECT COUNT(*) FROM Customers")
SelectCountStar = rs.Fields(0).Value
rs.Close
Set rs = Nothing
End Function
Public Sub TimedExecute(ByVal functionName As String)
Const ITERATIONS As Long = 1000
Dim i As Long
Dim startTime As Date
Dim discardedDummy As Variant
functionName = IIf(Right(functionName, 1) = ")", functionName, functionName & "()")
startTime = Now
For i = 1 To ITERATIONS
discardedDummy = Eval(functionName)
Next i
Debug.Print functionName & ": average time per iteration: "; CDbl(DateDiff("s", startTime, Now())) / CDbl(ITERATIONS) * 1000 & " (ms)"
End Sub
With this code in place, I ran all the test functions by using the TimedExecute method like this. Call TimedExecute("SelectCountStar")
First I ran the test with the database on my local disk (SSD RAID). – The results were useless. Flat 8 milliseconds for each of the functions. – No difference at all. Zero. None. But with a real world database, you should always consider your backend database being accessed over the network. - That’s where you sift the wheat from the chaff. – So I moved the backend to a network share (accessed via 100Mbit Ethernet) and re-ran my test procs. Now we’re talking. There are quite noticeable performance differences between the approaches. Here are the results in detail.
While the tabular results are pretty comprehensible already, here is a visual chart representation as well. ConclusionThese results are absolutely in line with Daniel’s findings. There is only a tiny difference between the DCount(*)-approach and SELECT COUNT(pk). I guess Microsoft has made significant performance improvements to the DCount-Function since the days of Access 97. Hence we are not seeing the massive differences I would have expected. However, the first interesting comparison is between DCount(*) and SELECT COUNT(*). As you can see, DCount(*) is 52% slower than the Recordset approach. That is a very clear win for the Recordset! J But let’s keep this in perspective here. While there is a clear performance advantage to the Recordset, it still is only 11 milliseconds faster. If you are using a DCount here and there and not calling DCount hundreds or thousands of times in a loop, I suggest you continue to use whatever you like best. If you are not simply counting all records in a table, it is much more important to write an efficient WHERE-Condition for your SELECT-Query or the DCount-Function. Then make sure there is a useful index on the columns of the table, which are used in the WHERE-Condition. My results confirm the performance difference between SELECT COUNT(pk) and SELECT COUNT(*). I still am curious why this difference exists and I still have no explanation for it. We could find out by using the JetShowPlan Registry Setting to force the Jet/ACE-Engine to expose its execution plans of these queries. – But that will have to wait for another day.
I will never share your email with anyone. You can unsubscribe any time. © 1999 - 2024 by Philipp Stiefel - Privacy Policiy |