Last Update: 2018 - 02 - 27
Use logical numerals sorting in Access with VBA
by Philipp Stiefel, originally published September 19th, 2016
You may ask what logical numerals sorting is. – Let me explain.
You, as a developer, (should) know the difference between a numeric data type and a text data type. If you store numeric data as text, it will be sorted differently than a real numeric data type. Here’s quick reminder of the difference in sorting.
We as developers understand the ramifications of data types and would consider the above quite logical. – However, if we present our users numbers sorted as in the Text data type column, they will think we lost our mind.
For the simple example above, the solution is as simple. We use a numeric data type and everyone will be happy.
But what about this column describing packaging units? (Both columns contain exactly the same data, but are sorted differently)
The two columns contain exactly the same data, but are sorted differently. What sort option would you prefer?
I would be much surprised if anyone would actually prefer to see this table sorted as in option A. But that is what will happen if you store the data in a Text column in Access and just sort by this column. Ugly! You could remedy this by sorting by a hidden column that orders the item as in option B. – But how is that hidden column maintained?
Wouldn’t it be great if you could sort the text as it is, without additional helper-columns, but just by using a special sort algorithm?
Recently someone asked this question about sorting on the Access-o-Mania Forum (German language). I found this question most intriguing, so I did some research on it.
The Windows API has it all
With Windows XP Microsoft introduced natural numerals sorting for files in Windows Explorer. So I thought there might be a Windows API function to achieve that. First search result was the StrCmpLogicalW function, which implements the numerical algorithm and considers digits by their numerical value. Unfortunately, this function can only be used to compare one string with another.
Much more suitable is the LCMapStringEx function, which can create a SortKey that can be used to sort multiple values, like those in a database table, based on that algorithm.
LCMapStringEx is not easiest of API declaration, but I managed to get it to work. Here is my implementation of GetSortKey. A VBA function that wraps this API to retrieve the SortKey with the SORT_DIGITSASNUMBERS flag. The resulting SortKey is a Byte array that is created using the natural numerals sorting algorithm.
Note: The SORT_DIGITSASNUMBERS flag is only available in Windows 7 and newer!
Private Const SORT_DIGITSASNUMBERS As Long = &H8 ' use digits as numbers sort method Private Const LCMAP_SORTKEY As Long = &H400 ' WC sort key (normalize) Private Const LOCALE_NAME_USER_DEFAULT As String = vbNullString Private Const LOCALE_NAME_INVARIANT As String = "" Private Const LOCALE_NAME_SYSTEM_DEFAULT As String = "!x-sys-default-locale" Private Declare PtrSafe Function LCMapStringEx Lib "Kernel32.dll" _ (ByVal lpLocaleName As LongPtr, _ ByVal dwMapFlags As Long, _ ByVal lpSrcStr As LongPtr, _ ByVal cchSrc As Long, _ ByVal lpDestStr As LongPtr, _ ByVal cchDest As Long, _ ByVal lpVersionInformation As Long, _ ByVal lpReserved As Long, _ ByVal sortHandle As Long) As Long Public Function GetSortKey(ByVal inputString As String) As Byte() Dim apiRetVal As Long Dim retVal() As Byte Dim flags As Long flags = LCMAP_SORTKEY Or SORT_DIGITSASNUMBERS If Len(inputString) > 0 Then ' call LCMapStringEx with empty buffer to determine the required size of the buffer apiRetVal = LCMapStringEx(StrPtr(LOCALE_NAME_SYSTEM_DEFAULT), _ flags, _ StrPtr(inputString), _ Len(inputString), _ 0, 0, 0, 0, 0) If apiRetVal > 0 Then ReDim retVal(apiRetVal - 1) ' Resize buffer for SortKey apiRetVal = LCMapStringEx(StrPtr(LOCALE_NAME_SYSTEM_DEFAULT), _ flags, _ StrPtr(inputString), _ Len(inputString), _ VarPtr(retVal(0)), _ UBound(retVal) + 1, _ 0, 0, 0) ' The api writes directly to retVal buffer, we only need to check for an error If apiRetVal < 0 Then throwError Err.LastDllError, "GetSortKey" End If Else throwError Err.LastDllError, "GetSortKey" End If Else ReDim retVal(0) ' if input="" return a single zero byte End If GetSortKey = retVal End Function
I will not dissect the details of the API nor of my wrapper function to call it. I suggest you look up the details in the documentation of LCMapStringEx.
Just one thing that might look odd from a VBA perspective. If you look at the constants LOCALE_NAME_USER_DEFAULT and LOCALE_NAME_INVARIANT you might think the values vbNullString and “” are effectively the same. – They would be in a pure VBA context, but not with the Windows API. vbNullString is a null pointer while “” has an actual pointer value that is referring to an empty string.
So far so good. – The function works, but for now you have to take my word for it, as we face another problem. The SortKey created by LCMapStringEx is a Byte array. What are we going to do with it?
1. Approach - Treat binary SortKey as string
In Access only functions returning simple data types as strings or numeric types are allowed within a query. Functions with complex return values, e.g. Types/Structs, Objects and Arrays are not allowed.
So the most common approach for problems like this, would be to write a VBA function that returns a string from the SortKey bytes. Use that function in a query, supplying the base column as input and then sort the query by the return value of the function.
Public Function GetSortKeyRawString(ByVal inputString As String) As String GetSortKeyRawString = StrConv(GetSortKey(inputString), vbUnicode) End Function
This approach is not ideal in view of performance, but it would work. - Usually. – Not here.
It looks straight forward at first. It is possible to convert a Byte array straight to a Unicode string. The string can be returned by your VBA function and you can use it to sort the results of your query. Simple. – Except that is does not work.
The byte array returned by LCMapStringEx is a true binary byte array. If it is transformed to string by your function, Access will treat it accordingly. That means when sorting this string, Access will apply the database sort order for text.
A short example of the problem. Treated as text, the letters a A b B c C are sorted correctly here. (Details depending on the used sort order.) But treated as binary data, the correct sort order would be: A B C a b c. - Totally different. To use the SortKey created by LCMapStringEx, we absolutely need to treat it as binary data.
So this does not work for sorting. Hence it does not solve our main problem. This approach can be used to update the SortKey data with an update query. The data in the Unicode string is actually correct, it is just not sorted the way we require it.
2. Approach – Store binary SortKey in Access
OK, we’ve got binary data and we need to sort it as binary data. Once more it turns out, something that should be trivial is not in Access. The well-known data type for binary data in Access is OLE-Object. But that is meant for large binary files and hence is not sortable by design.
Luckily someone reminded me of the Binary data type in Access. It is perfectly suited for the task at hand. However, you need some background knowledge to use it in Access. I compiled the facts in this article on the Binary data type.
So we can solve our problem by creating a new Binary column in the table we want to use numerals sorting on. Then we write a small VBA procedure to update all the records in the table with their SortKeys. Finally, we sort the table by the SortKey-column and, voila, we’ve got numerals sorting implemented.
Public Sub UpdateTableWithSortKey(ByVal tableName As String, ByVal baseColumnName As String, ByVal sortKeyColumnName As String) Dim rs As DAO.Recordset Dim db As DAO.Database Set db = CurrentDb Set rs = db.OpenRecordset("SELECT " & baseColumnName & ", " & sortKeyColumnName & " FROM " & tableName) Do Until rs.EOF rs.Edit rs.Fields(sortKeyColumnName).value = GetSortKey(Nz(rs.Fields(baseColumnName).value, "")) rs.Update rs.MoveNext Loop rs.Close Set rs = Nothing Set db = Nothing End Sub
Here is a screenshot of the table sorted by our additional SortKey column.
Note: If you store the SortKey in the database, you need to consider carefully which LOCALE should be used with LCMapStringEx. I would not recommend to use the user default locale, as this could easily be different for each user.
This approach works; but there is an obvious flaw: Update Anomalies
We now created a new field for our SortKey. The content of this field is totally dependent on the primary text field containing the data. If the text data is updated, the Binary field containing the SortKey needs to be updated as well.
So you either need to add code to update the SortKey in every form where the text data can be updated or you need to update the whole table before it is displayed in a sorted manner.
The first approach fails if the data is updated in an update query, the second approach used a lot of computation of which most of it is unnecessary. – Both are not ideal solutions.
Maybe a DataMacro could solve this issue to keep the SortKey in sync with the text data in an Access Backend database. – At this moment I do not know enough of DataMacros to elaborate further on this. I might be coming back to catch up on this later.
In addition to the update anomalies, this approach has another disadvantage. You need to extend the data structure of every table where numerals sorting will be used. On the plus side, once the SortKey is generated, sorting the data is very fast and can be accelerated even further by indexing the SortKey column in the table.
3. Approach – Hex-String function
We know that Access can use only functions returning strings in queries. Converting the raw binary data to string does not work because of the sort issues. What if we convert the binary data to a string representation that will be sorted correctly? – Sounds promising.
The most compact form to represent binary data in a string is hexadecimal notation. So I wrote a function to convert the binary data to hexadecimal.
Private Function GetHexString(buffer() As Byte) As String Dim retVal As String Dim i As Long For i = 0 To UBound(buffer) retVal = retVal & Format(Hex(buffer(i)), "00") Next i GetHexString = retVal End Function Public Function GetSortKeyHexString(ByVal inputString As String) As String GetSortKeyHexString = GetHexString(GetSortKey(inputString)) End Function
To make sorting work properly, we need to add a leading zero, if the byte value is only a single hex-digit. This creates a uniform output length per byte.
Now you can use this function in any query to create a virtual sort column. Sorting by this will produce the correct order of records, as the text based sort order corresponds to the binary sort order now. – This approach works pretty well!
Here is the SQL of such a query:
SELECT pu.PUId, pu.PUDesc, GetSortKeyHexString(pu.PUDesc) AS SortKey FROM tblPackagingUnits pu ORDER BY GetSortKeyHexString(pu.PUDesc) ;
And here is the result:
But be aware that this function is not ideal in respect of performance. The string processing is fairly slow and needs to process all records in any given result to perform the sorting. So this function should only be used on a fairly small number of records.
Conclusion and Thanks
Logical numerals sorting is a powerful feature to enhance the usability of your application. You should definitely be aware of the possibility to integrate it into you Access project.
None of the solutions presented in this article is the silver bullet to all situations where you might want to use numerical sorting. You should carefully analyze your requirements before you chose one or more the approaches outlined here.
Finally, I want to thank daolix and Lachtaube for their invaluable input into the discussion on Access-O-Mania. All the information in this article is rather the fruit of our joint effort than mine alone.
Dear reader, thank you for your attention.
Download a module containing all the VBA code from this article: modNumeralsSorting
© 1999 - 2017 by Philipp Stiefel