Language: Deutsch English















Last Update: 2024 - 05 - 21








Microsoft Access - ODBC Linked Tables – Mechanisms and Performance

by Philipp Stiefel, originally published 2021-12-27

last revision on 2021-01-27


Article header image - Access ODBC Linked Tables

Based on a photo by Pexels

This text is the revised, expanded, and translated rewritten version of a posting in the newsgroups de.comp.datenbanken.ms-access and microsoft.public.de.access.clientserver originally posted December 23, 2001. – Despite it being 19 years old as of today, the content of this text is still relevant today (2021).

The original reason for this article was the comparison between an ADP (Access Data Project, now deprecated) and the traditional method of linking tables of a server DBMS via ODBC in an Microsoft Access MDB/ACCDB database. I don’t deal with ADPs in this article, but the processing and performance of queries in an ADP is essentially identical in this context to the pass-through queries described here.

For this text I assume that all tables are in a Microsoft SQL server database and linked to an Access frontend via ODBC. The basic mechanisms are not specific to Microsoft SQL Server but are the same for any ODBC linked DBMS database, like Oracle, MySQL, or PostgreSQL.

RecordsetType: Dynaset vs. Snapshot

Before we look at some query examples, we need to take a step back and understand how the different RecordsetTypes for queries/recordsets in Access work with ODBC linked tables.

By default, Microsoft Access opens the recordset from a query as a writeable Dynaset recordset (a keyset cursor). This is a multistep operation. The first step is to retrieve the primary key values of the records of the result set. The next step(s) is/are to retrieve the values of all the other columns requested by the query. This is done using a select query as prepared statement (practically a kind of stored procedure) on SQL Server to query the data by the primary key. To optimize the communication between Access and SQL Server, this is usually done in batches of 10 records. – I described in more detail how this is supposed to work and how it sometimes doesn’t, in my text on a bug with ODBC Multi-Row-Fetch.

The other available option the is the read-only Snapshot recordset. A query to create a Snapshot will send one (or more; see below) SQL statement to directly retrieve all the values of the columns requested by the query.

RecordsetType property in Access Query Design View

Queries on a Single Table

A query "SELECT column1, column2 FROM table WHERE column1 = criterion" is either executed unchanged (Snapshot) or translated into "SELECT PrimaryKey FROM table WHERE column1 = criterion" and then, as described above, the other column values are retrieved with subsequent queries (Dynaset). In both cases, the main processing of the query is done on SQL Server.

Only for a query applying a function to one of the columns in the where condition, like "SELECT * FROM table WHERE AccessOrVbaFunction (column1) = criterion", all the records of the table are returned to the Access frontend and the query is essentially processed by Access locally. Here also the further processing will follow the rules for Dynaset or Snapshot.

The only exception to this rule are functions that are included in the in the list of supported Scalar functions in the ODBC specification. These functions will be passed on to backend server using the {fn FunctionName()} Syntax and then either translated to the corresponding function in the backend DBMS or, if supported evaluated directly. The exact behavior depends on the backend DBMS, the ODBC driver and their versions.

Even if the function call can be passed on to SQL Server, the query will still suffer a performance hit due to the function being invoked on a column, as discussed in Query Performance Tuning – Basic Universal Rules.

Queries with JOINs on Two Tables

In the case of JOINS over 2 tables, the procedure is analogous to the queries on a single table, but the result data columns for Dynasets are retrieved individually from each table and then merged by Access.

The following example illustrates this:

SELECT t1.column1, t1.column2, t2.column1, t2.column2 FROM table1 AS t1 INNER JOIN table2 AS t2 ON t1.column1 = t2.column1 WHERE t1.column1 <50;

is translated to

SELECT t1.PrimaryKey, t2.PrimaryKey FROM table1 AS t1 INNER JOIN table2 AS t2 ON t1.column1 = t2.column1 WHERE t1.column1 <50;

The column data you were querying for will then be retrieved from both tables in separate operations.

SELECT t1.column1, t1.column2 FROM table1 WHERE PrimaryKey = PkValueCriterion;

and

SELECT t2.column1, t2.column2 FROM table2 WHERE PrimaryKey = PkValueCriterion;

They data is then merge by Access in a result set for display on the screen. This does not happen directly by executing the above-mentioned SQL statements, but these statements are called again as prepared statements.

A Snapshot will, as usual, be retrieved in a single operation.

So far so good. With simple queries that can be translated 1:1, the performance of linked tables is very good and can hardly be distinguished from direct queries to the server, even despite the additional communication overhead of the Dynaset fetches.

What About Updates?

The mechanisms for updating records are out-of-scope for this text. Nonetheless, a quick info.

When a record in a query based on multiple ODBC tables is updated in Access, it will send multiple update statements (one for each updated table) to SQL Server. The update statements will be wrapped into one transaction together to ensure data consistency.

JOINs on More Than Two Tables

Avoid excessive optimism in favor of the linked tables. While they are only small for simple queries, one should be aware that the performance disadvantage of queries on linked tables increases the more complex the queries become. The drop in performance of such queries is not linear, but exponential.

Even with a query on just three tables in Access-Join-Syntax (the ugly nested joins, Access doesn't understand anything else) combined with a where condition on only one table, the catastrophe can be imminent already. – Particularly a mix of Left and Right Joins, which the visual Access query designer is very prone to generate, will quickly trigger this problem.

When processing the query, Access first sends the inner nested part of the query to the server. - Without including any criteria. Then a second SQL statement, representing the outer part of the query, is sent to SQL Server. – Again, without any criteria. – Only then, after the results are transferred to Access, the criteria are applied locally.

Access Query SQL analyzed in SQL Server Profiler

For clarification, let’s look at a SQL statements of the query illustrating the above. The following query is executed in Access (the statement was created by the Access Query Editor and only reformatted for better readability):

SELECT table1.column1, table2.column2, table3.column3 FROM (table2 RIGHT JOIN table1 ON table2.column1 = table1.column1) LEFT JOIN table 3 ON table2.column2 = table3.column2 WHERE (((table1.columnXY) = Criterion));

When Access sends this query to SQL Server, it breaks it down into two partial base queries. This is the same for Dynasets and Snapshots, only the retrieved columns vary depending on Dynaset/Snapshot.

SELECT table2.PrimaryKey, table3.PrimaryKey FROM table2 LEFT OUTER JOIN table3 ON (table2.column2 = table3.column2); SELECT table1.PrimaryKey FROM table1;

The execution of these partial queries can be devastating with larger databases, because a huge number of results is retrieved, of which only a tiny fraction may be actually required.

With the sample data I used, the result set of the first partial query would be around 50 million records! The whole query times out. When disabling timeouts, it would probably take more than 45 minutes to get all results. – But I don't want to wait that long.

To put things into perspective, the behavior of queries on 3 or more tables described here is a worst-case scenario that does not have to occur in principle to the extent described. But the more complex an application with Linked Tables in Access is, the greater the likelihood that serious performance problems will arise that can be attributed to this behavior.

The example code above was originally written for the original version of this article in 2001 and verified with the software versions of Access, SQL Server, and ODBC driver current back then. Today (2021) I cannot reproduce the horrible performance problems of the 3-table-join with currently supported software versions.

However, this does not all mean that the issues discussed here are no longer relevant. Just this week, I recorded a video on how to use SQL Server Profiler to analyze these types of issues. In that video, I show a similar issue with equally bad effects on performance.

Solutions to Performance Issues due to SQL Translation/Splitting

There are two common solutions to the problem discussed above.

Pass-Through-Queries

A Pass-Through-Query is a special type of query in Access. The distinctive trait of Pass-Through-Queries is that they are not processed by the JET/ACE database engine but ‘passed through’ directly to the backend database server via ODBC. The main drawback of Pass-Through-Queries is that they are read-only by definition. If you want to update data from such a query, you cannot use built-in data binding in Access but need to write the complete update logic yourself.

To compare the performance of a Pass-Through-Query, I manually wrote one with a SQL statement, returning the same results as the 3-table-join above as follows:

SELECT table1.column1, table2.column2, table3.column3 FROM table1 AS t1 LEFT OUTER JOIN table2 AS t2 ON t2.column1 = t1.column1 LEFT OUTER JOIN table3 AS t3 ON t2.column2 = t3.column2 WHERE t1.columnXY = Criterion;

Now, this pass-through-query returns the correct result of 46 data records out of 50 million after approx. 15 seconds (compared to ~45 minutes)!

Linked Views

A view is the definition of a select query stored in the backend database on SQL Server, similar to a stored select query in Access. You can create such a view and then link it to the Access frontend as if it was a table. To avoid creating a huge number of views for every possible query, it is advisable to create views only with the definition of the table relations (the joins) but not to include any criteria und less it is required to represent the relation.

For the JET/ACE engine there is no difference between an ODBC linked tabled or a view. For the query above it means Access does not know that the linked view is actually a query on three tables. So, when passing the query SQL to SQL Server it is mostly left unchanged as we have seen in the single table select query. Any criteria used in an Access on the view will also be passed to SQL Server and thus evaluated there preventing the performance issues described above.

Conclusion

A client-server application with linked ODBC tables in Microsoft Access can be implemented. But under no circumstances should one indulge in the illusion that one can simply link the tables of the server database to Access and then work in Access exactly as one is used to from pure Access applications. A whole bunch of optimizations is necessary, and weaknesses of the JET/ACE database Engine must be bypassed if you want to create a high-performance application with server DBMS backend.

Share this article: Share on Facebook Tweet Share on LinkedIn Share on XING

Subscribe to my newsletter

*

I will never share your email with anyone. You can unsubscribe any time.
This email list is hosted at Mailchimp in the United States. See our privacy policy for further details.

Benefits of the newsletter subscription



© 1999 - 2024 by Philipp Stiefel - Privacy Policiy