
Linked Server Performance And Options

At work we have two servers; one runs an application a lot of people use, which has a SQL Server 2000 back end. I have been free to query this for a long time but can't add a

Solution 1:

Avoid joins to linked server tables.

Four-part naming can be used for the join, but it is more expensive. If you must join this way, include criteria that limit the data set coming from the linked server and that use its indexed columns.

Example:

SELECT loc.field1, lnk.field1
FROM MyTable loc
INNER JOIN RemoteServer.Database.Schema.SomeTable lnk
  ON loc.id = lnk.id
  AND lnk.RecordDate = GETDATE()
WHERE loc.SalesDate = GETDATE()

This query also applies a criterion in the join that the linked server can use before the join is evaluated.

The recommended method is the use of OPENQUERY.

By avoiding the join through the use of OPENQUERY, the local server only sends the query to be executed remotely, instead of sending a set of IDs across the link to resolve the join.

Use the link to retrieve a set of data and perform the calculations locally. Either use a temporary table (for ad hoc queries) or insert the rows into a permanent table in a nightly job.
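
A minimal sketch of that pattern, assuming a linked server named RemoteServer and made-up table and column names (the remote query string runs in the linked login's default database):

-- The filter runs entirely on the remote server inside OPENQUERY,
-- so only the slimmed-down result set crosses the link.
SELECT *
INTO #RemoteSales
FROM OPENQUERY(RemoteServer,
    'SELECT id, field1, RecordDate
     FROM dbo.SomeTable
     WHERE RecordDate >= DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)') AS r;

-- Join locally against the temp table instead of the four-part name.
SELECT loc.field1, r.field1
FROM MyTable AS loc
INNER JOIN #RemoteSales AS r
  ON loc.id = r.id
WHERE loc.SalesDate = GETDATE();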

Beginning a transaction may fail depending on whether the remote (distributed) transaction coordinator is configured for the linked server, and using it will consume more resources.

Also consider that you are hitting a production server running an application. While you do not specify it, I think it is safe to assume that it is using heavy transactions and doing inserts and updates, so you are taking resources away from the application.

Your purpose appears to be using the data for reporting. Your server can be set to the SIMPLE recovery model instead of FULL, making its logging more efficient.
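
For instance, assuming your local reporting database is named ReportingDB (a placeholder):

-- Switch the reporting database from FULL to SIMPLE recovery so the
-- transaction log is truncated automatically and stays small.
ALTER DATABASE ReportingDB SET RECOVERY SIMPLE;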

You will also avoid having your queries canceled due to data movement on the linked server. Always be mindful of setting the proper isolation level for your queries, and of table hints like NOLOCK.
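
For example (a sketch only; both forms allow dirty reads, so they are only appropriate when approximate reporting numbers are acceptable):

-- Session-level: everything in this batch reads without taking shared locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- Table-level alternative: hint individual tables in the query.
SELECT loc.field1
FROM MyTable AS loc WITH (NOLOCK)
WHERE loc.SalesDate = GETDATE();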

And PLEASE! Never place an OPENQUERY (or any linked server) inside a loop!

Solution 2:

When you use linked servers for joins like this, it is important that the server you are immediately connected to ("local") be the one with most of the data, with the linked server providing only a small part of it; otherwise, yes, it will pull across as much data as it needs to perform the join.

Alternatives include copying a subset of the data across to a temporary table, with as much filtering and pre-processing as possible done by the linked server to slim down the results, and then doing the join on the "local" side.

You may find you can easily boost performance by reversing the way you do it: connect to the server you have no control over (they'll need to create a linked server for you, as sketched below) and then reach your own server over that link. If you need to do major work with the data that would require creating sprocs, then push the data onto your server and use your sprocs there.
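
On their side that would look roughly like the following (all names are placeholders; these are the standard system procedures for creating a linked server and mapping a login):

-- Run on the server you do not control, pointing back at your reporting server.
EXEC sp_addlinkedserver
    @server = 'ReportServer',        -- name the link will be known by
    @srvproduct = '',
    @provider = 'SQLOLEDB',          -- SQL 2000-era provider
    @datasrc = 'REPORTHOST';         -- network name of your server

-- Map logins so remote jobs can write to your server.
EXEC sp_addlinkedsrvlogin
    @rmtsrvname = 'ReportServer',
    @useself = 'false',
    @rmtuser = 'report_loader',
    @rmtpassword = '********';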

In some cases, I simply had the linked server perform a nightly creation of this kind of summary, which it pushed to the local server, and the local server then performed its work with the join.
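
A minimal sketch of such a nightly job step, run on the remote server with made-up table names:

-- Aggregate on the remote side and push only the finished summary across the link.
INSERT INTO ReportServer.ReportDB.dbo.DailySalesSummary (SalesDate, TotalSales)
SELECT DATEADD(DAY, DATEDIFF(DAY, 0, SalesDate), 0), SUM(Amount)
FROM dbo.Sales
WHERE SalesDate >= DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 1, 0)   -- yesterday 00:00
  AND SalesDate <  DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)       -- today 00:00
GROUP BY DATEADD(DAY, DATEDIFF(DAY, 0, SalesDate), 0);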

Solution 3:

Royal pain

We used to have several linked servers at our shop and it turned out to be such a PITA.

First of all, there were severe performance problems similar to what you describe. I was shocked when I saw the network I/O stats. Despite all efforts, we failed to hint SQL Server into reasonable behavior.

Another problem was that stored procs had these linked server names hardcoded everywhere, with no way to override them. So developers couldn't easily test on their development sandboxes any functionality that touched linked servers. This was a major obstacle for creating a universally usable unit-test suite.

In the end we ditched linked servers completely and moved data synchronization to web-services.

Solution 4:

Queries involving semi-joins across a linked server tend not to be very efficient. You might be better off using OPENQUERY to populate data into a local temporary table and then work on it from there.
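
For example, a semi-join against a remote table could be rewritten along these lines (a sketch only, with made-up names): pull the remote keys once, then do the EXISTS check locally.

-- Materialize the remote key set in one round trip.
SELECT r.CustomerID
INTO #ActiveRemoteCustomers
FROM OPENQUERY(RemoteServer,
    'SELECT CustomerID FROM dbo.Customers WHERE IsActive = 1') AS r;

-- The semi-join now runs entirely on the local server.
SELECT o.*
FROM dbo.Orders AS o
WHERE EXISTS (SELECT 1
              FROM #ActiveRemoteCustomers AS rc
              WHERE rc.CustomerID = o.CustomerID);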

Solution 5:

I wrote a remote Linked Server application in SQL 2000 a couple of years ago and came across the same performance issues you describe. I ended up rewriting my stored procedures several times in order to obtain the best performance.

I used temporary tables extensively. I found that it was less expensive to retrieve large amounts of remote data into a temp table, then join to it, manipulate it, etc. Joining local to remote tables was very slow, as you describe.

Display Execution Plan and Display Estimated Execution Plan tended to help, although I did not understand a lot of what I was looking at.

I don't know if there really is an efficient way to do these queries with a remote server, because it seems like SQL Server cannot take advantage of its normal optimizations when going against a Linked Server. It may feel like you are transferring the entire table because, in fact, that is what is happening.

I am wondering if a replication scenario might work for you. By having the data on your local server, you should be able to write normal queries that will perform as desired.

I do not know of any good articles to point you towards. As I wrote more complicated SQL Server applications, I started to think that I needed a better understanding of how SQL Server worked underneath. To that end we bought the MS Press Inside Microsoft SQL Server 2005 series edited by Kalen Delaney here at work. Volume 1: The Storage Engine is definitely the place to start, but I have not gotten that far into it. Since my last few projects have not involved SQL Server, my study of it has gotten lax.
