MS SQL Server: cursor vs join, on a HUGE dataset

mavalpha

[H]F Junkie
Joined
Jan 3, 2005
Messages
10,448
I know the mantra, that cursors kill performance. However, in this case the dataset is enormous- I have a table with dozens of rows, and another with millions or more and the current (production) process iterates through the multi-million recordset once for each of the dozens. I know exactly how to replace the cursor with a join between the two tables, but the question is will the resulting single temp table that's dozens of times larger negate the advantage of removing triggers over just a couple of dozen passes? The current process takes about 4 hours with 14 iterations, just as an example.
 
I know the mantra, that cursors kill performance. However, in this case the dataset is enormous- I have a table with dozens of rows, and another with millions or more and the current (production) process iterates through the multi-million recordset once for each of the dozens. I know exactly how to replace the cursor with a join between the two tables, but the question is will the resulting single temp table that's dozens of times larger negate the advantage of removing triggers over just a couple of dozen passes? The current process takes about 4 hours with 14 iterations, just as an example.
There's nothing tangible thus far to provide more specific suggestions, so here's some high level questions you'll want to answer:

- What version of SQL Server are you running?
- What hardware is it running on? (CPU, RAM, drives and arrays, etc.) What does the Performance Monitor show you as far as hardware utilization and/or bottlenecks?
- What does the schema look like that you're returning, and what you are joining on?
- What kinds of time frames are you getting from profiler tools, such as SQL Profiler?
- What about executing sub-queries from your current query -- does including certain parts of the larger query drastically affect performance more than other blocks?
- Are indexes in use? If so, what are they focused on?
 
Hmm, this is somewhat confusing as Cursors and Joins aren't usually compared to each other. Maybe Cursors and While loops; Joins and Unions. Can you clarify with at least your cursor statement and what join statement you're comparing it to? Join, as the name suggests, is used to join datasets, whereas cursors are loops. Well, you can use join in a DML statement inside a cursor, but aside from that, not sure how they can be compared.
 
Sounds like a data modeling and setup issue.

Does the larger table have any foreign keys to the smaller table? What are the indexes of the two tables?

With the correct foreign keys and indexes, two large tables can be joined fairly quickly. At work, we have a similar setup. The smaller table has about 650K rows in and the larger has about 18M. We consistently join the tables together in about 30 seconds bring back about 300K records.
 
Show the execution plan of the join and the table definitions. Otherwise, we're arguing about guesses.

Why do you assume there's a temporary table?
 
Even with a table having millions of rows (in fact, especially with a table having millions of rows) a JOIN can be more appropriate than a cursor.

You need to elaborate on the table design, what the purpose of the JOIN is, and what your intended result it.
 
The OP never posted again. That's pretty galling.
 
Back
Top