MS SQL Server: cursor vs join, on a HUGE dataset

mavalpha · Oct 17, 2012

I know the mantra, that cursors kill performance. However, in this case the dataset is enormous- I have a table with dozens of rows, and another with millions or more and the current (production) process iterates through the multi-million recordset once for each of the dozens. I know exactly how to replace the cursor with a join between the two tables, but the question is will the resulting single temp table that's dozens of times larger negate the advantage of removing triggers over just a couple of dozen passes? The current process takes about 4 hours with 14 iterations, just as an example.

PTNL · Oct 17, 2012

mavalpha said:
I know the mantra, that cursors kill performance. However, in this case the dataset is enormous- I have a table with dozens of rows, and another with millions or more and the current (production) process iterates through the multi-million recordset once for each of the dozens. I know exactly how to replace the cursor with a join between the two tables, but the question is will the resulting single temp table that's dozens of times larger negate the advantage of removing triggers over just a couple of dozen passes? The current process takes about 4 hours with 14 iterations, just as an example.

There's nothing tangible thus far to provide more specific suggestions, so here's some high level questions you'll want to answer:

- What version of SQL Server are you running?
- What hardware is it running on? (CPU, RAM, drives and arrays, etc.) What does the Performance Monitor show you as far as hardware utilization and/or bottlenecks?
- What does the schema look like that you're returning, and what you are joining on?
- What kinds of time frames are you getting from profiler tools, such as SQL Profiler?
- What about executing sub-queries from your current query -- does including certain parts of the larger query drastically affect performance more than other blocks?
- Are indexes in use? If so, what are they focused on?

jad0083 · Oct 30, 2012

Hmm, this is somewhat confusing as Cursors and Joins aren't usually compared to each other. Maybe Cursors and While loops; Joins and Unions. Can you clarify with at least your cursor statement and what join statement you're comparing it to? Join, as the name suggests, is used to join datasets, whereas cursors are loops. Well, you can use join in a DML statement inside a cursor, but aside from that, not sure how they can be compared.

402blownstroker · Oct 30, 2012

Sounds like a data modeling and setup issue.

Does the larger table have any foreign keys to the smaller table? What are the indexes of the two tables?

With the correct foreign keys and indexes, two large tables can be joined fairly quickly. At work, we have a similar setup. The smaller table has about 650K rows in and the larger has about 18M. We consistently join the tables together in about 30 seconds bring back about 300K records.

mikeblas · Oct 30, 2012

Show the execution plan of the join and the table definitions. Otherwise, we're arguing about guesses.

Why do you assume there's a temporary table?

Zangmonkey · Oct 30, 2012

Even with a table having millions of rows (in fact, especially with a table having millions of rows) a JOIN can be more appropriate than a cursor.

You need to elaborate on the table design, what the purpose of the JOIN is, and what your intended result it.

mikeblas · Nov 7, 2012

The OP never posted again. That's pretty galling.

Zangmonkey · Nov 7, 2012

mikeblas said:
The OP never posted again. That's pretty galling.

To be fair, my dot-folders have never worked in this subform for some reason.

mikeblas · Nov 7, 2012

dot-folders?

Zangmonkey · Nov 8, 2012

mikeblas said:
dot-folders?

the icon to indicate when a thread has new replies.

PTNL · Nov 8, 2012

Zangmonkey said:
the icon to indicate when a thread has new replies.

That excuse doesn't hold up when you consider this thread had its first reply 16 minutes after its creation.

MS SQL Server: cursor vs join, on a HUGE dataset

mavalpha

[H]F Junkie

PTNL

Supreme [H]ardness

jad0083

[H]ard DCOTM March 2022

402blownstroker

[H]ard|DCer of the Month - Nov. 2012

mikeblas

[H]ard|DCer of the Month - May 2006

Zangmonkey

Supreme [H]ardness

mikeblas

[H]ard|DCer of the Month - May 2006

Zangmonkey

Supreme [H]ardness

mikeblas

[H]ard|DCer of the Month - May 2006

Zangmonkey

Supreme [H]ardness

PTNL

Supreme [H]ardness