GF104 does scale well: my scores come close to doubling going from a single GPU to SLI. But the highest score I can get with 2 x 460 overclocked to 900/1800/2000, with a 980X and PCIe @ 110MHz, is 24.9 with the Nvidia control panel settings left at default. My score went up about 10% running fullscreen instead of windowed, so those who ran the challenge windowed should re-run it fullscreen.
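For anyone who wants to put numbers on this, here is a rough Python sketch of the arithmetic. The function names are made up for illustration. It assumes "went up about 10%" means fullscreen is roughly 1.10 times the windowed score, and that perfect SLI scaling means N GPUs score N times a single GPU.

```python
def scaling_efficiency(single_score, multi_score, gpus=2):
    """Fraction of ideal multi-GPU scaling achieved (1.0 = perfect)."""
    if single_score <= 0 or gpus < 1:
        raise ValueError("single_score must be positive and gpus >= 1")
    return multi_score / (gpus * single_score)


def windowed_estimate(fullscreen_score, uplift=0.10):
    """Back out an approximate windowed score from a fullscreen one,
    assuming fullscreen = windowed * (1 + uplift)."""
    return fullscreen_score / (1.0 + uplift)


if __name__ == "__main__":
    # 24.9 is the fullscreen SLI score quoted above; the 10% uplift is approximate.
    print(round(windowed_estimate(24.9), 1))
```

With the 24.9 fullscreen score and a 10% uplift, the estimated windowed score comes out around 22.6. To get a scaling figure, feed your own single-GPU and SLI scores into `scaling_efficiency`; anything close to 1.0 means near-perfect doubling.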
That said, I think tessellation performance of the GF104 is a bit hobbled compared to the GF100/110 GPUs, beyond the expected difference from clock rates and number of SMs. I still don't think 460 SLI can match a single 580 at these crazy high levels of tessellation, but that may not be the case with games and other benchmarks (including Unigine Heaven) that use more typical levels of tessellation.
Could you re-run with LOD bias optimized and the lowest-quality texture filtering? I'd like to see whether that helps, or whether this benchmark is bound by something else.