Pixar's RenderFarm



Sun worked closely with a team from Pixar to create its RenderFarm, which serves as Pixar's central resource of computer processing power. The RenderFarm uses a network computing architecture in which a powerful SPARCserver(TM) 1000 acting as a "texture server" supplies the necessary data to the many rendering client workstations needed to complete the rendering process.

The RenderFarm was assembled by Sun and Pixar engineers in less than a month and drew upon Sun's own experience in setting up "farms" of many systems linked together. Some facts about Pixar's RenderFarm and the computing aspects of "Toy Story":

-- The RenderFarm is one of the most powerful rendering engines ever assembled, comprising 87 dual-processor and 30 four-processor SPARCstation 20s and an 8-processor SPARCserver 1000. The RenderFarm has the aggregate performance of 16 billion instructions per second -- its total of 300 processors represents the equivalent of approximately 300 Cray 1 supercomputers.

-- Each system is the size of a pizza box, and all 117 systems work in a footprint measuring just 19 inches deep by 14 feet long by 8 feet high.

-- Sun is the price/performance leader, in Pixar's own rankings. The SPARCstation 20 HS14MP earned a rating of $80 per Rendermark (a Pixar measurement for rendering performance), while the comparable SGI Indigo Extreme came in at approximately $150 per Rendermark.

-- Using one single-processor computer to render "Toy Story" would have taken 43 years of nonstop performance.

-- Each of the movie's more than 1,500 shots and 114,000 frames were rendered on the RenderFarm, a task that took 800,000 computer hours to produce the final cut. Each frame used up 300 megabytes of data -- the capacity of a good-sized PC hard disk -- and required from two to 13 hours for final processing.

-- In addition to the high-resolution final rendering, the RenderFarm was also used to generate the test images animators needed to plan and evaluate lighting, texture mapping and animation. Since fast response is key in doing tests, RenderMan could produce test frames in as little as a few seconds.

-- Scalability is built-in: the RenderFarm can be upgraded (with more processors and disk storage) to a nearly four-fold performance level, without requiring any additional space. The RenderFarm also integrates seamlessly with Pixar's existing computer network containing different types of machines.

Now Dreamworks:

Shrek render farm 2001



Product - Quantity (CPUs) - OS - CPU type - Description



SGI Origin200 - 406 - IRIX 6.5 - Dual R10000 180MHz - 512MB SGI RAM, 3U + 1 rackmount, 9GB HDD

PC Vendor #1 - 292 - Linux - Dual PIII 450MHz - 1GB PC-100 SDRAM, 2U rackmount, 39GB HDD

SGI 1200 - 324 - Linux - Dual PIII 800MHz - 2GB PC-133 SDRAM, 2U rackmount, 39GB HDD

PC Vendor #2 - 270 - Linux - Dual PIII 800MHz - 2GB PC-133 SDRAM, 1U rackmount, 39GB HDD

SGI O2 - 190 - IRIX 6.5 - Single R10000 - 256-512MB SGI RAM, 9GB HDD



Total



1482 CPUs - 836 boxes - 443 dual-processor Linux boxes, 203 dual O200s, and 190 O2s.

I've cut a piece of an article and am updating it by comparing it to my own render farm. URL: http://findarticles.com/p/articles/mi_m0EIN/is_1995_Dec_4/ai_17812444/

Nobody worked closely with me. I'm using outdated 2005 hardware that was destined for a recycler. Luckily, I got some assistance from [H] member kogepathic through AMD forum member AndersN, which netted me a BIOS that gave my motherboards a shot in the arm for double the performance capability. I don't need an amazing server to handle the load of ten compute nodes, or even twenty if I decide to go that far (note: I wrote this when I was planning ten nodes and ended up with eight).

In 1995, they were using the 10BaseT Ethernet their SPARCstation 20s came with. Okay, maybe 100BaseT if they added expansion cards, but they didn't. Texture server... probably because the nodes were limited to 512MB RAM. Likely so, since the SPARCserver 1000 was able to take 2GB RAM. Each of my nodes will take 2GB almost for free and 8GB without costing me too much, while the server will take 16GB.

Theirs required engineers. Mine required... me, power tools, good timing. That, and a lot of determination. I wish I could do it in a month. Hell, it's taking me half a year the way I've planned it out. (This was written when my target completion date was April 29 to coincide with Ubuntu 10.04 LTS being released.)

That's right! One of their processors equals a Cray 1. Yes. In 1979, Popular Science said the Cray 1 "will cruise along at 80 MFLOPS." Across 300 processors, that's an aggregate speed of 24 GFLOPS. Mine, at 281 GFLOPS (492 GFLOPS in 14-node form), wipes the floor with theirs. Granted, I'm doing this fifteen years after the Toy Story farm was built, with technology from 2005. One dual-core AMD Opteron 275 CPU (17.6 GFLOPS) almost matches their entire 1995 farm, and I have sixteen of them. (28 now)

My farm fits within four square feet and it's counter-top height. (Okay, that was with the short rack.
It now fits within eight square feet and that includes the cooling system.)

(note: Sun? Oracle, now.) They're comparing it to an Indigo2 Extreme, not an Indigo "1". The article has a typo. They compared an I2 with a 200MHz R4400 CPU, the slow end of the I2 line. I know this because I own an I2 R10K-195 Impact. I don't know a thing about their metric, but since my entire rig is intended to fall within a budget of $3000 (note: still within that range)... I think I'm getting some good value here. I saw that the 30 quad-processor machines cost $47,395 each at the time. The 87 duals were $43,895 each. Okay, their CPUs ran at 100MHz in both dual and quad machines. I just did the math and found that cluster, not counting any discounts they may have gotten, cost $5.24 million. That's almost 1,750 times my own cluster budget.

So 80 MFLOPS means 43 years? Let's do a little math here. I'm going to surmise their farm did it in 1/300th as long with 300 CPUs. That's 52 days and 8 hours. Scaling by peak FLOPS (80 MFLOPS against 17.6 GFLOPS), rendering Toy Story on one of my AMD 275s would take 71 days and 8 hours. One dual-CPU node would take 35 days and 16 hours. My whole farm in 16-CPU form would do it in 4.5 days, using 22 dollars worth of electricity. (In 28-CPU form, 2 days 13 hours.)

300MB? That's it? Okay, granted, the hard disk I was using in 1995 was 420MB. I could fit 300MB in a small corner of my RAM, let alone a modern hard disk. My boot disks are microdrives 6GB in size and that's a desktop hard disk size from 1998. (Again, obsolete data. I planned the microdrives when I was going to use the Verari nodes as-is. I have 160GB disks now, which makes this even more funny.)

The film was rendered at 1536x922 pixels. I'm really only going for 1280x720, which is 65% as many pixels. I don't know how big of a deal the number of pixels actually is anymore.

Scalability isn't a luxury these days; it's a requirement. I wouldn't be worth a damn if I couldn't connect a lot of computers and have them cooperate.
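As a rough check on the cost and render-time figures above, here is a small Python sketch that redoes the arithmetic. It assumes 365-day years and that render time scales inversely with peak FLOPS, which real rendering workloads only approximate. The constant and function names are made up for this example; the prices, CPU counts and FLOPS ratings are the ones quoted in the post.

```python
# Rough check of the cost and render-time arithmetic, using figures from the post.
# Assumptions (not from the article): 365-day years, and render time scaling
# inversely with peak FLOPS. All names here are illustrative.

QUAD_COUNT, QUAD_PRICE = 30, 47_395   # four-processor SPARCstation 20s
DUAL_COUNT, DUAL_PRICE = 87, 43_895   # dual-processor SPARCstation 20s
MY_BUDGET = 3_000                     # dollars

SINGLE_CPU_YEARS = 43                 # article's single-processor estimate
FARM_CPUS = 300                       # article's rounded processor count
CRAY1_MFLOPS = 80                     # one 1995 farm processor, per the post
OPTERON_275_MFLOPS = 17_600           # one dual-core Opteron 275, per the post
DAYS_PER_YEAR = 365


def days_hours(days):
    """Split a fractional day count into (whole days, hours rounded to nearest)."""
    whole = int(days)
    return whole, round((days - whole) * 24)


# List-price cost of the 1995 farm's SPARCstation 20s, and how it compares to $3000.
farm_cost = QUAD_COUNT * QUAD_PRICE + DUAL_COUNT * DUAL_PRICE
cost_ratio = farm_cost / MY_BUDGET

# Render times under the linear-scaling assumption.
single_cpu_days = SINGLE_CPU_YEARS * DAYS_PER_YEAR
farm_days = single_cpu_days / FARM_CPUS
opteron_days = single_cpu_days * CRAY1_MFLOPS / OPTERON_275_MFLOPS

print(f"1995 farm cost: ${farm_cost:,} ({cost_ratio:.0f}x my budget)")
print("1995 farm, 300 CPUs:", days_hours(farm_days))
print("one Opteron 275:", days_hours(opteron_days))
print("one dual-CPU node:", days_hours(opteron_days / 2))
print("16 CPUs:", days_hours(opteron_days / 16))
print("28 CPUs:", days_hours(opteron_days / 28))
```

Under those assumptions the sketch gives $5,240,715 for the 1995 hardware (roughly 1,747 times the $3000 budget), 52 days 8 hours for the 300-CPU farm, 71 days 8 hours for a single Opteron 275, and about 4 days 11 hours for sixteen of them, which lines up with the rounded figures in the text.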
A 24-port gigabit switch will allow care and feeding of sixteen nodes with eight lines to anything else I want to link to, such as servers, workstations and NAS boxes: two ports for a NAS box, two ports for a server and four ports for a shotgunned link to the workstation switch. I doubt I'll realistically need more than twenty nodes. With technology moving the way it is, I'll be able to replace the motherboards by the time the cluster is inadequate. Only the power supplies may need changing.

Pentium 3 CPUs manage one FLOP per cycle.

P3-450 boxes: 131.4 GFLOPS

SGI P3-800s: 259.2 GFLOPS

P3-800 boxes: 216 GFLOPS

SGI R10000 CPUs manage 2 FLOPs per cycle.

180MHz Origin 200s: 146.16 GFLOPS

200MHz (estimated) SGI O2s: 76 GFLOPS

828.76 GFLOPS total

By the same standard of manufacturer design numbers, my Opterons nuke four FLOPs per clock cycle.

16 2.2GHz DC Opterons: 281.6 GFLOPS

My farm is 34% the speed of the 2001 Dreamworks farm. That's a bit of perspective there. I would need 24 nodes of the same config to have that kind of performance. I'll keep what I've got. (note: No, I won't. 14 nodes gives me 60% of the Shrek farm's performance.)
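The peak-FLOPS tallies above can be reproduced with a short Python sketch. The CPU counts and clock speeds come from the Shrek farm table, and the FLOPs-per-cycle values are the design numbers quoted in the post. The 200MHz O2 clock is the post's own estimate, and the names used here are made up for the example. Everything is kept in integer MFLOPS so the sums come out exact.

```python
# Peak-FLOPS tally: CPUs x clock (MHz) x FLOPs per cycle, in MFLOPS.
# Counts and clocks are from the Shrek farm table; FLOPs-per-cycle values are the
# post's design numbers. The O2 clock is the post's estimate. Names are illustrative.

SHREK_FARM = [
    # (label, CPU count, clock in MHz, FLOPs per cycle)
    ("PC Vendor #1, PIII 450MHz", 292, 450, 1),
    ("SGI 1200, PIII 800MHz", 324, 800, 1),
    ("PC Vendor #2, PIII 800MHz", 270, 800, 1),
    ("SGI Origin200, R10000 180MHz", 406, 180, 2),
    ("SGI O2, R10000 200MHz (estimated)", 190, 200, 2),
]

# One dual-core Opteron 275: 2 cores x 2200 MHz x 4 FLOPs per cycle (post's figure).
OPTERON_275_MFLOPS = 2 * 2200 * 4
CPUS_PER_NODE = 2


def peak_mflops(cpus, mhz, flops_per_cycle):
    """Theoretical peak in MFLOPS for a group of identical CPUs."""
    return cpus * mhz * flops_per_cycle


shrek_mflops = sum(peak_mflops(n, mhz, fpc) for _, n, mhz, fpc in SHREK_FARM)
toy_story_mflops = 300 * 80            # 300 processors at 80 MFLOPS each
my_16_mflops = 16 * OPTERON_275_MFLOPS
my_28_mflops = 28 * OPTERON_275_MFLOPS

# Whole nodes needed to reach the Shrek farm's peak (ceiling division on integers).
node_mflops = CPUS_PER_NODE * OPTERON_275_MFLOPS
nodes_to_match = -(-shrek_mflops // node_mflops)

for label, n, mhz, fpc in SHREK_FARM:
    print(f"{label}: {peak_mflops(n, mhz, fpc) / 1000:.2f} GFLOPS")
print(f"Shrek farm total: {shrek_mflops / 1000:.2f} GFLOPS")
print(f"Toy Story farm: {toy_story_mflops / 1000:.2f} GFLOPS")
print(f"16 Opterons: {100 * my_16_mflops / shrek_mflops:.1f}% of the Shrek farm")
print(f"28 Opterons: {100 * my_28_mflops / shrek_mflops:.1f}% of the Shrek farm")
print(f"Nodes needed to match the Shrek farm: {nodes_to_match}")
```

These are theoretical peaks only. They say nothing about memory bandwidth, network speed or how well a renderer actually uses the hardware, so the percentages are a ballpark comparison rather than a benchmark.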