CERN Generates 1PB of Data Per Second

CommanderFrank

If you think you have gigantic storage problems with your computer data, think again. The experiments at CERN generate a staggering 1 petabyte of data every second that an experiment is running on the Large Hadron Collider. I don’t know what they pay the IT staff there, but I’m sure it isn’t nearly enough.

“To analyse this amount of data you need the equivalent of 100,000 of the world’s fastest PC processors. CERN provides around 20 per cent of this capability in our datacentres, but it’s not enough to handle this data,” he said.
 
That's a lot of accelerated particle porn!!!!

100K fastest PC processors? Hrrmm....

CERN@Home project? :D
 
Petabit switch?

More like eight petabits, and nothing that can do that exists. They must have some beastly hardware to push all of that data around. AFAIK, even state-of-the-art backbone routers top out at a few hundred terabits.
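
Quick back-of-envelope on that (the router capacity below is my rough assumption, not a vendor spec):

[code]
# 1 PB/s expressed as raw link capacity -- rough numbers only
bytes_per_s = 1e15                   # 1 petabyte per second
bits_per_s = bytes_per_s * 8         # = 8e15 bit/s = 8 petabits/s

router_bps = 300e12                  # assume ~300 Tb/s for a top backbone router
print(bits_per_s / 1e12, "Tb/s total")            # 8000.0 Tb/s
print(bits_per_s / router_bps, "routers' worth")  # ~26.7 of them just to carry it
[/code]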
 
I remember reading about this in a magazine a couple of years back. Supposedly their software is meant to intelligently determine which events are worth recording and which aren't... I always wondered what they'd miss, but there's nothing out there that can handle such a huge amount of data.
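
I'd guess the idea looks something like this toy filter (purely illustrative; the real triggers are hardware pipelines plus big software farms, and these cuts are made up):

[code]
import random

def level1(event):
    # cheap, fast cut: total deposited energy above some threshold
    return event["energy"] > 50.0

def high_level(event):
    # slower, more selective cut applied only to level-1 survivors
    return event["energy"] > 200.0 and event["tracks"] >= 4

events = ({"energy": random.expovariate(1 / 40.0),
           "tracks": random.randint(0, 10)} for _ in range(1_000_000))

kept = sum(1 for e in events if level1(e) and high_level(e))
print(f"kept {kept} of 1,000,000 events")  # only a small fraction survives
[/code]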
 
Very interesting. Now my next question to all of you out there: how long before we have the technology to process this at the desktop level? A decade? Less? I'm betting on 15 years. The only roadblock would be the usefulness of software today.

I.e., businesses don't need newer processors for the average desktop user beyond the Core 2 Duo from 4 or 5 years ago. New processor advancements boil down to doing more with less power for businesses.

If we can get slightly faster, drop the discrete GPU and network layer, and reduce a unit's power consumption by, say, 15%, businesses would jump ALL OVER that. This is the kind of thing IBM has been building towards for the past decade, and it's evident in the new i5 line of CPUs and so on. Unless you're a gamer, you don't need more power.

This is making power users like you and me more and more of a niche market. We have advanced to the point that the newest and greatest is no longer cherished; what is cherished day to day is what you can do with what you have. Productivity, whether entertainment or professional use, is the next generation.

In the IT world, the people who need the new i7 processors and heaps of memory are those going 64-bit and running multiple VMs on their workstations or laptops.

Joe Schmoe typing in Word or working on big Excel spreadsheets will likely never max out a C2D, let alone a quad- or 8-core CPU.
 
This is making power users like you and me more and more of a niche market. We have advanced to the point that the newest and greatest is no longer cherished; what is cherished day to day is what you can do with what you have. Productivity, whether entertainment or professional use, is the next generation.


I blame Apple for dumbing down people with their "ecosystem"... power users come from the ranks of those who have to learn something to make all their toys play together... now if you just make sure there is an Apple logo on the back of it, you know it will work with all your other Apple-logo stuff...

Good for dumb consumers... bad for enthusiasts, as our bread-and-butter companies dumb themselves down to compete in that market...
 
That's a lot of data! :eek: But are they any closer to finding the Higgs Boson yet?
 
I read this on my phone and saw "Large Hardon Collider" and thought... hmm dick smashing

Then the brain kicked in, the lisdexia faded, and I saw hadron... the quantum physics particle.

But seriously, 1 petabyte per second seems a little ridiculous. Storage would fill up in under two minutes... unless of course they meant 8 quadrillion qubits (quantum bits).
 
So when are they going to make a CERN GPGPU application so we can all help?
 
CERN only records data when a particle collision occurs, and each collision lasts about a billionth of a second, so each experiment only logs a few gigs at best.
 
Yeah, but don't their experiments (i.e. the collisions) basically last for a nanosecond?
 
Very interesting. Now my next question to all of you out there: how long before we have the technology to process this at the desktop level? A decade? Less? I'm betting on 15 years. The only roadblock would be the usefulness of software today.

I.e., businesses don't need newer processors for the average desktop user beyond the Core 2 Duo from 4 or 5 years ago. New processor advancements boil down to doing more with less power for businesses.

If we can get slightly faster, drop the discrete GPU and network layer, and reduce a unit's power consumption by, say, 15%, businesses would jump ALL OVER that. This is the kind of thing IBM has been building towards for the past decade, and it's evident in the new i5 line of CPUs and so on. Unless you're a gamer, you don't need more power.

This is making power users like you and me more and more of a niche market. We have advanced to the point that the newest and greatest is no longer cherished; what is cherished day to day is what you can do with what you have. Productivity, whether entertainment or professional use, is the next generation.

In the IT world, the people who need the new i7 processors and heaps of memory are those going 64-bit and running multiple VMs on their workstations or laptops.

Joe Schmoe typing in Word or working on big Excel spreadsheets will likely never max out a C2D, let alone a quad- or 8-core CPU.

No home computer will ever reach this power. We will run into the three walls long before this much power can be made any smaller than a medium-sized home.
 
I blame Apple for dumbing down people with their "ecosystem"... power users come from the ranks of those who have to learn something to make all their toys play together... now if you just make sure there is an Apple logo on the back of it, you know it will work with all your other Apple-logo stuff...

Good for dumb consumers... bad for enthusiasts, as our bread-and-butter companies dumb themselves down to compete in that market...

Nope, you are wrong. It comes from the fact that hardware is greatly outpacing software and what average people do with that software. 90% of people don't need more than a dual-core i3; hell, even a Core 2 Duo is good enough for most. 90% of people don't need 8GB of RAM or 3TB of hard drive space.

15 years ago, people upgraded because it made the things they cared about and used faster: web browsing, a new OS, music, movies, etc. Now, with quad-core really the minimum you can buy unless you try hard, who needs more than that? Intel is releasing hex-core mainstream processors next year, but what average person really needs that? Hell, even most power users don't need that much processing power.

This is why you're seeing ARM rise not only in mobile but in desktops and laptops in the coming years: they do everything the average person needs, with longer battery life and less heat and power.

I suggest you tone down being a fanboy and use a little critical thinking.

No home computer will ever reach this power. We will run into the three walls long before this much power can be made any smaller than a medium-sized home.

How short-sighted and naive. I bet you think silicon will be used in chips forever, huh? Well, any informed person can see that there are no fewer than a dozen technologies ready to take the crown for the next paradigm shift. There will be 3D chips to bridge the gap, and not the 3D Intel is currently talking about. From there you have things like integrated photonics, graphene and other carbon-based solutions, and things like memristors acting as CPUs.
 
For probably 95% of the analysis going on at the LHC experiments, LHC@home is pointless.

The vast chunk of the analysis is done by users, and each user's analysis is different and requires access to detector information that cannot be easily chopped up and sent out. The complete geometry of a detector is many, many GB in size and is needed for every analysis, as are recent alignment studies of the detector and its state (info on thousands of detector cells being on/off/faulty/noisy needs to be included for each run).

LHC@home, from what I remember, is for the actual accelerator simulations; what this article is talking about is the data produced by the experiments on the accelerator: ATLAS, CMS, LHCb and ALICE.

The data is shipped out to different physics departments around the world using the LHC Computing Grid (http://en.wikipedia.org/wiki/LHC_Computing_Grid). For ATLAS, there exists a tier system:

Tier 0 - CERN - complete copy (a lot of it on tape) - restricted access
Tier 1 - big national physics research sites like Fermilab, Rutherford Appleton Lab (RAL), etc. - complete copy of the RAW data plus some post-processing - restricted access
Tier 2 - large universities/sites with lots of computational and storage power - partial copy of certain data streams and more post-processing - general physics access
Tier 3 - other sites, not as large as the Tier 2s, but still with a decent cluster of CPUs available - general physics access

If you want to see a realtime monitor of the grid in action, try http://rtm.hep.ph.ic.ac.uk/webstart.php. It will take a little while to get going (2-5 minutes connecting to sites), but once it's up and running you will be able to see all the sites around the world and the data transfers going on.

CERN is basically there for storage; it has its own Tier 1 as well for analysis jobs, but the main idea is that the users (the physicists) send their analysis to the site that contains the data they want, and it is then run locally there.

The vast chunk of analysis jobs take place at the Tier 2s and 3s, which hold "slimmed/skimmed/thinned" datasets (or other silly terms; just think good data with some simple analysis cuts) produced at the Tier 1s, which the users are expected to run on. That vastly reduces the data storage requirements at the cost of higher CPU requirements.

What this means in practice is that a user will often be sending out "jobs" to two, three or more sites around the world that contain the chunks of data they need. I've had jobs running in Tokyo, Amsterdam and California from a single submit.
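
In spirit it's just data-locality scheduling; here's a toy sketch of the idea (site and dataset names are made up, not real grid identifiers):

[code]
# send the job to whichever site holds each dataset chunk,
# instead of pulling petabytes of data to the user
replica_catalog = {
    "data11.streamA": ["TOKYO-T2", "NIKHEF-T2"],
    "data11.streamB": ["SLAC-T2"],
}

def submit(job, datasets):
    for ds in datasets:
        site = replica_catalog[ds][0]   # naive: pick the first listed replica
        print(f"sending {job} over {ds} to {site}")

submit("my_analysis", ["data11.streamA", "data11.streamB"])
[/code]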

As a rough figure, each collision at ATLAS is about ~200 MB in size at the stage we analyse it (they like to quote less, but it isn't less in reality); you can shrink that down to ~50 MB at the Tier 2/3s by removing particle info you aren't interested in and applying some simple cuts.

In total the LHC has produced 1 fb^-1 (one inverse femtobarn) of collisions, which is about (and this is very hand-wavy) 1.25x10^10 events at each experiment. I'll let you do the sums on the amount of data already being saved around the world for each of the four experiments, but it's a heck of a lot.
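
Spelling out those sums with the rough per-event figures above (all hand-wavy, as noted):

[code]
events = 1.25e10            # ~events per experiment in 1 fb^-1 (hand-wavy)
for label, mb_per_event in (("full ~200 MB", 200), ("slimmed ~50 MB", 50)):
    pb = events * mb_per_event / 1e9   # MB -> PB
    print(f"{label}: ~{pb:,.0f} PB per experiment")
# full: ~2,500 PB; slimmed: ~625 PB -- and that's one experiment of four
[/code]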
 
This is why you're seeing ARM rise not only in mobile but in desktops and laptops in the coming years: they do everything the average person needs, with longer battery life and less heat and power.

ARM is also rising because it offers a more efficient instruction set than the really old x86, which is still used by almost everyone. The x86 instruction set is one of the few techs still keeping computers from being more powerful. It's like playing your PS3 on a TV from the '70s.
 
How short-sighted and naive. I bet you think silicon will be used in chips forever, huh? Well, any informed person can see that there are no fewer than a dozen technologies ready to take the crown for the next paradigm shift. There will be 3D chips to bridge the gap, and not the 3D Intel is currently talking about. From there you have things like integrated photonics, graphene and other carbon-based solutions, and things like memristors acting as CPUs.

Lol, are you serious? You think this will be cheap enough and reliable enough to replace silicon-based chips?


There is no possible way that this kind of power can be made for home use, ever. End of story.
 