Now that HBM is coming around, are RAM sticks a thing of the past?

alxlwson

Title asks the question... By 2018-2019, will there be no more large sticks of RAM, and instead small sockets with HBM packaged similarly to the CPU?
 
HBM, GDDR and DDR RAM all use more or less the same DRAM cells, just arranged in different ways.

The old traditional split was to make DDR low-latency/low-bandwidth and GDDR high-latency/high-bandwidth, because CPUs needed low latency above all else and GPUs needed high bandwidth above all else.

HBM is the "why not both?" technology, and will help along the inevitable merging of the GPU back onto the CPU - or, which would be really fun and interesting and disruptive, the merger of the CPU onto the GPU.

Oh, to answer your original question, HBM wouldn't be any more upgradeable on such systems than it is now on video cards. You would just buy CPU/GPU/HBM SoCs with sufficient memory for your needs at the beginning.
 
No, because HBM is severely limited: it literally cannot sit off the CPU package, and the size of the stacks is limited. I see HBM and DDR4 living together for a while; maybe in laptops, appliances, and tablets they will go HBM only.

Maybe HBM2 or 3 will replace DDR4 or 5, but only if it can go into the same style of package as DDR4.

Unless the future of the PC is pure disposable hardware, but I doubt that will happen.
 
I would hope and expect/like to see Intel start selling chips with 32GB of HBM (er, Memory Cube) but with the ability to add DDR4 or DDR5 sticks as added memory, with the potential of XPoint on those DIMMs as well.

That would make some seriously snappy systems.

L1 cache
L2 cache
L3 cache
32GB of Memory Cube (Intel/Micron version)
DDR DIMMs / DDR DIMMs with XPoint / just XPoint DIMMs
PCIe SSDs

That's how I see memory and storage turning out over the next 5-10 years.


HBM will not be that expensive down the road, and its benefits smack regular RAM around. There is a reason why Xeon Phi comes with 16GB of something similar to Memory Cube.

Again, the reason I expect this is because the lower latency is very beneficial. Read up on how the eDRAM (or whatever it is) on those Broadwell dies has helped those chips shine in various workloads. Read up on AnandTech and how low-latency memory makes a substantial difference, and that is 90ns vs 70ns; imagine going from 70-90ns down to 20ns... huge difference. (Those numbers are approximate; I forget the exact figures, but they are in the right ballpark.)
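
To put rough numbers on that, here is a back-of-envelope sketch (not a benchmark: the 4 GHz clock, the miss rate, and the base CPI below are made-up assumptions, purely for illustration) of how much a drop in memory latency could move the needle on a CPU that misses cache now and then:

/* Back-of-envelope sketch: impact of memory latency on effective CPI.
 * The clock speed, miss rate, and base CPI are assumed values for
 * illustration only, not measurements. */
#include <stdio.h>

int main(void) {
    const double clock_ghz = 4.0;            /* assumed core clock */
    const double misses_per_1k_instr = 5.0;  /* assumed LLC misses per 1000 instructions */
    const double base_cpi = 1.0;             /* assumed CPI with a perfect memory system */
    const double latencies_ns[] = {90.0, 70.0, 30.0};  /* DRAM vs. hypothetical HBM-class */

    for (int i = 0; i < 3; i++) {
        double miss_penalty_cycles = latencies_ns[i] * clock_ghz;  /* ns -> core cycles */
        double stall_cpi = (misses_per_1k_instr / 1000.0) * miss_penalty_cycles;
        printf("%5.0f ns miss latency -> %6.1f cycle penalty, effective CPI ~ %.2f\n",
               latencies_ns[i], miss_penalty_cycles, base_cpi + stall_cpi);
    }
    return 0;
}

With those assumed numbers, cutting the miss latency from 90ns to 30ns takes the penalty from ~360 to ~120 core cycles per miss, which is why it shows up even in "CPU-bound" workloads.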
 
The real game changer might be coming in the form of 3D XPoint (Optane) technology. Imagine nonvolatile memory which plugs into the RAM sockets and runs at RAM speed. In other words, no difference between RAM and nonvolatile storage anymore. That would be a total revolution, and it may start happening from 2017.

Entirely wrong. BW is not the issue; it is latency.

XPoint will be magnitudes better in latency than SSDs and PCIe SSDs, but it will not touch DRAM or Memory Cube. A lot of RAM's latency comes from the distance traveled, which is why cache and HBM-type memory are better: they are on the die/package. DRAM naturally has lower latency than XPoint, which is why DRAM is still better than XPoint even when both run on the same interface.

Again, latency is the issue: XPoint solves mass-storage latency but not system latency. HBM/Memory Cube help there because they are on the package/die and have inherently lower latency compared to XPoint.
 
If anything, I see massive RAM drives becoming a commonplace thing: use the super-low-latency 16GB or 32GB of HBM on the APU for both VRAM and regular RAM, then up to 128-256GB of DDR4 for Windows to live in, with an SSD as a lightly compressed OS image/loading drive and for file storage.

To work with that, how memory is handled would need to change at a fundamental level; Windows would need to be able to address all of that RAM and handle it properly.

In all honesty, I see HBM coming to APUs, laptops, and maybe tablets and phones long before it comes to everyday desktops.
 

Intel makes two quad-core dies as far as I know and just butchers them down, so there is little reason not to make both quad cores with Memory Cube, especially considering the performance increases. It is the only logical route: it saves money and energy, it's faster, and so on. HBM should, in the long haul, be cheaper to make than DIMMs, from my understanding.

Not in the first gen or two, of course. It will be like OLED: expensive in the beginning but cheaper in the long term.
 

It is exactly what you say that proves you wrong. HBM is just another level of cache. Evidently the more cache the better, but even the current levels are enough for most uses; the real bottleneck is between the RAM and mass storage. Having nonvolatile storage in RAM sockets will be a colossal jump, for example computers that appear always-on.
 

For initial loads, yes, but not for the actual application. It has been proven time and time again that lower RAM latency gives a tangible improvement.

Please note the context of this post: I am referring to how programs in general run in regards to snappiness, once they are loaded into the system. Most things run out of RAM or cache when you are using them. You need low latency and high BW to efficiently feed a processor, and general OS functionality and program functions are generally already in RAM, from my understanding. For most day-to-day applications this should be the case, but please correct me if I am wrong; this is what I have always read to be the case. This is something I intend to research further this year in a pet project of mine, in an effort to find the actual limitations of our current computers. I am eager to start my Kaby Lake build with XPoint (PCIe... DIMMs won't be out :/).

Go read Intel's documentation on Xeon Phi, Memory Cube, and XPoint, plus benchmarks of programs running on Broadwell with that extra cache. Why does Xeon Phi use 16GB of Memory Cube? (It's based off Memory Cube and not HBM, but not technically Memory Cube since it hasn't been standardized yet, the last I read.) It is because of latency and BW, but largely the latency (BW plays a role in latency depending on the size of the request; see my thread on 1st-word, 4th-word, and 8th-word requests in regards to RAM latency).

Look at AnandTech's coverage of latency. It is all about latency, and XPoint is only useful for the initial load, since programs run in cache/memory. There is a massive penalty even accessing RAM over cache; many programs have shown this with Broadwell's 64-128MB of extra cache, and it's why Memory Cube will be very dramatic. AnandTech showed there are tangible improvements in using lower-latency RAM, and that is talking about 60-100ns between different RAM sticks. With HBM/Memory Cube compared to RAM, we are talking about latencies as low as 30-50ns, so going from 60-100ns to a freakin' 30ns response time or so is game changing, which is why Xeon Phi uses a form of Memory Cube to keep the Airmont cores fed (FYI, this is the upcoming Phi).

XPoint has substantially more latency than RAM, so it only improves the initial load of a program and not the actual running of the program itself, because by then it should already be in RAM. XPoint will be game changing too, but it has no impact on actual programs once they are loaded into RAM, and it was and is never going to replace RAM.
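
As a rough illustration of that request-size point (the latencies and bus speeds below are assumptions picked for the arithmetic, not vendor specs): the total time to pull a block is roughly first-word latency plus transfer time for the rest of the block, so latency dominates small requests and BW dominates big ones.

/* Sketch: fetch time = first-word latency + transfer time.
 * The DDR4 and HBM figures here are assumed round numbers for
 * illustration, not datasheet values. */
#include <stdio.h>

static double fetch_ns(double first_word_ns, double bw_gbps, double bytes) {
    return first_word_ns + bytes / bw_gbps;   /* 1 GB/s == 1 byte per ns */
}

int main(void) {
    const double ddr_lat = 75.0, ddr_bw = 19.2;   /* assumed: one DDR4-2400 channel */
    const double hbm_lat = 40.0, hbm_bw = 128.0;  /* assumed: one HBM stack */
    const double sizes[] = {64.0, 4096.0, 65536.0};  /* cache line, page, 64 KiB block */

    for (int i = 0; i < 3; i++) {
        printf("%6.0f B: DDR4 ~%7.1f ns, HBM ~%7.1f ns\n",
               sizes[i],
               fetch_ns(ddr_lat, ddr_bw, sizes[i]),
               fetch_ns(hbm_lat, hbm_bw, sizes[i]));
    }
    return 0;
}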

My point is that the actual snappiness of a program is based on cache and system memory, not mass storage (also single-thread performance)... programs are coded like crap, but not total shit. Again, please note I am talking about user interface and overall system responsiveness, not loading. Again, correct me if I am wrong, but most programs do not access storage constantly: the programs you use daily... Office, the OS, browsers, widgets, programs' core parts, and so on. Now if we are talking about Photoshop and editing... that's a different story. At least I hope that's the case... if we really do access our SSDs for everything all the time, then I am really excited to get my XPoint, but I really doubt that's the case for the general user interface.

Latency plays a crucial role in maximizing the processor, and latency kills a processor's efficiency. That is why Broadwell's L4 and low-latency RAM show tangible improvements even in tasks that already "100%" load a CPU.

See below for a general reference of latencies in a system:
L1 ~1ns
L2 ~3ns
L3 ~10ns
L4 ~30-50ns (eDRAM and other competitors)
RAM ~60-100ns
XPoint DIMMs ~250ns
XPoint PCIe ~9000ns
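
For anyone who wants to poke at this on their own machine (the kind of measurement my pet project is after), a minimal pointer-chasing sketch like the one below gives a rough average-latency-per-load number; size the buffer under or over your L3 capacity and you can see the cache-vs-RAM steps. The 64 MiB buffer and the step count are just knobs to tweak, and the result is a rough single-threaded figure, nothing more.

/* Rough pointer-chasing latency sketch. Build with e.g. gcc -O2 chase.c
 * The buffer size and step count are assumptions; adjust to taste. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (64 * 1024 * 1024 / sizeof(size_t))  /* ~64 MiB: big enough to miss L3 on most parts */
#define STEPS (20 * 1000 * 1000)

int main(void) {
    size_t *next = malloc(N * sizeof *next);
    if (!next) return 1;

    /* Build a random single-cycle permutation (Sattolo's algorithm) so the
     * hardware prefetcher can't predict the next address. */
    for (size_t i = 0; i < N; i++) next[i] = i;
    srand(12345);
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;       /* j < i guarantees one big cycle */
        size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
    }

    /* Chase the pointers; each load depends on the previous one, so the loop
     * time is dominated by memory latency, not bandwidth. */
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    size_t p = 0;
    for (long s = 0; s < STEPS; s++) p = next[p];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("avg latency: %.1f ns per load (final index %zu)\n", ns / STEPS, p);
    free(next);
    return 0;
}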
 

But the problem is that currently HBM only works when it is on the GPU package. The only current source is the Fury X and Fury? So high-end enthusiast video cards. We're all salivating over the potential it has for changing the game with budget and embedded CPUs/APUs.

Unfortunately, the limitation of requiring such a short connection is not something that can be overcome easily, so HBM will likely never get moved off the CPU/GPU/APU package.

And I don't see computers doing away with all of the DIMM slots in favor of on-chip RAM. That said, eliminating external RAM slots would allow for massive reductions in motherboard size; even laptop slots use a lot of room...
 

Memory Cube is what's going on CPUs... not HBM...
 