Kaby Lake

SomeGuy133

Is the release H1 or H2 for 2016? I haven't heard one way or the other, and I've found things saying both H1 and H2.

EDIT: nm I see it was delayed and updated to September 2016
 
I wouldn't expect much out of that release, didn't they just make up Kaby Lake when they delayed something else?
 
"Introduce 256 byte block cache line for L1 cache (twice the bandwidth from Haswell and Skylake)"

I guess that's something.
 
I wouldn't expect much out of that release, didn't they just make up Kaby Lake when they delayed something else?

Seemingly so, since Cannonlake was slated/believed to be Skylake's successor.
 
It was supposed to be something like a Q2 release but was moved back to September. I am getting a binned HW since it'll only be a little slower than SKL. I don't want to get SKL and have to upgrade again 9 months later to KBL because SKL doesn't support Optane. Also, there is a chance that KBL will actually be socket compatible with Cannon Lake, if they use the same socket.
 
"Introduce 256 byte block cache line for L1 cache (twice the bandwidth from Haswell and Skylake)"

I guess that's something.

That's actually a big deal. It's something that has been wanted for a while. In the past it proved to actually hurt performance, but this time it's supposed to be a real gain.
 
Would also be interesting if they offered socketed SKUs with eDRAM as a L4 cache.

What I'm disappointed in though is the PCIe 4.0 delays and hence delayed adoption.
 
Yeah, the lack of PCIe bandwidth is going to kill workstations running high-speed stuff. I am currently building a workstation that is 100% tapped out.

GPU
XPoint (unsure if it will be supported on X99 I may have to upgrade :/)
10 Gb E
HBA
SAS Expander
USB 3.1 (If I could get rid of this I would be using some tuners)

I will have all 6 slots filled and just enough bandwidth for everything, since I've got 40 lanes.
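
For anyone wanting to check a lane budget like this, here is a rough sketch. The per-card lane widths are hypothetical and do not come from the post; only the 40-lane total does. Swap in the real link width of each card.

```python
# Rough PCIe lane budget check.
# ASSUMPTION: every lane width below is a placeholder, not a figure from the post.
CPU_LANES = 40  # lane count stated in the post

cards = {
    "GPU": 16,
    "XPoint SSD": 8,
    "10GbE NIC": 4,
    "HBA": 8,
    "SAS expander": 0,  # assumed to use the slot for power only
    "USB 3.1 card": 4,
}


def lane_budget(cards, available):
    """Return (lanes used, lanes left over) for a set of cards."""
    used = sum(cards.values())
    return used, available - used


used, spare = lane_budget(cards, CPU_LANES)
print(f"{used} of {CPU_LANES} lanes used, {spare} spare")
for name, lanes in cards.items():
    print(f"  {name}: x{lanes}")
```

A negative spare value would mean at least one card has to run at a narrower link or hang off the chipset instead.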

The lack of PCIe lanes and speed is going to murder Knights Landing too. I would buy one of those if this one program could run on it. I doubt that will ever happen, but I can dream.
 
You won't really see a huge improvement in consumer PCI-E lane numbers unless the storage and multimedia manufacturers start asking for it. Intel knows as a consumer, you can't just go pick another superior product with the PCI-E lanes they won't give you unless you jump to enterprise level gear, which they also happen to control the market in for x86 :p
 
That has nothing to do with it. The organization that makes the standard has not released a new one. Intel is not the major/only company in charge of the standard; Intel and 3 others make it. Even 40 lanes is a limiting factor in major work scenarios.

Even Intel knows that PCIe is outdated, and XPoint makes that clear. A single XPoint PCIe SSD requires an x16 slot. Plus, Xeon Phi is going to cause serious problems for PCIe. Xeon Phi Knights Landing is going to be 3 TFlops, and it's going to take a major penalty if it has to cache data to DDR4 memory. The "far" memory is going to kill it. Hence why they are giving it a massive 16GB of what is essentially Memory Cube.
 
You're trying to argue that enterprise level gear is the reason PCI-E is outdated on consumer PCs?

As someone who works in the tech industry, when I see someone making the argument you just did I have one response: Stop crying and cough up the money for a real professional grade system if you're that constrained. It irritates me to no end when a prospective customer wants to shoehorn a render box worth of crap into a workstation to save money, then are disappointed at the result or cry about how technology needs to advance faster to suit them.
 
Ha, hardly. One example is my NAS server, which wants something like 40+ lanes... well, really, something is obviously going to have to get fewer lanes than it needs. Another example is wanting to run a Phi for a program if it gets supported. X99 is a common Xeon motherboard for running Phis. What magical enterprise CPU does 4.5GHz and has more than 40 lanes or better bandwidth? None.

Another example is my desktop. I am single-thread limited, so SKL and Kaby Lake are the only options, so wtf am I supposed to do when I want 1-2 GPUs plus 10GbE and XPoint? X99 lags behind in single thread, and I'll upgrade for a mere 10% gain anytime I can.

That's 3 pieces of PCIe hardware that need nearly 36 lanes to run perfectly.
 
The IBM POWER 8 has 48 lanes per CPU module and is available at up to 5.0 GHz clock speeds, and I'm sure that counts as enterprise. It's not uncommon to have HUNDREDS of PCI-E lanes in true enterprise hardware via multiple CPU modules and bridges. Those bridges are available for x86 enterprise gear too.

The problem is you chose to go with the x86 platform, and you probably don't have access to the sales materials for anything meant for more than SMBs. Also, you're just being hardheaded. You're crying about trying to run all this on your desktop? FFS, stop being so cheap and actually cough up the money to do more than just buy a few trendy new technologies and plug them into a retail consumer board.
 
Agreed.
I manage around 5K physical servers.
From small x86 blades to big HP UX rigs.

There are right tools for the job... and there are bad tools for the job.
Whining because you picked the wrong tool (and blaming technology at the same time) is stupid... and annoys me too.
 
As I said, I am single-thread limited in many tasks (particularly on my main desktop)... that POWER8 or whatever sucks... like, seriously... it's crap.
http://www.anandtech.com/show/9193/the-xeon-e78800-v3-review/11

So a 5GHz POWER8 has the same single-thread performance as a 3.3GHz Haswell... wow, is that a fail. Granted, that's only one benchmark, but that's a massive deficit.

Yes I am completely out of line to expect to be able to use 10Gb E, XPoint, and a GPU with a CPU that has the best single thread available. /sarc

That is honestly not asking much.

Also, PLX? You've got to be shitting me. Why the hell would I use something that adds even more latency to a piece of hardware that is designed to give me the lowest-latency memory standard possible? -_- Great intelligent advice /sarc


The only option is putting the 10GbE on the PCH and giving x8 each to XPoint and the GPU. That will result in performance penalties, but at least not massive ones.
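
A quick back-of-the-envelope sketch of that split, for reference. It assumes PCIe 3.0 signalling (8 GT/s per lane, 128b/130b encoding) and treats the PCH uplink as roughly equivalent to a PCIe 3.0 x4 link, which is an assumption about the platform, not something stated in this thread. Real-world throughput will be lower because of protocol overhead.

```python
def pcie3_gbps(lanes):
    """Approximate one-direction PCIe 3.0 bandwidth in GB/s."""
    # 8 GT/s per lane, 128b/130b line encoding, 8 bits per byte
    return lanes * 8 * (128 / 130) / 8


# The x8/x8 split of the 16 CPU lanes described above
allocation = {"GPU": 8, "XPoint SSD": 8}
for name, lanes in allocation.items():
    print(f"{name}: x{lanes} = {pcie3_gbps(lanes):.2f} GB/s "
          f"(x16 would be {pcie3_gbps(16):.2f} GB/s)")

# ASSUMPTION: the PCH uplink behaves like a PCIe 3.0 x4 link
dmi_uplink = pcie3_gbps(4)
ten_gbe = 10 / 8  # 10 Gb/s line rate expressed in GB/s
print(f"10GbE needs {ten_gbe:.2f} GB/s of a {dmi_uplink:.2f} GB/s shared uplink "
      f"({ten_gbe / dmi_uplink:.0%})")
```

The NIC alone fits in the uplink, but it shares that link with everything else on the chipset (SATA, USB and so on), which is where the penalty would come from.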
 
YES, you actually are out of line, because you're not being realistic. The type of equipment you're asking for has zero market outside of servers, which usually don't give a damn about single-threaded performance.

You asked for an example of a CPU with certain specs. I gave you one and your response is to try to take one review and dismiss it even though it exceeded what you asked for an example of. And for the record, that review isn't even complete :p POWER 8 can boost its single threaded performance by shutting down cores on the module on the fly, and clocking the remaining cores sky high. Let's see your cute little Haswell toy do that.

I'm sorry you are too poor to afford real enterprise equipment to adequately support your need for bragging rights to prop up your ego, and have to resort to crying on the forums because the real world doesn't cater to people like you. You've already had TWO professionals in the hardware business flat out tell you you're out of line, and yet you persist.

Also this is right from the review you linked:
"Comparing Xeon E7 v3 and POWER8

Although the POWER8 is still a power gobbling monster, just like its older brother the POWER7, there is no denying that IBM has made enormous progress. Few people will be surprised that IBM's much more expensive enterprise systems beat Intel based offerings in the some high-end benchmarks like SAP's. But the fact that 24 POWER8 cores in a relatively reasonably priced IBM POWER8 server can beat 36 Intel Haswell cores by a considerable margin is new."

Yeah, POWER 8 sucks so much it beats your Haswell example with a third fewer cores (24 vs. 36) and comes with the extra PCI-E lanes you're crying about not having on your desktop.
 
Again, not in single thread... you seriously can't read. My requirement was PCIe lanes and single thread. You're obviously making shit up as you go.

Also, as I stated above, even 48 lanes is laughable when you have technology like Xeon Phi that can guzzle bandwidth like nobody's business. Currently I don't have a use for the Xeon Phi because the program I use isn't threaded enough, but I have been reaching out to the devs to see if they would support it, and if they did, Xeon Phi would be a prime example of PCIe 3.0 lacking speed. No matter how many lanes you have, it doesn't matter: PCIe can't access quad-channel memory at full speed. It is a limiting factor that won't change until PCIe 4, so again, you don't know shit. Also, PCIe 4 x16 still can't touch quad-channel memory speeds.
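
To put rough numbers on that comparison, here is a sketch of theoretical peak figures. It assumes quad-channel DDR4-2133 (the memory speed is not given in the post) and the standard PCIe 3.0/4.0 per-lane rates of 8 and 16 GT/s with 128b/130b encoding. These are peak, one-direction numbers, not measured throughput.

```python
def pcie_x16_gbps(gt_per_s):
    """Peak one-direction bandwidth of a x16 link in GB/s (128b/130b encoding)."""
    return 16 * gt_per_s * (128 / 130) / 8


def ddr4_gbps(mt_per_s, channels):
    """Peak DDR4 bandwidth in GB/s; each 64-bit channel moves 8 bytes per transfer."""
    return mt_per_s * 8 * channels / 1000


pcie3 = pcie_x16_gbps(8)    # PCIe 3.0: 8 GT/s per lane
pcie4 = pcie_x16_gbps(16)   # PCIe 4.0: 16 GT/s per lane
quad = ddr4_gbps(2133, 4)   # ASSUMPTION: quad-channel DDR4-2133

print(f"PCIe 3.0 x16:        {pcie3:6.2f} GB/s")
print(f"PCIe 4.0 x16:        {pcie4:6.2f} GB/s")
print(f"Quad-channel DDR4:   {quad:6.2f} GB/s")
print(f"PCIe 4.0 x16 covers {pcie4 / quad:.0%} of the memory bandwidth")
```

Under these assumptions a x16 link stays well below quad-channel memory bandwidth in both generations, which matches the point being made.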

BTW, being a "professional" doesn't mean you know what you're talking about, because earlier you tried claiming a 5GHz POWER8 was fast in single thread, which was patently false, as I showed with objective data.

But hey lets keep this going because this is comical.

Also I really don't want that power8 because it sucks for what I need so yea..you can keep your "enterprise" stuff.
 
If your software is reliant on single-thread performance, then why would they be willing to port it to a massively parallel Phi when they won't properly multithread it for a normal CPU? If it could be parallelized that well, they would have done so already. It makes it sound like you're just inventing something out of thin air to support your argument.


You linked to ONE review, which was using inaccurate specs even when it was presenting the review. You also apparently didn't read the entire review either as they even admit POWER 8 beat the Haswell by 2% when running single threaded software when you let the POWER 8 actually use all 8 of its threads per core. Yes, it actually can manage threading more intelligently than an Intel platform. Imagine that.

So anyways, some of us have real lives to live so enjoy your frustrated hardware reality that you've self imposed since you obviously know more than everyone else.
 
Irrelevant. As I said, that does not matter for my use case. Additionally, you have used nothing but hearsay and conjecture and provided no objective data, so why would I believe you?

Also, who said I only use 1 program? Who said I only needed something for a single case? Nearly everything you use on a day-to-day basis on a desktop is single threaded, and the one program has nothing to do with my needs for my main desktop or NAS/server. You're assuming all my needs are about 1 single project. I have 3 separate needs and projects. I have no use for a POWER8 for my NAS/server; a 1650v3 works perfectly. The desktop needs single thread badly, which is why a SKL build makes sense, except I get screwed on PCIe lanes. Get me a 5GHz Haswell Xeon and I'll take it over a SKL, but that won't happen, will it, smart guy? The other side project, the one I would use a Phi for if it was supported, is a different topic. It too is currently single threaded for the most part, and the reason is that the program is still in development, hence why I have been mustering support from others who use this program to try to get the devs to support Phi. Does POWER8/9 have 3 TFlops for 150-250 watts? Didn't think so... so why would I use that, again? Assuming the program was threaded enough.
 
This might be hard for you to understand in your cart-before-horse world, but your premise that Intel should cater to YOUR specific niche usage won't fly.

Your scenario is insignificant in the bigger picture... no one could create a business plan around it that makes sense.

That doesn't mean technology is stagnating...it just means your ego is out of proportions.

Trying to shoehorn technology into your misconceived notion is your mistake, not some fault of technology...
 
"Introduce 256 byte block cache line for L1 cache (twice the bandwidth from Haswell and Skylake)"

I guess that's something.

Where are you getting that from? I'd love for that to be true.
 
Hopefully they get some eDRAM on the desktop again, would be good. I like the concept of Broadwell with eDRAM cache.
 