Intel Officials Allegedly Say that Apple Could Move to ARM Soon

I don't know if I'm dumber or smarter for having read this thread.
Haha, I know how you feel - the more you know, the more you realize you don't know. :D
Every time I read one of juanrga's posts, he brings me back to school and shows me just how little I actually know about ARM and x86-64 alike, but his posts certainly are nothing short of professional and insightful on what will come to pass! (y)
 
Haha, I know how you feel - the more you know, the more you realize you don't know. :D
Every time I read one of juanrga's posts, he brings me back to school and shows me just how little I actually know about ARM and x86-64 alike, but his posts certainly are nothing short of professional and insightful on what will come to pass! (y)


The only thing we "know" is that Apple has some incredibly powerful Workstation-class ARM CPUs they're installing in their iPad Pro. Apparently this gives every rumor site on the internets carte blanche to make up the Apple Rumor of the Month.

Did it ever occur to you fools that Apple is taking the slow route, encouraging developers to port professional applications to iOS, while Apple has been making the OS more capable?

The end plan to make this all work seamlessly is to merge iOS and MacOS apps development into one, making the old MacOS redundant.

Within ten years of Marzipan's official release, Apple will be through transitioning all SUPPORTED computing to iOS and ARM. That isn't rumor, it's a fact.
 
Obviously they don't scale linearly: in [H]'s own test, performance diminished greatly beyond 16 cores, since the memory bandwidth (data transfer rate) couldn't keep all 24/32 cores properly fed.

Here is another site showcasing this very limitation:
https://www.pcworld.com/article/329...ng-amds-32-core-threadripper-performance.html

View attachment 145373

View attachment 145374

To quote the article:


That is the complete opposite of scaling linearly, as you stated it could.
Scaling linearly would mean no bandwidth (data transfer rate) loss with the addition of more nodes/cores, but with AMD's current design, that isn't happening at all; not trying to knock AMD or anything, the tech and interconnects just are what they are.

This makes no sense. Losing bandwidth doesn't make the scaling non-linear; a linear equation can have any slope.

Guess what

f(x) = 10^45 * x

is linear.

f(x) = 10^-45 * x

is linear as well.

In big-O notation you don't care about the scalars, just the variable.

Depending on how many times each node must be visited, you get constant, logarithmic, fractional, linear, quadratic, polynomial, or exponential growth, all the way up to factorial. There are a few classes in between, made up of combinations of, or limitations on, these base factors.

For the most part any path algorithm will be linear. I really can't think of one that isn't.
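To make the point concrete, here's a minimal Python sketch (the function names and constants are mine, purely illustrative): linearity is about proportionality, not the size of the slope.

```python
def f_big(x):
    return 1e45 * x   # enormous slope, still linear

def f_small(x):
    return 1e-45 * x  # tiny slope, still linear

# Linearity check: doubling the input doubles the output,
# no matter how large or small the constant in front is.
for f in (f_big, f_small):
    assert f(2 * 7.0) == 2 * f(7.0)
```

Both functions pass the same proportionality test, which is the whole point: the constant factor never changes the complexity class.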

Regardless: what magic sauce will the ARM ecosystem have that will allow it to magically scale better than any of the current topologies?

Again, x86-64's days are numbered. If AArch64 (ARM64) continues to move forward, and consumer-grade, cost-effective systems and development platforms appear (as Linus Torvalds stated was needed), then AArch64 will most certainly be the true death of, and rightful successor to, x86-64.

ARM is a worthy contender, but their ecosystem needs work. I look forward to a robust platform and hope they can improve some of their limitations. Their vector system is really good at 3-channel data (RGB) but not so good at logic or swizzling. Their 64-bit implementation is roughly equal to x86's in terms of backwards compatibility, but the driver space isn't there.

Take a look at this.



MickMake is a great reviewer of single board computers and does well showing the robustness of the platform along with the immaturity of it. His previous video did a good roundup of many of these platforms.

You can have all the processing power in the world at the best power envelope, but unless you have the tools for users to enjoy it, it's just a platform of frustration.

I wish them all the success in the world.
 
The only thing we "know" is that Apple has some incredibly powerful Workstation-class ARM CPUs they're installing in their iPad Pro. Apparently this gives every rumor site on the internets carte blanche to make up the Apple Rumor of the Month.

Did it ever occur to you fools that Apple is taking the slow route, encouraging developers to port professional applications to iOS, while Apple has been making the OS more capable?

The end plan to make this all work seamlessly is to merge iOS and MacOS apps development into one, making the old MacOS redundant.

Within ten years of Marzipan's official release, Apple will be through transitioning all SUPPORTED computing to iOS and ARM. That isn't rumor, it's a fact.
Did it ever occur to you that this is what I have been saying throughout this thread? :confused:
I agree with you (except for the "you fools" comment), and once Apple has that ARM64-based developer platform, things will take off in the ARM64 space like never before; this is just what Linus Torvalds said needed to happen.

Not that I think Apple is wonderful or anything, but if it starts making ARM64 (AArch64) more mainstream with laptops and/or desktops, we may see competition from other vendors and platforms favoring ARM64 over x86-64.
It's just speculation, but the more I'm seeing, the more it is starting to sound like a reality. (y)
 
I don't think you understand big O notation then.

AMD's interconnects scale linearly.

Nope. Linear means it scales as O(N). The interconnect inside a CCX scales as O(N²), and the interconnect used to link dies on EPYC/Threadripper also scales as O(N²). This bad scaling is the reason why there are only 4 cores in a CCX (8 cores would be better).
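For anyone following along, the two growth rates being argued about can be sketched with a toy link count in Python (the topologies here are generic textbook ones, not AMD's or ARM's actual wiring): a fully connected (all-to-all) interconnect needs C(n,2) = n(n-1)/2 links, which grows as O(N²), while a simple ring needs only O(N).

```python
def full_mesh_links(n):
    """Direct links when every core talks to every other core: C(n, 2)."""
    return n * (n - 1) // 2   # grows on the order of N^2

def ring_links(n):
    """Links in a simple ring topology."""
    return n                  # grows as O(N)

for cores in (4, 8, 16):
    print(cores, full_mesh_links(cores), ring_links(cores))
# 4 cores:  6 links vs 4
# 8 cores:  28 links vs 8
# 16 cores: 120 links vs 16
```

Note the jump from 6 to 28 links when going from 4 to 8 fully connected cores: the core count doubles, the wiring more than quadruples.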

It's a great processor, to be sure, and delivers quite a bit of power for its package, but it still lags behind x86.

I use ARM on a daily basis in the Note 9, and I encounter it on the iPhone; it is lacking.

And I don't care about the SPEC benchmarks; whatever purpose they serve, they don't do jizzle bits for me. They don't seem to mean or relate to anything meaningful when it comes to real-world use.

I hope you aren't comparing a phone processor to a laptop/desktop processor. Are you? If you replaced the ARM chip in your phone with an x86 chip (Zen, Coffee Lake, or some Atom), your phone wouldn't only be slower; it would also be bigger and hotter.

SPEC benchmarks are a mainstream way of measuring performance in the industry, and the whole suite uses workloads developed from real user applications.
 
Did it ever occur to you that this is what I have been saying throughout this thread? :confused:
I agree with you (except for the "you fools" comment), and once Apple has that ARM64-based developer platform, things will take off in the ARM64 space like never before; this is just what Linus Torvalds said needed to happen.

Not that I think Apple is wonderful or anything, but if it starts making ARM64 (AArch64) more mainstream with laptops and/or desktops, we may see competition from other vendors and platforms favoring ARM64 over x86-64.
It's just speculation, but the more I'm seeing, the more it is starting to sound like a reality. (y)


When I said "You fools," I was referring to the opening paragraph, i.e. "every rumor site on the Internets," and the people who mindlessly repost their trash as "news." :rolleyes:

Sorry, I should have said "those fools," but it was late and I can't always English :D

People keep claiming that Apple will transition their powerful laptops over to ARM in the near future, but they already started a few years back. It's just going to be another decade before the transition is complete.
 
Nope. ARM64 is an independent, clean ISA developed from scratch. ARM's engineers used the transition to 64-bit to eliminate all the legacy modes, useless instructions, and weird stuff from the ISA. This is different from AMD64, which is only an extension of x86 and so inherited all the useless and weird stuff from x86. There are hundreds of x86 instructions that aren't used anymore in new code, but modern 64-bit chips still have to support those ancient instructions, thanks to the smartness of AMD's engineers.

That has nothing to do with what I said.
 
Nope. Linear means it scales as O(N). The interconnect inside a CCX scales as O(N²), and the interconnect used to link dies on EPYC/Threadripper also scales as O(N²). This bad scaling is the reason why there are only 4 cores in a CCX (8 cores would be better).

It's a point-to-point protocol. It's literally A to B. The complexity is 1, so even if it were O(N²), doing N × N operations, 1 × 1 = 1.

Imagine sorting 2 points. You can only do one operation.

Regardless of your bastardization of big O notation, what magic sauce makes one interconnect topology greater than another and what does ARM's mesh do better?
 
Nope. Linear means it scales as O(N). The interconnect inside a CCX scales as O(N²), and the interconnect used to link dies on EPYC/Threadripper also scales as O(N²). This bad scaling is the reason why there are only 4 cores in a CCX (8 cores would be better).



I hope you aren't comparing a phone processor to a laptop/desktop processor. Are you? If you replaced the ARM chip in your phone with an x86 chip (Zen, Coffee Lake, or some Atom), your phone wouldn't only be slower; it would also be bigger and hotter.

SPEC benchmarks are a mainstream way of measuring performance in the industry, and the whole suite uses workloads developed from real user applications.




You can't have it both ways: either ARM is an x86 killer or ARM is an alternative. If it's a killer, it's going to be compared against x86, regardless of application.

I have seen this argument before, and it's absolute hogwash. If you want the processor to be an x86 killer, then it had better perform better than x86, and it doesn't.

And when you start trying to point out that x86 doesn't fit in a cellphone: we don't care. We know x86 is not big power in small packages; no one is disputing that.

It's this whole ARM mantra coming in and saying "x86 is dead, bro," and there is jack to support it.

While that SPEC mark may be "using workloads developed from real user applications,"

it is not reflected in other benchmarks across other real-world applications.


Even taking a little more time to read up on ARM:

Even its power savings aren't as great as they are made out to be, because it starts sucking down power under load and seemingly has heat issues.

And even more damning is the complete lack of ARM hardware out there,

illustrated in this article https://www.servethehome.com/updated-cavium-thunderx2-power-consumption-results/ and its companion article https://www.servethehome.com/cavium-thunderx2-review-benchmarks-real-arm-server-option/

While we're still looking at a bunch of meaningless benchmarks, at least we get a few relatable ones, and this one shows ARM absolutely crushing x86 in 7-Zip, yet still lagging behind in others.

OK, so how about we stop with the x86-killer stuff?

ARM is not an x86 killer.

It is an alternative that brings options to the table, and options are always a good thing (especially when it might make Intel wake the hell up).

The only way ARM will ever replace x86 is if the entire industry just ups and gives up on x86, and even then, that would set us back by about five years in performance if we all had to switch to ARM.
 
I hope you aren't comparing a phone processor to a laptop/desktop processor. Are you? If you replaced the ARM chip in your phone with an x86 chip (Zen, Coffee Lake, or some Atom), your phone wouldn't only be slower; it would also be bigger and hotter.
I will second this.
We have a few Panasonic Toughpads at work with Intel Atom quad-cores - those are tremendously slower, in overall performance, compared to their ARM counterparts, each running the same Android OS and version, too.

Not to mention, there are some massive x86-64 incompatibilities and bugs we've been dealing with for over a year now in Android compared to the ARM units which all have no issues.
Also, yes, comparing a phone processor to a laptop/desktop processor is just silly in their example, haha - agreed!
 
It's a point to point protocol. It's literally A to B. The complexity is 1, so even if it was O(N²), having to do N x N operations, 1 x 1 = 1.

Imagine sorting 2 points. You can only do one operation.

Regardless of your bastardization of big O notation, what magic sauce makes one interconnect topology greater than another and what does ARM's mesh do better?

And you are the guy who accuses others of not understanding "big-O notation"?

ROFL

AMD's interconnects scale as O(N²). For a four-core CCX the interconnect has 6 links; a hypothetical eight-core CCX would require 28 links. The number of cores has doubled (4 → 8), but the number of links has more than quadrupled (6 → 28), because the interconnect scales as O(N²), not linearly as you pretend. The interconnect doesn't scale up well: not only would power grow nonlinearly, but latency would too, killing performance.

The same happens with the IF interconnect in Naples, which scales up badly. On the other hand, ARM has designed an interconnect that scales as O(N). Using that interconnect they can scale beyond 128 cores while keeping power and latency under control.
 
I will second this.
We have a few Panasonic Toughpads at work with Intel Atom quad-cores - those are tremendously slower, in overall performance, compared to their ARM counterparts, each running the same Android OS and version, too.

Not to mention, there are some massive x86-64 incompatibilities and bugs we've been dealing with for over a year now in Android compared to the ARM units which all have no issues.
Also, yes, comparing a phone processor to a laptop/desktop processor is just silly in their example, haha - agreed!

An OS originally developed to run on ARM architecture works better on ARM processors than on non-ARM processors. You don't say?
 
You can't have it both ways: either ARM is an x86 killer or ARM is an alternative. If it's a killer, it's going to be compared against x86, regardless of application.

I have seen this argument before, and it's absolute hogwash. If you want the processor to be an x86 killer, then it had better perform better than x86, and it doesn't.

And when you start trying to point out that x86 doesn't fit in a cellphone: we don't care. We know x86 is not big power in small packages; no one is disputing that.

It's this whole ARM mantra coming in and saying "x86 is dead, bro," and there is jack to support it.

While that SPEC mark may be "using workloads developed from real user applications,"

it is not reflected in other benchmarks across other real-world applications.

You can have both: an alternative and a killer. ARM servers are an alternative to x86 servers, and Apple is choosing ARM as an alternative to x86.

ARM is a killer for the reasons explained before. And as already mentioned, even Intel knows ARM will replace x86.

Even taking a little more time to read up on ARM:

Even its power savings aren't as great as they are made out to be, because it starts sucking down power under load and seemingly has heat issues.

And even more damning is the complete lack of ARM hardware out there,

illustrated in this article https://www.servethehome.com/updated-cavium-thunderx2-power-consumption-results/ and its companion article https://www.servethehome.com/cavium-thunderx2-review-benchmarks-real-arm-server-option/

While we're still looking at a bunch of meaningless benchmarks, at least we get a few relatable ones, and this one shows ARM absolutely crushing x86 in 7-Zip, yet still lagging behind in others.

OK, so how about we stop with the x86-killer stuff?

ARM is not an x86 killer.

It is an alternative that brings options to the table, and options are always a good thing (especially when it might make Intel wake the hell up).

The only way ARM will ever replace x86 is if the entire industry just ups and gives up on x86, and even then, that would set us back by about five years in performance if we all had to switch to ARM.

You are making the same mistakes as when you quoted the Cloudflare blog, so I can make the same remarks.

First of all, ARM doesn't win only in 7-Zip. It wins on SPECint, STREAM, and Whetstone. For example, on SPECint:


Cavium-ThunderX2-SPEC-Int-Rate-Peak-gcc7.jpg


Second, STH tested a non-production sample, not the final chips.

Third, STH used GCC7, as in the above case, whereas in other cases they used GCC7 only for the ARM chip and the Intel/AMD compilers for Skylake and EPYC. As is well known, the Intel and AMD compilers inflate SPEC scores by using compiler tricks. GCC doesn't cheat, so when performance is compared using GCC on all the hardware, ThunderX2 wins.

Fourth, GCC7 lacks any performance improvements for ThunderX2. GCC7 has performance improvements for both Zen and Skylake, but it lacks them for ThunderX2, because the hardware is new. This is mentioned in the article, which you obviously didn't read: "The flip side to this is that we did not have the same level of optimized binaries that some of the custom Arm compilers, e.g. from Cray, would provide in this test."

Indeed, Cray worked with Cavium to add ThunderX2-specific optimizations to its compiler: CCE gives up to 25% higher scores than GCC. Cavium has also worked with the GCC developers to optimize the compiler for ThunderX2; the new GCC version gives about 10-15% higher scores than those quoted in the STH review.

About power consumption: the updated review shows what happens when you test non-final systems. A simple firmware update reduced power consumption by ~300W compared to the original article, and final silicon will shave off an extra 30-40W, as mentioned in the updated article. Also, they aren't exactly comparing apples to apples: this is total platform power, and while they did their best to configure the systems identically, the systems aren't the same.

Finally, not only is ThunderX2 a 16nm SoC while the others are 14nm, which gives a power advantage to the x86 chips, but the x86 chips are also mature designs made by engineers with years of experience designing servers. The history of ThunderX2 is different. Broadcom engineers, who lacked any prior experience with the ARM ISA or with servers, designed the Vulcan microarchitecture: their first server and their first ARM design. Broadcom eventually abandoned servers and the Vulcan project and sold the IP, which Cavium acquired, renamed ThunderX2, and finished. Nor are Cavium's engineers experts in ARM or servers; ThunderX2 is their second design, and ThunderX was a completely different beast, with in-order cores without SMT, ...

Conclusion: despite a node disadvantage and a lack of experience, Cavium's first serious ARM server matches or even beats the best x86 chips that Intel and AMD can offer.
 
An OS originally developed to run on ARM architecture works better on ARM processors than on non-ARM processors. You don't say?
Um, OS X runs on x86-64 just as well as iOS does on AArch64, and iOS was directly derived from OS X.
I have also run multiple versions of Linux, and its respective programs, on ARM and x86/x86-64 alike, with great success and near-equivalent performance on each, depending on the CPU and optimization of the software, not to mention numerous other ISAs, both old and new.

We are trying to find ways to directly compare these two ISAs with the examples we have at hand, since, as multiple sites have stated (thanks juanrga!), AArch64 and x86-64 are vastly different architectures, and ports do need optimization to take full advantage of NEON and other instruction sets.
Just saying "You don't say?" doesn't really add anything to this discussion, and really so far, you have only added your personal opinions to this thread - I would like to see you actually find some benchmarks, threads, sites, or something at least to backup what you have been saying.

I'm not saying you are wrong, but you aren't really adding anything outside of your opinion, and that doesn't hold much weight compared to the benchmarks and sources for both AArch64, and x86-64 alike, being shared.
Could you at least post something to backup what you are saying, please? :)
 
Um, OS X runs on x86-64 just as well as iOS does on AArch64, and iOS was directly derived from OS X.

And OS X and iOS aren't the SAME thing, so that makes no sense. RT was derived from the NT kernel, but it cannot seriously be used as an analog for Windows 8.1 or Windows 10.

I have also run multiple versions of Linux, and its respective programs, on ARM and x86/x86-64 alike, with great success and near-equivalent performance on each

Really? Ok, what are these exact identically configured ARM and x86-64 desktops, workstations, laptops, and servers?

depending on the CPU and optimization of the software

Oh, so now we have to optimize the code to get that.................

Remember that for a moment.

We are trying to find ways to directly compare these two ISAs

No, you're not, and that is the problem. You have declared a winner beforehand and you are trying to twist a narrative to fit that.

Could you at least post something to backup what you are saying, please? :)

Sure. Here is my proof that the issue is an OS issue not a hardware issue for x86 or advantage for ARM.
 
It's a point to point protocol. It's literally A to B. The complexity is 1, so even if it was O(N²), having to do N x N operations, 1 x 1 = 1.
How is the complexity "1" when the bandwidth (data transfer rate) is literally halved, going from 50GB/s to 25GB/s, when more modules are added?
That doesn't even make sense, and is the polar opposite of scaling linearly.
 
And OSX and iOS aren't the SAME thing so that makes no sense. RT was derived from the NT kernel but it can not be seriously used as an analog for Windows 8.1 or Windows 10.
Actually, on the back-end, yes, OS X and iOS are the same.
iOS was directly derived from OS X and each have been developed nearly simultaneously.

You are right about Windows RT compared to 8.1/10, though.

Really? Ok, what are these exact identically configured ARM and x86-64 desktops, workstations, laptops, and servers?
They aren't "identical", but are similar enough in performance benchmarks (synthetic and real-world) to each other that a reasonable comparison can be made with the same OS and software ports/optimizations.

Oh, so now we have to optimize the code to get that.................

Remember that for a moment.
I do remember that, and actually, it was juanrga who specifically brought it up in this thread, via the benchmarks and software-comparison blog post that he shared.
https://blog.cloudflare.com/neon-is-the-new-black/

Optimization is important for any ISA, not just AArch64 and x86-64.
You are acting like we have been saying ARM in general doesn't need optimization, and we never said that - quite the opposite, actually.

No, you're not and that is the problem. You have decalred a winner before hand and you are trying to twist a narrative to fit that.
Are you for real?
Look back at post 69 and read what I actually wrote:

With whatever comes to pass in the next few years, I am going to refer back to this thread.
If I was right with ARM64 succeeding, I would like you to admit it - and if you were right with x86-64 continuing, I will gladly admit it. (y)
That is hardly "declaring a winner", as you stated.
You didn't reply to it, either, big shocker. :meh:

Sure. Here is my proof that the issue is an OS issue not a hardware issue for x86 or advantage for ARM.
Perhaps I should have said that, for the apps we know work and are specifically optimized for x86-64 on Android, all but the oldest ARM-based Android tablets we have still outperform the Toughpads. Those Panasonic Toughpads cost thousands, so they aren't some off-the-shelf $100 tablets, yet they are outperformed by tablets that are older and cost significantly less.
Do you have ANY ACTUAL BENCHMARKS to share? Because this is starting to seem like borderline trolling at this point, which is shocking to see from one of the best editors on here.

Come on man, we're all on the same side, I do respect you, and I'm really trying to have a legitimate debate and conversation with you - like I said, if you can prove me wrong (not through opinions, but actual benchmarks, market data, etc.) I will gladly admit that you were right and I was wrong, and I will not backpedal on that statement. (y)
ARM64, in 2019, will not beat x86-64, and I do agree with you on that. Ten years from now, however, we are speculating that ARM64 will have taken over, or at least become a major competitor to x86-64, and that is what we are arguing.
 
And you are the guy who accuses others of not understanding "big-O notation"?

ROFL

AMD's interconnects scale as O(N²). For a four-core CCX the interconnect has 6 links; a hypothetical eight-core CCX would require 28 links. The number of cores has doubled (4 → 8), but the number of links has more than quadrupled (6 → 28), because the interconnect scales as O(N²), not linearly as you pretend. The interconnect doesn't scale up well: not only would power grow nonlinearly, but latency would too, killing performance.

The same happens with the IF interconnect in Naples, which scales up badly. On the other hand, ARM has designed an interconnect that scales as O(N). Using that interconnect they can scale beyond 128 cores while keeping power and latency under control.

Scaling is relative, especially if you have different rules. Most network scaling is based on the number of hops or paths between nodes. Every hop, you pay a penalty. That penalty is O(N) and is what I was referring to as scaling.

I guess you were saying, number of complexes in a package x number of interconnects = O(N) growth? I guess this is a metric, though completely meaningless.

The AMD CCX, no matter how many cores, is shared through its L3. CCX to CCX is shared via IF (Infinity Fabric).

The limiting factor on a CCX is not the interconnects but rather the pressure on the L3. AMD decided 4 was the right balance; there is nothing preventing them from putting 8 cores on an L3.

If you apply the same rules, the ARM ecosystem scales via CML (Coherent Multichip Link) to 2. Anything over that is an extra hop and thus another connection. The same silicon doesn't get you different consideration (cough cough, Kentsfield).

So your obscure use of big O notation to declare unique scaling is completely false.

This is why I kept asking. What makes one interconnect topology greater than another and what does ARM's mesh do better?

Hopefully this post will clear up any miscommunication.

PS: a network of nodes with one-hop point-to-point communication scales as the combination C(n,2):

C(2,2) = 1
C(3,2) = 3
C(4,2) = 6
C(5,2) = 10
C(6,2) = 15
C(7,2) = 21
C(8,2) = 28
C(16,2) = 120

Not N²
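For what it's worth, the two sides here are compatible: the table's counts follow C(n,2) = (n² - n)/2 exactly, and big-O simply drops the 1/2 and the -n term, which is how the same numbers end up being called O(N²). A quick Python check (illustrative):

```python
from math import comb

# The table above, as {number_of_nodes: links}:
expected = {2: 1, 3: 3, 4: 6, 5: 10, 6: 15, 7: 21, 8: 28, 16: 120}

for n, links in expected.items():
    # C(n,2) and n(n-1)/2 agree with every entry in the table.
    assert comb(n, 2) == n * (n - 1) // 2 == links

# C(n,2) = (n^2 - n) / 2: not exactly N^2, but in big-O terms the
# constant 1/2 and the -n are dropped, so C(n,2) is still O(N^2).
```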
 
Scaling is relative, especially if you have different rules. Most network scaling is based on the number of hops or paths between nodes. Every hop, you pay a penalty. That penalty is O(N) and is what I was referring to as scaling.
So even when the bandwidth (data transfer rate) is dropped from 50GB/s to 25GB/s, you are saying that is still scaling linearly?
 
How is the complexity "1" when the bandwidth (data transfer rate) is literally halved, going from 50GB/s to 25GB/s, when more modules are added?
That doesn't even make sense, and is the polar opposite of scaling linearly.

So even when the bandwidth (data transfer rate) is dropped from 50GB/s to 25GB/s, you are saying that is still scaling linearly?

Yup.

PS

f(x) = 0.5x

is a linear equation.
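A quick sketch of why (hypothetical numbers): a slope of 0.5 still satisfies the definition of linearity.

```python
def f(x):
    return 0.5 * x  # slope 1/2: output is half the input

# A function is linear if f(a + b) == f(a) + f(b) and f(c * x) == c * f(x).
# A slope below 1 satisfies both, so "50GB/s in, 25GB/s out"
# can still be a linear relationship.
assert f(50) == 25
assert f(10 + 20) == f(10) + f(20)
assert f(3 * 4) == 3 * f(4)
```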

It would be best if you let juanrga and I fight this out.
 
Yup.

PS

f(x) = 0.5x

is a linear equation.

It would be best if you let juanrga and I fight this out.
That's a good point, I'm interested to see what you both come up with.
Thanks for explaining, that helps! (y)
 
You can have both: an alternative and a killer. ARM servers are an alternative to x86 servers, and Apple is choosing ARM as an alternative to x86.

ARM is a killer for the reasons explained before. And as already mentioned, even Intel knows ARM will replace x86.



You are making the same mistakes as when you quoted the Cloudflare blog, so I can make the same remarks.

First of all, ARM doesn't win only in 7-Zip. It wins on SPECint, STREAM, and Whetstone. For example, on SPECint:


View attachment 145654

Second, STH tested a non-production sample, not the final chips.

Third, STH used GCC7, as in the above case, whereas in other cases they used GCC7 only for the ARM chip and the Intel/AMD compilers for Skylake and EPYC. As is well known, the Intel and AMD compilers inflate SPEC scores by using compiler tricks. GCC doesn't cheat, so when performance is compared using GCC on all the hardware, ThunderX2 wins.

Fourth, GCC7 lacks any performance improvements for ThunderX2. GCC7 has performance improvements for both Zen and Skylake, but it lacks them for ThunderX2, because the hardware is new. This is mentioned in the article, which you obviously didn't read: "The flip side to this is that we did not have the same level of optimized binaries that some of the custom Arm compilers, e.g. from Cray, would provide in this test."

Indeed, Cray worked with Cavium to add ThunderX2-specific optimizations to its compiler: CCE gives up to 25% higher scores than GCC. Cavium has also worked with the GCC developers to optimize the compiler for ThunderX2; the new GCC version gives about 10-15% higher scores than those quoted in the STH review.

About power consumption: the updated review shows what happens when you test non-final systems. A simple firmware update reduced power consumption by ~300W compared to the original article, and final silicon will shave off an extra 30-40W, as mentioned in the updated article. Also, they aren't exactly comparing apples to apples: this is total platform power, and while they did their best to configure the systems identically, the systems aren't the same.

Finally, not only is ThunderX2 a 16nm SoC while the others are 14nm, which gives a power advantage to the x86 chips, but the x86 chips are also mature designs made by engineers with years of experience designing servers. The history of ThunderX2 is different. Broadcom engineers, who lacked any prior experience with the ARM ISA or with servers, designed the Vulcan microarchitecture: their first server and their first ARM design. Broadcom eventually abandoned servers and the Vulcan project and sold the IP, which Cavium acquired, renamed ThunderX2, and finished. Nor are Cavium's engineers experts in ARM or servers; ThunderX2 is their second design, and ThunderX was a completely different beast, with in-order cores without SMT, ...

Conclusion: despite a node disadvantage and a lack of experience, Cavium's first serious ARM server matches or even beats the best x86 chips that Intel and AMD can offer.




I'm not making any mistake, just trying to show ARM in a fair and balanced light with relatable benchmarks (which is hard, since barely any exist), benchmarks that even show ARM beating x86, and yet that's not good enough.

Because of your zealotry you keep referring to your precious SPEC marks, which mean jack.

Even in light of the wins, which show ARM does have some advantages, it's still not good enough. You state you can have it both ways, and keep pushing the x86-killer mantra.

Clearly it's not, and in light of the articles and information available, ARM may not have the chops to actually compete with x86: it's already hot, it requires similar power to perform almost as fast as x86, and availability is scarce.

You don't want ARM compared to desktop processors, yet those are the very comparisons you are making, and it falls flat on its face

every time it's investigated. ARM is not there.

The only ARM processor that competes is that Cavium, and it only competes; there isn't any ARM processor that thoroughly beats x86. Even if Neoverse drops tomorrow and crushes every single x86 chip out there, there is not one iota of software worth a damn, that anyone cares about, that can run on it.

From the looks of it, there is actually more enthusiasm for POWER9 than for ARM in desktop applications.

ARM is just what it is: a lower power envelope to get certain tasks done. When it comes to actually doing any load, its performance and power envelope make it questionable whether it's worth all the trouble of implementing in the first place.
 
Scaling is relative, especially if you have different rules. Most network scaling is based on the number of hops or paths between nodes. Every hop you pay a penalty. That penalty is O(N) and is what I was referring to as scaling.

I guess you were saying: number of complexes in a package × number of interconnects = O(N) growth? I guess this is a metric, though a completely meaningless one.

The AMD CCX, no matter how many cores, is shared through its L3. CCX-to-CCX traffic is shared via IF (Infinity Fabric).

The limiting factor on a CCX is not the interconnects but rather the pressure on the L3. AMD decided 4 was the right balance; there is nothing preventing them from putting 8 cores on an L3.

If you apply the same rules, the ARM ecosystem scales via CML (Coherent Multichip Link) to 2 sockets. Anything over that is an extra hop and thus another connection. Same silicon doesn't get you different consideration (cough cough, Kentsfield).

So your obscure use of big-O notation to declare unique scaling is completely false.

This is why I kept asking: what makes one interconnect topology greater than another, and what does ARM's mesh do better?

Hopefully this post will clear up any miscommunication.

PS: a network of nodes with one-hop point-to-point communication scales as the combination C(n,2):

2 = 1
3 = 3
4 = 6
5 = 10
6 = 15
7 = 21
8 = 28
16 = 120

Not N²
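As a sanity check, the table above is just the binomial coefficient C(n,2), the number of links in a full mesh where every pair of nodes gets a dedicated link. A few lines of Python reproduce it (the name `mesh_links` is invented here for illustration):

```python
from math import comb

# In a full mesh, every pair of nodes gets a dedicated link,
# so the link count is C(n, 2) = n * (n - 1) / 2 -- not n².
def mesh_links(n: int) -> int:
    return comb(n, 2)

for n in (2, 3, 4, 5, 6, 7, 8, 16):
    print(n, mesh_links(n))  # reproduces the table: 2→1, 3→3, ... 16→120
```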

I already explained those matters to you. I also offered an example based on the interconnects that AMD uses on Zen. In my example I showed you how doubling the number of cores on a CCX quadruples the power of the interconnect. The interconnects that AMD uses scale as O(N²), so doubling the number of cores, N --> 2N, implies the interconnect grows as N² --> (2N)² = 4N². So when duplicating the number of cores, the power consumed by the cores grows as 2x, whereas the power consumed by the interconnect grows as 4x, because their interconnect scales nonlinearly. AMD could do an 8-core CCX; it is simply that both performance and power would suffer due to the bad nonlinear scaling of their design. The AMD interconnects don't scale up well.

On the other hand, ARM engineers have designed an interconnect that scales as O(N), so they can scale the N1 platform up beyond 128 cores. Indeed, if you take the ARM design and duplicate the number of cores, N --> 2N, the interconnect power grows as N --> 2N. The power grows as 2x, because it scales linearly.

There is no miscommunication, and I am using standard terminology from the field of computer architecture.
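The 2x-versus-4x argument can be restated as a toy model. This is only a sketch of the claim as stated in the thread: the unit costs (`core_cost`, `link_cost`) are arbitrary invented constants, not measured silicon data:

```python
# Toy model of the scaling argument: cores scale as O(N), while the
# interconnect scales either as O(N) or O(N²). Illustrative only --
# the unit costs are arbitrary, not measured data.
def total_power(n_cores, core_cost=1.0, link_cost=0.1, linear_interconnect=True):
    core_power = core_cost * n_cores  # cores always scale as O(N)
    if linear_interconnect:
        # O(N) interconnect -- the claim made here for ARM's mesh
        interconnect_power = link_cost * n_cores
    else:
        # O(N²) interconnect -- the claim made here for AMD's design
        interconnect_power = link_cost * n_cores ** 2
    return core_power + interconnect_power

# Doubling cores: the O(N) interconnect term doubles, while the O(N²)
# term quadruples -- the 2x-versus-4x difference described above.
```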

When clearly it's not. In light of the articles and information available, ARM may not have the chops to actually compete with x86: it's already hot, it requires similar power to perform almost as fast as x86, and its availability is scarce.

You don't want ARM compared to desktop processors, yet those are the very same comparisons you are making, and they fall flat on their face every time they are investigated. ARM is not there.

The only ARM processor that competes is that Cavium chip, and it only competes; there is not a single ARM processor that thoroughly beats x86, and even if Neoverse drops tomorrow and crushes every single x86 chip out there, there is not one iota of software worth a damn that anyone cares about that can run on it.

From the looks of it, there is actually more enthusiasm for POWER9 than for ARM in desktop applications.

ARM is just what it is: a lower power envelope to get certain tasks done. But when it comes to actually doing any load, its performance and power envelope make it questionable whether it's actually worth all the trouble of implementing it in the first place.

ARM doesn't require similar power to perform as fast as x86. We have plenty of cases showing otherwise, from mobile to servers/HPC. Apple's custom cores provide similar performance to the best x86 designs, such as Skylake, but at a fraction of the power. And soon even a standard core will do; for instance, the new N1 cores that ARM presented recently.

Apple is already testing the chip on a Mac, and the day that they release the chip, the software will run on it either natively or under emulation. No problem with software.

POWER9 is not a competitor. It is a slow, power-hungry, and costly design. IBM will soon abandon hardware (just as they abandoned the foundry business).

Funny that you continue with the misleading claim that ARM is only "a lower power envelope". Not only are you now ignoring the very links you brought us, but you are also ignoring that ARM is being used in supercomputers. I find it funny that you believe that doing very complex scientific computations, which require tons of memory and 5000 times more FLOPS than your desktop build can deliver, is not "actually doing any load".

I will be clear here: the ARM-based Apple Macs will run circles around any Zen 2/Zen 3, Ice Lake, or whatever comes next.
 
I already explained those matters to you. I also offered an example based on the interconnects that AMD uses on Zen. In my example I showed you how doubling the number of cores on a CCX quadruples the power of the interconnect. The interconnects that AMD uses scale as O(N²), so doubling the number of cores, N --> 2N, implies the interconnect grows as N² --> (2N)² = 4N². So when duplicating the number of cores, the power consumed by the cores grows as 2x, whereas the power consumed by the interconnect grows as 4x, because their interconnect scales nonlinearly. AMD could do an 8-core CCX; it is simply that both performance and power would suffer due to the bad nonlinear scaling of their design. The AMD interconnects don't scale up well.

On the other hand, ARM engineers have designed an interconnect that scales as O(N), so they can scale the N1 platform up beyond 128 cores. Indeed, if you take the ARM design and duplicate the number of cores, N --> 2N, the interconnect power grows as N --> 2N. The power grows as 2x, because it scales linearly.

There is no miscommunication, and I am using standard terminology from the field of computer architecture.



ARM doesn't require similar power to perform as fast as x86. We have plenty of cases showing otherwise, from mobile to servers/HPC. Apple's custom cores provide similar performance to the best x86 designs, such as Skylake, but at a fraction of the power. And soon even a standard core will do; for instance, the new N1 cores that ARM presented recently.

Apple is already testing the chip on a Mac, and the day that they release the chip, the software will run on it either natively or under emulation. No problem with software.

POWER9 is not a competitor. It is a slow, power-hungry, and costly design. IBM will soon abandon hardware (just as they abandoned the foundry business).

Funny that you continue with the misleading claim that ARM is only "a lower power envelope". Not only are you now ignoring the very links you brought us, but you are also ignoring that ARM is being used in supercomputers. I find it funny that you believe that doing very complex scientific computations, which require tons of memory and 5000 times more FLOPS than your desktop build can deliver, is not "actually doing any load".

I will be clear here: the ARM-based Apple Macs will run circles around any Zen 2/Zen 3, Ice Lake, or whatever comes next.


Uh, no we don't. We have one chip that performs almost equal to x86... just one... the one Apple has, and it has yet to be demonstrated in any meaningful way.

The rest of the ARM chips fail to muster much of a showing in any of the prior benchmarks. Regardless, the whole issue here is the claim that "ARM is an x86 killer".

The scientific stuff has no bearing on the market, as it's all abstracted, specialized, and has jack all to do with x86 computing.

The whole "ARM is an x86 killer" claim is an outright lie.

It is an alternative, that is all, not the x86 killer that some want so desperately for whatever sick reason.

And since we're all conjecturing here... I think it's telling that Apple is moving into finance. Maybe, just maybe, their whole ARM gamble did not work out, but we will see when those chips hit the streets.
 
Uh, no we don't. We have one chip that performs almost equal to x86... just one... the one Apple has, and it has yet to be demonstrated in any meaningful way.

We have several of those chips, and several of them have been discussed here: Apple A12, Cavium TX2, ...

The whole "ARM is an x86 killer" claim is an outright lie.

ARM is a killer because we get the same performance for a fraction of the cost or for a fraction of the power consumption. Even Intel knows that ARM will replace x86.
 
ARM is a killer because we get the same performance for a fraction of the cost or for a fraction of the power consumption. Even Intel knows that ARM will replace x86.

Sorry, still don't see this happening.

ARM has no platform. Without a platform, no one is going to seriously touch it in any volume outside of basic consumption devices such as phones and tablets.

ARM has no software. Compared to x86, ARM has nothing which runs on it. We already have decades' worth of software which runs on x86. ARM has absolutely nothing like this, and if you think recompiling x86 software for ARM is a trivial undertaking, I don't know what to tell you. It's not simple, it's not cheap, and it's not fast.

ARM isn't a general-purpose CPU like x86. By the time you design an ARM chip well enough to be a real replacement for x86, it's going to be just as big, just as power hungry, and just as expensive, if not more so. The only widespread ARM usage is in basic and relatively simple consumption devices running relatively simple software made specifically to be simple.

I might be able to take you seriously if you can show me an ARM-based desktop running a relatively modern game with good performance while watching Netflix on a second monitor, with a couple of other tasks running in the background. That's not even a really heavy load, but I'm willing to bet there's no way you can do it, despite the fact that the ancient system in my sig doesn't even bat an eye at doing that.

If you want to proclaim ARM as the end-all, be-all of computing and the dethroner of x86, you're going to have to show me ARM doing everything x86 can while performing better and with less overhead and fewer resources. It's also going to have to be significantly better at all those things, otherwise the current crop of hardware, platforms, and software is going to die laughing at your statements. x86 is entrenched in the hardware and software world and has been for a very long time. To have any chance of supplanting x86, ARM must do all these things in order to get people to even consider dumping everything they currently have working. Simply put, I don't see it happening no matter how good ARM is.
 
We have several of those chips, and several of them have been discussed here: Apple A12, Cavium TX2, ...

ARM is a killer because we get the same performance for a fraction of the cost or for a fraction of the power consumption. Even Intel knows that ARM will replace x86.

You keep repeating this, but it has been demonstrated to be wrong.
 
Sorry, still don't see this happening.

ARM has no platform. Without a platform, no one is going to seriously touch it in any volume outside of basic consumption devices such as phones and tablets.

A server platform was defined some time ago (the SBSA specification), and ARM Holdings just presented a developer platform (the N1 SDP).

ARM is already used in servers and HPC, although the volume is small because this journey has just begun (as of June of last year, Cavium didn't yet have final silicon for TX2). As for high volume: according to IDC market-share reports, ARM is the #1 platform behind the internet, and IDC gives non-client numbers, so they aren't counting phones and tablets.

ARM has no software. Compared to x86, ARM has nothing which runs on it. We already have decades' worth of software which runs on x86. ARM has absolutely nothing like this, and if you think recompiling x86 software for ARM is a trivial undertaking, I don't know what to tell you. It's not simple, it's not cheap, and it's not fast.

Most server and HPC software has been ported to and optimized for ARM. The ecosystem is in a state of high maturity today:

STH-Arm-2016-to-2018-Ecosystem-Maturity-at-ThunderX2-Launch.jpg


Moreover, client Linux distros such as Debian have many thousands of packages ported to ARM. Microsoft has started porting its software to ARM (Windows 10 already runs on ARM), and Apple is doing the same.

ARM isn't a general-purpose CPU like x86. By the time you design an ARM chip well enough to be a real replacement for x86, it's going to be just as big, just as power hungry, and just as expensive, if not more so. The only widespread ARM usage is in basic and relatively simple consumption devices running relatively simple software made specifically to be simple.

ARM CPUs are more general-purpose than x86 because they can be used in applications where x86 isn't viable. Current ARM CPUs are as fast as the best x86 CPUs but consume only a fraction of the power; e.g. Apple's mobile CPUs are at desktop Skylake level, yet are smaller and cheaper.

ARM is currently used on very complex workloads that require tons of memory and 5000 times more FLOPS than what you have in your desktop.

I might be able to take you seriously if you can show me an ARM-based desktop running a relatively modern game with good performance while watching Netflix on a second monitor, with a couple of other tasks running in the background. That's not even a really heavy load, but I'm willing to bet there's no way you can do it, despite the fact that the ancient system in my sig doesn't even bat an eye at doing that.

There are some very powerful ARM desktop machines, but they aren't used to play games or watch Netflix; they are used for serious work:

https://www.servethehome.com/gigabyte-thunderxstation-using-cavium-thunderx2-launched/

x86 is entrenched in the hardware and software world and has been for a very long time. To have any chance of supplanting x86, ARM must do all these things in order to get people to even consider dumping everything they currently have working. Simply put, I don't see it happening no matter how good ARM is.

The history of computers didn't start with x86:

TOP500-Special-purpose-HPC-replaced-by-RISC-microprocessors-in-turn-displaced-by-x86.png


Software was ported from the original vector machines to MIPS, SPARC, Alpha, and POWER. Software was later ported from those machines to x86. Software is being ported from x86 to ARM now.

Apple already migrated flawlessly from PowerPC to x86. They are now migrating from x86 to ARM. And as mentioned above, Linux distros such as Debian have already ported most of their packages to ARM in several flavors, including both the older 32-bit and the newer 64-bit ISAs. The Debian stable branch has 93% of the archive ported to ARM64; the unstable branch has 98% (i.e. over 12,200 packages work on 64-bit ARM hardware). No real problem here; only more time is needed to produce more software that runs natively on ARM hardware.

You keep repeating this, but it has been demonstrated to be wrong.

Everything I said is either well known or has been proven here. Denialism isn't an option.
 
The transition to non-x86 architectures already began a long time ago in the developer realm. The more languages that are designed around a p-code-style runtime, like the Java VM, C#, Kotlin, etc., the less need there is to design for specific hardware. It comes down to this: we just need the JVM (or whatever the underlying runtime is) to be ported, and the apps follow. It is the very reason that, when designing a compiler, you can separate the front end and the back end, among other areas. It keeps the design closed for modification but open for extension.
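The p-code idea above can be sketched as a tiny stack-based VM: programs compile once to portable bytecode, and only the small VM needs porting to each CPU. The opcode set and names here are invented for illustration, not any real bytecode format:

```python
# Minimal sketch of the p-code / JVM idea: portable bytecode plus a
# small host-specific interpreter. Opcodes here are invented.
def run_vm(bytecode):
    stack = []
    for op, *args in bytecode:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

# The same bytecode runs unchanged on x86, ARM, or anything else
# that hosts the VM: this program computes (2 + 3) * 4.
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
print(run_vm(program))  # 20
```

Porting the application then reduces to porting (or reusing) `run_vm` on the new architecture; the bytecode itself never changes.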
 
All of what's been said in ARM's favor is misconstrued; it's all abstracted and glossed over.

ARM is only a powerhouse in custom-designed applications, and those applications are usually scientific and have no real-world bearing on actual daily workloads.

I have yet to find anything demonstrating ARM replacing a serious desktop PC (much less a server) as far as getting more workload done.

Even the Apple A12 marks are so abstracted that you cannot draw any real conclusions from them.

The benchmarks that can be related show an entirely different picture: ARM either lagging behind, or using the same power envelope with well over twice the threads to best x86 in one benchmark, while the very same processor just barely competes in the other marks.

And the more I dig into this, the more I find that ARM cannot reliably run long term like x86, juggling multiple loads and multiple tasks simultaneously. It's a one-trick pony.

And that's all ARM is.

A specialized processor that does specialized things; when it comes to actually juggling an actual PC's workload, it fails miserably, because it can't do it.

It can run a browser really well, or it can run a certain benchmark really well, but it cannot do both at the same time, and even more so it can't do it routinely for prolonged periods.

That is the ARM processor we have today.

Singular-task computing done really well: that's ARM, and that's not an x86 killer.
 