Ryzen Neural Net Prediction and Smart Prefetch with benchmarks?

AMD_Gamer

Fully [H]
Joined
Jan 20, 2002
Messages
18,287
How would the Smart Prefetch and Neural Net Prediction features of Ryzen work with benchmarks? Does the CPU need to run a particular program for a while before those features start to make a difference? Do they show up in benchmark results, and how can you actually tell whether they are making a difference?
 
Mark Papermaster said in an interview that as neural net prediction learns, performance will increase. He wasn't clear on how much performance increases or how much code it needs to 'learn'.
 
First glance, and I could be dead wrong on this. :)



NNP looks like it's centered in the branch-prediction portion of the core -- so maybe some kind of internal hash table for instructions with counters, plus addresses/data in the load/store queues for cached fetches (maybe also integrated into the TLBs and caches for higher-priority storage and/or tiered time-out/discard).

Ultimately, better branch prediction has been Intel's mantra of improvement across its CPU families over time (aside from other changes), so this can only help, since a branch-prediction miss hurts (HT can cover that up somewhat by keeping other instructions in flight through the pipeline).
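To make the branch-prediction point concrete, here's a minimal sketch of the classic 2-bit saturating-counter predictor that conventional (non-perceptron) schemes build on. The table size, hashing, and loop-branch example are illustrative assumptions, not Zen's actual design:

```python
# Toy 2-bit saturating-counter branch predictor (illustrative only).
# Each table entry is a counter 0-3: 0-1 predict not-taken, 2-3 predict taken.

class TwoBitPredictor:
    def __init__(self, table_size=1024):
        self.table = [1] * table_size  # start in "weakly not-taken"
        self.size = table_size

    def predict(self, pc):
        # Index by branch address (simple modulo hash, an assumption)
        return self.table[pc % self.size] >= 2

    def update(self, pc, taken):
        i = pc % self.size
        if taken:
            self.table[i] = min(3, self.table[i] + 1)  # saturate at 3
        else:
            self.table[i] = max(0, self.table[i] - 1)  # saturate at 0

# A loop branch (taken 9 times, then not taken at loop exit), run 3 times:
pred = TwoBitPredictor()
outcomes = ([True] * 9 + [False]) * 3
hits = 0
for taken in outcomes:
    if pred.predict(0x400) == taken:
        hits += 1
    pred.update(0x400, taken)
print(hits, "of", len(outcomes), "predicted correctly")
```

After a couple of iterations the counter saturates and only the loop-exit branch misses each pass, which is exactly the "it learns the branch quickly" behavior being discussed.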
 
So what happens if you keep running a benchmark in a loop? Could you see the results go up as the CPU learns about the benchmark?
 
So what happens if you keep running a benchmark in a loop? Could you see the results go up as the CPU learns about the benchmark?

It depends on what factors are contributing to performance. If it is about moving things from one cache to another while doing the benchmark, the CPU could just leave the data in the cache, which shows more optimized results on a repeat run.
What AMD is describing is nothing new to CPUs in general. If you want to bore yourself to tears, check the PCPer video on Zen with Kanter; he tells you a little bit about how this works.

Most people regard this as marketing buzzwords to allow people to think of this as something special that the competition does not have.
 
From the PCPer interview with David Kanter, he was fairly certain that they are using a type of branch predictor called a "Perceptron Branch Predictor". Here's a good presentation I found that details how these work:
https://www.jilp.org/cbp/Daniel-slides.PDF

So while it technically is a neural network, it is quite simple and likely does not take long to be trained for a given workload. You may get better numbers if you run a benchmark a second time right after the first, but I would imagine that during the first run the branch predictor quickly reaches maximum accuracy. The more times the CPU encounters a particular branch instruction, the better it will be able to predict the outcome. The prediction accuracy is probably roughly logarithmic, %A ~= log(N), where N is the number of times the predictor encounters a particular branch.
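For anyone curious how simple a perceptron predictor really is, here's a toy version in the style Jiménez's slides describe: one weight vector per branch, a dot product with the global history, and training on mispredictions or low-confidence outputs. The history length, table size, and threshold are made-up illustrative parameters, not AMD's:

```python
# Toy perceptron branch predictor (Jimenez-style; all sizes are assumptions).
HISTORY = 8      # global history length
THRESHOLD = 15   # train while |output| is below this, or on a misprediction
TABLE = 64       # number of perceptrons

weights = [[0] * (HISTORY + 1) for _ in range(TABLE)]  # [bias, w1..wH]
history = [1] * HISTORY  # +1 = taken, -1 = not taken

def predict(pc):
    w = weights[pc % TABLE]
    y = w[0] + sum(wi * hi for wi, hi in zip(w[1:], history))
    return y, y >= 0  # raw output and taken/not-taken guess

def update(pc, taken, y):
    t = 1 if taken else -1
    w = weights[pc % TABLE]
    if (y >= 0) != taken or abs(y) < THRESHOLD:
        w[0] += t
        for i in range(HISTORY):
            w[i + 1] += t * history[i]  # reinforce correlated history bits
    history.pop(0)
    history.append(t)

# A branch whose outcome alternates (so it correlates with the last outcome):
correct = 0
taken_prev = True
for n in range(200):
    taken = taken_prev
    y, guess = predict(0x400)
    correct += (guess == taken)
    update(0x400, taken, y)
    taken_prev = not taken
print(correct, "of 200 predicted correctly")
```

It mispredicts only a handful of times at the start before the weight on the most recent history bit dominates, which matches the intuition above that training converges within the first run rather than across runs.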
 
I'd expect the effects of this technology to operate on microsecond scales.

It might or might not persist across context switches. Certainly, though, it won't be anything affected by re-running stuff on human timeframes.
 