>unknown IPC uplift over Zen1
>unknown transistor count increase
>unknown clockspeed increase
>will be 7nm HP from TSMC
>process touted as offering up to 45% higher perf compared to 14/16nm class nodes, more than the low power 7nm process
>EPYC2-based Rome will be in the hands of some big OEMs by the end of December, if they don't have test samples already
For an 8c/16t 95w part I'd conservatively bet on a 4ghz base clock at launch, with turbo around 4.5ghz. I expect OC headroom to be a lot higher than on 14nm LPP or the 12nm parts. And if Rome has a 64c/128t part as a halo SKU then I'd expect 8 core CCXs.
If they hit a 10-15% IPC increase intel will be hurting in pretty much every market. Higher IPC and a big bump to clocks will put them right at the top of gaymen benches in addition to dominating enterprise workloads.
Daniel Gonzalez
how about AVX 256/512bit ?
Logan Sullivan
On one hand AVX512 is pretty low on the list of things that matter to consumers, on the other hand having an FPU actually capable of processing these large vectors would also have massively higher throughput for smaller ops as well. I don't know if AMD has any plans to increase FPU width, but I think they'll eventually be forced to.
The trend we saw at HotChips this year was tons of new core arch all centered around huge vectorized workloads.
Luke Thomas
It's a very conscious, deliberate, and sensible choice for Zen to have been designed with a 128-wide AVX path.
Intel's wide AVX2/AVX512 paths use so much power they have to be powered down when not used, and the whole chip has to be clocked down when they are used. The unspoken result is that if you are doing anything at the same time that does not require AVX2 or AVX512, you get worse performance than if you weren't using it. The net result: whole-program optimised code actually (properly) _avoids_ using AVX2/AVX512 in some cases.
The other unspoken result is that since unchoking and powering up the AVX arrays is done speculatively (because otherwise it would cause a huge latency hiccup), this causes an exploitable Spectre side-channel leak. Doing it non-speculatively would... cause an exploitable remote side-channel leak.
It remains to be seen if that kind of power gate delay can be exploited with any other coprocessors/subprocessors in modern designs, but almost none of those are as big or as unwieldy as Intel's gigantic AVX ALUs.
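The clock-penalty trade-off above is easy to sketch with a toy throughput model. All numbers here are made up for illustration — they aren't measured values for any real SKU:

```python
# Back-of-envelope model: does AVX-512 pay off once license-based
# downclocking kicks in? Frequencies below are illustrative only.

def effective_throughput(elements_per_cycle: float, clock_ghz: float) -> float:
    """Elements processed per nanosecond at a given clock."""
    return elements_per_cycle * clock_ghz

# Hypothetical chip: 3.8 GHz when running AVX2 (8 doubles/cycle),
# forced down to 3.0 GHz while AVX-512 (16 doubles/cycle) is active.
avx2   = effective_throughput(8,  3.8)   # ~30.4 elements/ns
avx512 = effective_throughput(16, 3.0)   # 48.0 elements/ns

# AVX-512 still wins on pure vector code...
assert avx512 > avx2

# ...but scalar code sharing the core also runs at the reduced clock,
# so mixed workloads can come out behind overall:
scalar_full = effective_throughput(1, 3.8)
scalar_down = effective_throughput(1, 3.0)
print(f"scalar code loses {1 - scalar_down / scalar_full:.0%} while AVX-512 is active")
```

The point being: pure vector code can still come out ahead, but anything scalar sharing the core eats the downclock for free, which is exactly why whole-program tuning sometimes avoids the wide paths.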
Kayden Miller
Intel's FPUs definitely do run hot when crunching 512bit AVX, the power spike is pretty significant as well. I think they were banking on it being more viable at smaller process nodes. Obviously that didn't work out for intel and their 14+++++++ refreshes.
Gavin Powell
bump for the drama
Luke Robinson
Nope, making an 8-core CCX would require a lot more crossbar wiring, and the die would also be way bigger
Isaac Fisher
>die will be bigger
>7nm vs 14nm
>what is scaling factor
Either CCXs grow in core count, or they add more CCXs per die. Either way complexity increases; there's no way around that. Rome isn't going to have 8 or 9 dies on package.
Lincoln Reyes
chiplets are the future.
Evan Cox
>7nm vs 14nm
It's TSMC's 7nm vs GloFo's 14nm. They will most likely go with more 4-core CCXes per die, no need for a CCX redesign
Sure, but there's still a limit. Just like with triple and quad patterning in lithography, you exponentially increase complexity with every pass. With every chip added to a package in an MCM you dramatically decrease yields.
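The yield argument can be sketched with the usual simplification that every die on the package must be good and per-die yields are independent. Real products can salvage partial dies as lower SKUs, so treat this as an upper bound on the damage:

```python
# Sketch: package-level yield when every die on an MCM must be good.
# Assumes independent per-die yield -- a simplification, since real
# packages can sometimes ship partially-working dies as cut-down SKUs.

def package_yield(per_die_yield: float, dies_per_package: int) -> float:
    return per_die_yield ** dies_per_package

for dies in (1, 2, 4, 8):
    print(f"{dies} dies: {package_yield(0.90, dies):.1%}")
# With 90% per-die yield, a 4-die package lands around 65.6%
# and an 8-die package drops to roughly 43%.
```

That exponential falloff is why small dies on an immature node plus MCM packaging is attractive in the first place: the per-die yield term stays high even when the process is rough.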
Asher Murphy
Why do you think they call it Infinity Fabric? If they can reach 64-core CPUs on 7nm, then on 5-3nm they could probably go 128 cores
Oliver Allen
Protip: you have no idea what any of the words you're using actually mean. Having a data fabric capable of scaling to N dies doesn't mean that it's possible to manufacture it. Getting a package with 8 dies, or 8 core dies plus 1 die with the external memory controllers, would have orders of magnitude lower yields than a package with 4 dies. Infinity Fabric has literally nothing to do with this. Nothing.
Joseph Brooks
bump for perfect cpu to encode my vietnamese cartoons with
Brandon Cruz
i will convert everything to av1
Landon Williams
>Having a data fabric capable of scaling to N dies doesn't mean that its possible to manufacture it
Says you
that's the idea. i read in the av1 thread that it only really supports mp4 containers, though you can use mkv with av1 based on wikipedia. dunno how that will affect compatibility, though.
Christopher Wright
Nah intel shills will just overclock their housefires to disastrous 5.8GHz and burn their neighborhood to the ground just to prove they're still faster
Xavier Wright
Well they will be releasing yet another generation of 14nm parts, so trying to squeeze out a few more mhz is pretty much all they can do.
Jacob Martinez
I'm hyped for Zen2. I've been an Intlet for my whole life but now I'm seriously considering getting AMD
Chase Foster
I really don't think Intel will ever recover to be honest. Just the thought of Rome having 64 cores is enough to make me think they will never beat AMD, the gap is just too big.
so you're buying a server farm? nothing can encode av1 now at acceptable speeds, aside from that.
Robert Turner
because hedt chips are compared with hedt chips. xeons are way out of the price range of those.
Gavin Myers
Do you see any Epyc there?
Lucas Nelson
Kool aid drinking fag.
Thomas Cooper
>just wait
Tyler Barnes
Reminder that intel has nothing to challenge Zen until at least 2021~
Reminder that intel continued the lifespan of the 14nm process in their Xeon lines, which means 10nm is still broke city, incapable of producing big parts.
Reminder that Zen3 with 8 channels of DDR5 will have 307.2GB/s of memory bandwidth with 4800MT/s DIMMs (384GB/s at 6000MT/s) and support multiple TB per socket.
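Peak DDR bandwidth is just channels × bus width × transfer rate, so the figures are easy to sanity-check (assuming the standard 64-bit DDR channel):

```python
# Peak DDR bandwidth = channels x bus width (bytes) x transfers/sec.
# Assumes the standard 64-bit (8-byte) DDR channel.

def peak_bandwidth_gbs(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    return channels * bus_bytes * mt_per_s / 1000  # GB/s

print(peak_bandwidth_gbs(8, 4800))  # 307.2 -> 8 channels of DDR5-4800
print(peak_bandwidth_gbs(8, 6000))  # 384.0 -> 8 channels of DDR5-6000
```

So the oft-quoted 307.2GB/s figure corresponds to 4800MT/s DIMMs; 6000MT/s DIMMs would push an 8-channel socket to 384GB/s.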
>unknown IPC uplift over Zen1
10% worst case, 30% best case, prob somewhere in between
>unknown transistor count increase
enough. if it's 16c dies it's going to be at least 2x higher, maybe an incremental increase if it's small 8c dies
>unknown clockspeed increase
if it's a 16c die then prob small, if it's an 8c die 5ghz will be a realistic boost/all-core overclock speed
housefire meme shit
Dominic Edwards
There's a nice paper for you on this topic; it's not as bad as you think.
Since the interposer doesn't need to be on a current low-yield tech, and can often just be a simple older high-yield node in the 96% yield range with very little active componentry in it, the overall yield impact would be small.
Owen Adams
4/4.5Ghz would be such an abysmal improvement for this big node jump.
Daniel Evans
This, the jump from 14nm to 12nm was roughly +300mhz, 7nm should give at least a +400mhz clock bump over 12nm
David Wright
>TSMC’s 5nm Fin Field-Effect Transistor (FinFET) process technology is optimized for both mobile and high performance computing applications. It is scheduled to start risk production in the second half of 2019.
So 7nm Zen in 2019 and 5nm in 2020?
Samuel Bennett
Since it's going through the HP process, there's likely going to be a frequency increase.
Brandon Price
Zen 3 when? Is it 2020? 5nm basically 7nm+ right?
Angel Campbell
For TSMC their 5nm node is still FinFET as far as I know. Only IBM, GloFo, and Samsung were zealously pursuing GAA as early as possible, and GloFo dropped out of the race. IBM sold their foundry business to GloFo, so the only one left was Samsung. I think Samsung stated they're not transitioning to GAA until 3nm, though their 3D stacked NAND design actually is a GAA structure, so they have a lot of experience building them on larger nodes. As of yet I haven't heard much about TSMC's 5nm plans, aside from the fact that they have them. It's unclear if they'll have a whole new BEOL, if it'll be a half shrink with some performance boosters, or what.
The 5nm GAA process that IBM/Samsung/GloFo worked on had 40% higher FMAX than their 7nm FinFET though. Shame it'll never materialize with Global Foundries.
Jack Nelson
Xnm is just a marketing meme; it doesn't measure anything inside the chip. And there's no reason to believe frequency scales linearly with feature size, because even though the electrons travel shorter distances, you can't use voltages as high as with a bigger transistor.
Adam Martinez
Smaller gates don't need high voltage to hit high frequency, so half of your comment makes zero sense. Drive voltage is always lower with smaller devices. Gate delay itself doesn't necessarily limit clocks in any way.
Matthew Foster
>Gate delay itself doesn't necessarily limit clocks in any way. Then what limits the frequency at a certain voltage?
Josiah Watson
Nuances of the process itself. The transistors themselves, their gate width/length and physical surface area, their effective electrical control over the channel, the metals used, their insulation, etc. There are dozens of variables that contribute to what clocks a process can achieve.
Asher Reyes
But how can decreasing the size of the transistor while keeping everything else constant increase the maximum frequency if not by shortening the time the electrons take from one point to another? And my first point wasn't just that you had to lower the voltage. It was that maybe as you shrink the chip there are diminishing returns in the frequency increases, for example if maximum voltage scales non-linearly with size while frequency at X voltage scales linearly or even non-linearly with opposite convexity.
Jayden Flores
Transistors are analog devices, smaller ones are more sensitive to changes in voltage, so they switch reliably at lower voltages. They also proportionately have more of an issue with the off state not actually being off for what its worth. Shorter channel devices can actually have longer gate delay depending on the type of structure. It isn't an artifact of feature size alone.
Jayden Edwards
>I really don't think Intel will ever recover to be honest. Just the thought of Rome having 64 cores is enough to make me think they will never beat AMD
>a billion dollar company will never recover
>when a smaller company like AMD recovered after getting destroyed by Core2 and Core i7 for decades
Literally delusional AMDrone
Even with the node advantage AMD hasn't taken significant market share, while Intel can't produce enough 14nm+++++++++++++++ parts because they are in such high demand.
You never hear about Ryzen, Threadripper or Epyc shortages because hardly anyone buys them; companies would rather buy discounted Xeons than Epyc.
Tyler Morgan
AMD does AVX256 as 2 x AVX128, I believe. Zen2 will probably enable AVX512 by doing 4 x AVX128. Since AVX512 only recently came out for Intel, AMD can be forgiven for not implementing it yet. But there is a good chance for it to be on Zen2.
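Zen 1's "double pumping" of 256-bit AVX through 128-bit FP units can be sketched in plain Python. This is purely conceptual — no actual SIMD here, just the split-into-halves structure:

```python
# Conceptual sketch of "double pumping": one 256-bit vector add executed
# as two 128-bit halves, the way a 128-bit-wide FPU handles AVX2.
# Pure-Python illustration, not real SIMD.

def add128(a, b):
    """One 128-bit op: four packed 32-bit lanes added element-wise."""
    assert len(a) == len(b) == 4
    return [x + y for x, y in zip(a, b)]

def add256_double_pumped(a, b):
    """A 256-bit add issued as two 128-bit micro-ops (two passes of the FPU)."""
    lo = add128(a[:4], b[:4])   # lower 128 bits
    hi = add128(a[4:], b[4:])   # upper 128 bits
    return lo + hi

a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
print(add256_double_pumped(a, b))  # [11, 22, 33, 44, 55, 66, 77, 88]
```

Same result as a native 256-bit unit, at half the per-cycle throughput — which is the trade Zen 1 made, and why a wider FPU (or 4x pumping for 512-bit) is the open question for Zen 2.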
Gabriel Hernandez
UMA
Isaac Baker
There is no way the turbo is only 4.5GHz. It's 4.35GHz on the 2700X and 4.45GHz on the 2950X.
It'll almost surely be 5GHz. It's confirmed to be on the HP process.
Doesn't even need higher IPC. 2700X IPC is less than 3.5% behind the 8700K. Just needs higher clocks and lower memory latency.
Charles White
>Implying that security issues on Intel CPUs haven't done massive damage to Intel's prestige and reputation on the Enterprise/SMB markets.
>Intel Marketing Shill in complete damage control
Major OEMs are already bitching about 14nm supply issues a.k.a Xeon SKUs.
There simply aren't enough units on the market, and that's why Xeon SKUs carry such insanely high price points. (Protip: Intel's shareholders aren't going to allow massive discounts either, because they're hooked on the massive profit margins from nearly a decade of near-dominance.)
F500 companies in an upgrade cycle aren't going to wait 6+ months on back orders. They are going to be getting Epyc servers and will not look back at Intel solutions.
> AMD is going to snag far more market share in the enterprise/SMB world than they ever did with Opterons within the next five years, and Intel is completely powerless to stop it.
Matthew Bailey
Hmm ok I'll trust your word
Xavier Gray
I was just throwing out conservative estimates. 4ghz for base clock would only be 600mhz away from the 1800X at launch, and I have a ton of data on power per core and voltage scaling for launch Summit Ridge. They should be able to hit 3.5ghz under 1v.
Jaxon Baker
>Zen3 will most likely be on a new socket with DDR5 since the spec is final and early samples are already in the wild
>Low power DIMMs hit 5500mhz
>standard power DIMMs hit 6000mhz
>a dual channel CPU with early sample DDR5 could have twice the bandwidth compared to the average DDR4 kit today
>wide IO memory will probably be standard as a L4 by then if EDRAM isn't ubiquitous
I for one look forward to the future of AMD computing
Gabriel Ortiz
when is zen2 likely to drop? given nvidia shat the bed on pricing I have some money for computer upgrades in the next 18-24 months.
Gavin Carter
Word was that Epyc 2 enterprise chips would sample end of this year. They'd probably be available in full volume Q1 or early Q2 2019. I'd expect desktop Zen 2 chips in the same time frame.
Cameron Rogers
As of Ryzen 2k OCing is basically non-existent. You can expect any future desktop processors to be pushing the limits of the process and then some if the motherboard permits it.
Leo Hill
14nm LPP and 12nm had little OC headroom because of the process they were based on. GloFo had to make a special vt for AMD just to get clocks above 3ghz because of how the process scaled with voltage. The Zen arch wasn't limiting clocks, the low power ARM SoC oriented process was the limiting factor. TSMC's 7nm HP node doesn't have the same issue, or its significantly lessened.
Aaron Bailey
>Major OEMs are already bitching about 14nm supply issues a.k.a Xeon SKUs.
Let's be real. That's because Intel is in very high demand and companies everywhere are expanding their servers. It's the same reason why RAM is so expensive. It's a demand issue, as evidenced by the fact that component demand is fairly inelastic right now, e.g. corporations still buying up RAM like candy at inflated prices, or paying Intel's increasingly inflated prices. AMD has been benefiting from the rise in demand too, but that's because Intel parts are becoming too hard for system OEMs like HP and Dell to get. If the trend continues AMD will suffer similar "problems", but you won't be saying OEMs are bitching about supply issues then, because you're a fanboy.
Connor Foster
>14nm LPP and 12nm had little OC headroom because of the process they were based on
This is untrue. They had little OC headroom because the Zen designers spent a ton of time tuning the boost and frequency scaling behavior of Ryzen, as evidenced by the fact that they let the CPU core voltage go as high as 1.5V in short bursts to hit the 4.35GHz boost. What we can expect is a safe base clock in the mid-to-high 3.x GHz range, a moderate all-core boost in the low-to-mid 4.x GHz range, and a very high single core XFR boost in short bursts at or above 5GHz.
I also wouldn't put much stock in the supposed 40-50% performance increase. What other companies found going from TSMC's 16/14FF to 7nm low power was an almost negligible increase in frequency; even with HP I doubt they'll get a lot more, and those measurements are usually done on relatively small cores like an old ARM reference SoC. Around 5GHz is a safe bet. The difference is that GF 7LP was based on IBM tech designed for their power-guzzling mainframes; TSMC, despite their high performance claims, doesn't have that sort of pedigree.
Liam Johnson
No, Zen only had limited XFR headroom because of the process the chips were made on. XFR is effectively auto-overclocking; it has sensing paths for voltage, so it can tell whether the clock it's trying to achieve will be stable before the settings are finalized. Limited clocks are an artifact of process, not core arch.
Flat out you're talking out of your ass.
Thomas Anderson
My 1600X does 3.7ghz on 1.0875v. These should do 4ghz on under 1v. Wtf. It's twice the density.
Cooper Ramirez
>unknown IPC uplift over Zen1
ipc stays the same (remember amd can do 6 instructions per clock vs intel's 4) and amd jacks up the clocks to match intel
>unknown transistor count increase
doesn't really matter, since there's no way to compare: there isn't anything else on 7nm except Zen 2
what matters is
>increased cache at all levels
>smaller IF breakthrough
>slap an ISA extension for every AVX instruction up until 512 and offer full support, not half precision
>5.0ghz should be achievable for the majority of the chips, without creating another market by artificially binning shitty chips
>we don't fucking know what kind of process amd sampled from tsmc, but history taught us that amd never goes with either LPP or HP, it's always in between
>epyc 2 is already in the hands of certain organizations, including CERN. when the time comes (i assume in 3-4 weeks) i will release some photos of it
the problem is amd at 7nm is probably already close to dissipating 150 watts/cm2. plus add to that x86 is becoming the main problem of the pc industry with its retarded limitations (up to 8 cores it can scale perfectly, but after that the ISA essentially bottlenecks the entire pipeline). there are also plans to jump to carbon nanotubes, since you can literally just copy and paste the entire cpu layout into them and create an identical chip, but currently it costs 75% more to produce. then you have the biggest problem: all the uarch we see today, cpu, gpu, arm, bla bla, is literally based on the von neumann uarch, a fucking 80 year old uarch that is still the base for everything that has logic in it.
Julian Cooper
>plus add to that x86 is becoming the main problem of the pc industry with its retarded limitations (up to 8 cores it can scale perfectly but after that the ISA essentially bottlenecks the entire pipeline)
That is interesting, you got any sources for that? I really wanna read more.
Jace Carter
it's debatable to this day...the only source you can get is to find a serious developer and ask him how many man-hours a company needs to make a program that perfectly scales above 8 cores...
tl dr think of it like how async works on amd..but on sync....add to that the fact that cpu cores arent really capable of flushing flipping and switching on the fly and you get the idea...(but eventually we will end up having cores that work best on gpu's into the cpu's in the future )
Jaxon Wilson
>comparing marketing and economy predictions to physics limitations
Cool story bro
Hunter Johnson
Intel can't compete, and can't keep up with yields even as AMD starts dominating the new PC market (AMD has actually been picking up a fuckload of market share, both in PC and enterprise, not that you'll believe that). Intel is still, STILL suffering from low yields per wafer, while AMD is enjoying something stupidly high like 80-85% yields on dies they can use in the top-end 1800X, 2700X, or TR-level chips.
Caleb Howard
>its debateable till this day...the only source you can get is to find a serious developer and ask him how many manhours a company needs to make a program that perfectly scales above 8 cores... But that's not about the ISA. It doesn't matter what ISA you use in this case because what's difficult about scaling above 8 cores isn't the instruction set or the CPU. What's difficult about that problem is to sufficiently decouple your program's interdependencies to the point where it can all run in 8+ parallel threads. This is a difficult problem because so many tasks *inherently* depend on the result of what has already happened, and the difficulty is to find alternative solutions that achieve the same result or decouple the existing solution.
I really thought you knew something about the x86 architecture that I did not here, because I thought about it and couldn't really think of anything about the instruction set that was problematic in this regard, but it turns out you did not.
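A toy illustration of that decoupling point: a prefix sum is inherently serial because each step depends on the previous result, while a plain sum can be split into independent chunks and combined at the end. (Note that CPython's GIL means the threaded version below shows the structure of the decoupling, not a real speedup — assume multiprocessing or a different runtime for that.)

```python
# Toy example of the decoupling problem: a running (prefix) sum is
# inherently serial -- each step needs the previous result -- while a
# plain sum can be decoupled into independent chunks across N workers.
from concurrent.futures import ThreadPoolExecutor

def prefix_sum(xs):
    """Serial by nature: out[i] depends on out[i-1]."""
    out, acc = [], 0
    for x in xs:
        acc += x
        out.append(acc)
    return out

def parallel_sum(xs, workers=8):
    """Decoupled: chunks are independent, partial results combine at the end."""
    chunk = max(1, len(xs) // workers)
    chunks = [xs[i:i + chunk] for i in range(0, len(xs), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        partials = list(ex.map(sum, chunks))
    return sum(partials)

data = list(range(1000))
assert parallel_sum(data) == sum(data)  # same answer, now scalable
```

None of this cares what ISA the cores speak — the hard part is restructuring `prefix_sum`-shaped problems into `parallel_sum`-shaped ones, which is exactly the argument above.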
Luke Mitchell
As for what you said in your other post >then you have the biggest problem all the uarch we see today cpu..gpu...arm bla bla is literally based on neumann uarch a fucking 80 years old uarch that is still the base for everything that has a logic on it.. I have no idea what you would otherwise suggest than a Von Neumann architecture for general computation, but regardless of that, Von Neumann behaviour is just a facade on modern processors that is practical for the programmer.
x86 processors today run a frontend decoder that translates x86 instructions into internal micro-ops, an entirely different instruction set that allows Intel and AMD to work with them much more easily, and these internal instruction sets are most decidedly not Von Neumann.
Some of the things they do on top of micro-ops are to reorder data fetches, instructions etc. completely dependent upon what would be the most optimal, and so that way you can fill up the CPU pipeline in the most optimal way (e.g. by already having all the data ready before a computation runs). Other behavior implemented on top of these micro-ops is speculative execution, intended to always keep the pipelines full.
Both of these behaviours are absolutely not Von Neumann behaviour. A true Von Neumann architecture would run all instructions in-order and would never speculatively execute (would wait on branching).
Carter Gutierrez
These ellipses... poster's gotta be 50+
Jose Nguyen
ofc it's a problem of the ISA. x86 was made to conserve energy as much as possible; over the years, with every hack they introduced, it has become a clusterfuck
Ryder Lee
>ofc its a problem of the ISA It literally isn't. Concurrency is just as hard on any other ISA too, whether it be DEC Alpha, SPARC or ARM. The problem isn't the ISA of either of these, but rather that parallel execution is fucking hard and requires decoupling of interdependence in your problem.
>x86 was made to conserve energy as much as possible What
>over the years with every hack they introduced it has become a clusterfuck Yes, x86 is absolutely a clusterfuck, but no-one can say it isn't high performance.
plus how would you know that it is high performance? x86 was an ISA specifically made to save energy and nothing more. over the years, with the addition of countless extensions, you can say that it has increased its performance, but the energy saving features of it are essentially gone
Matthew Garcia
I'm surprised no one has mentioned the chiplet+uncore rumor yet. Do you more knowledgeable people think that AMD could have gone with a 14nm central uncore and 7nm core chiplets, as described by some sources? What do you think the advantages/disadvantages would be, if they did pull a surprise and went this route?
Isaiah Smith
Of course not, everything he said is plain bullshit.
Brayden Lewis
because decoupling dies from memory worked so well for the 2990WX under windows, right? fact is, unless windows fixes their shitty scheduler, no amount of hardware advancement will ever be enough, since for whatever reason it only seems to help intel chips
Charles Jones
That tune is quickly changing in light of the loads of hardware-level security flaws.
The whole "nobody got fired for buying Intel" meme got completely BTFO. The supply issues are just icing on the cake.
You can easily tell by how much Intel marketing is working overtime on damage control campaigns that pretty much amount to "PLZ USE US! AMD HAS BAD CHIPS AND SUPPLY ISSUES!"
Which, in light of recent developments, is deliciously ironic.
AMD's "chiplet" gambit is paying off massively in the SMB/Enterprise world. Intel was going the same route after having trouble making Broadwell-E and Broadwell-EP en masse, but AMD beat them to the punch.
Cameron Martin
insufficient registers, only partially transparent to devs
frontend decoding and sequencing is such a mess that it requires quite a lot of hardware pipestages
no defined sw/hw interactions
fp still needs to use a flat model because of the limitations of x87
IA-64 was some orders of magnitude better, but the typical intel bullshit more or less made it irrelevant and the x64 extension got adopted instead. not to mention it was a VLIW, and itanium had SIMD similar to an ati 4xxx but less advanced, so it could clock higher.
Thomas Ross
Micro ops that the arch can process per clock and observed IPC in varied workloads, measured in benchmarks used by every reviewer ever, are not the same thing.
Your entire post is full of bullshit and is a prime example of Dunning-Kruger.
Michael Gray
UMA DELID CIA
Ryan Jackson
No one was ever able to write a decent compiler for ia64 while the arch was actually relevant. On top of that, the x86 compat/emulation was a shitshow. If you had the resources to hand-code assembly, good for you, you were actually able to utilize the ia64. Otherwise, shit son, you were out of luck. The only good use for it was physics/weather simulation.
AMD projects they'll have 5% of the server marketshare by end of year. It's big for them, but overall it's next to nothing.
Kayden Gonzalez
The compiler was more than decent - it was superb, excellent whatever superlatives one can find. It's just that rewarmed VLIW is suitable for the same tasks as any VLIW - technical computing. No static compiler will change that fact.
Benjamin Wilson
Reminder that EPYC-based machines didn't actually start shipping to normal customers until 2Q this year. Unless you were a big player like Amazon/Google/MS/Facebook, you literally couldn't even backorder them. Based on typical server lifespans, that could be between 1/5 and 1/3 of total sales this year. That's pretty significant.
Cooper Ramirez
If the electricity used to power the vacuum is produced by nuclear power, does that not count?
Daniel Harris
Name one compiler available in 2003-2004 (because that's when amd64 came out and everyone and their dogs sighed with relief that they don't have to deal with itanium insanity) that could actually handle the insane rules of ia64 instruction packing with general (non-physics/rendering/other insanely parallelizable) code. Protip: you can't. The architecture was over-specialized, hence shit for general computing.