THANK YOU BASED INTEL

anandtech.com/show/12773/intel-shows-xeon-scalable-gold-6138p-with-integrated-fpga-shipping-to-vendors

THANK YOU BASED INTEL

Attached: Intelfpga_678x452.jpg (678x451, 44K)

>does absolutely nothing for gaming
Do you know where you are?

not on /v/

>costs an arm and leg

LOL Intel BTFO'd by ANANDTECH THEMSELVES.

It'll probably cost at least 15k USD, so yeah...

Literally less than the monthly salary of a single employee working on things that require such a CPU.

>from 48 to 32 pcie lanes
>thank you based intel
what an improvement

you know that if you actually NEED those enterprise components they literally pay for themselves in no fucking time, right?

AMinDians BTFO

>integrated fpga
why would i ever want this when you can get cyclone ii or cyclone v pcie add-on cards? intel doesn't know what the fuck they are doing...

What is this shit doing? Some additional compute? How is it better than using GPUs for compute? The only benefit of AVX is that it runs on the CPU instead of offloading a kernel to the GPU and waiting for results; if this requires the same kind of offloading, without direct access from the program's thread, it's completely pointless as I understand it.

How are you supposed to delid this when there's 50 resistors in the way?

kek

thanks you intsrael

Attached: jew.jpg (1162x850, 483K)

>they literally pay for themselves in no fucking time, right?
Intel should raise their prices then

Even after reading the article, I still don't understand why it's better to place the FPGA on the CPU die instead of just using a PCIe card for the purpose. The bandwidth is the same anyway, and the FPGA would still have access to system memory over PCIe.

Is this just making the system more monolithic just for the purpose of being more monolithic?

>Is this just making the system more monolithic just for the purpose of being more monolithic?
Pretty much. Intel has a massive throbbing boner for monstrously huge monolithic dies.

Does this mean we can have perfect hardware emulation for SNES with no hit to CPU performance?

it takes hundreds of CPU cycles just to set up a copy from host memory into GPU memory, and for non-pinned memory, writing 128MB is on the order of several milliseconds. you're absolutely right that in a perfect world with proper code offloading, especially with the emergence of OpenACC, GPUs will shit on bolting more vector extensions onto a processor in both power efficiency and computational speed, but the software just isn't there yet, and that's actually a serious problem since GPUs are very underutilized (barring mining). on the other hand, this FPGA w/ DMA is essentially a whole new world of vulnerabilities waiting to happen. imagine being attacked through both assembly-level side channels and malicious Verilog loaded into the fabric LMAO
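
if anyone wants to see the transfer overhead for themselves, here's a minimal sketch using the CUDA runtime API to time a 128MB host-to-device copy from pageable vs pinned memory (assumes nvcc and an NVIDIA GPU; numbers vary a lot by platform, use nsys/nvprof if you want precise numbers):

[code]
/* copy_timing.cu -- rough sketch: time 128 MB host->device copies,
   pageable vs pinned host memory. Build: nvcc -O2 copy_timing.cu */
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

static float time_h2d(const void *src, void *dst, size_t bytes) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    cudaMemcpy(dst, src, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main(void) {
    const size_t bytes = 128u << 20;           /* 128 MB */
    void *dev = NULL, *pageable = NULL, *pinned = NULL;
    cudaMalloc(&dev, bytes);
    pageable = malloc(bytes);                  /* normal, non-pinned memory */
    cudaMallocHost(&pinned, bytes);            /* page-locked (pinned) memory */

    printf("pageable: %.3f ms\n", time_h2d(pageable, dev, bytes));
    printf("pinned:   %.3f ms\n", time_h2d(pinned, dev, bytes));

    free(pageable);
    cudaFreeHost(pinned);
    cudaFree(dev);
    return 0;
}
[/code]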

>what is latency

there are already latency differences between L1 and L3 cache, further latency when you go to RAM or 3D XPoint, and further latency still when you get to PCIe cards and PCIe SSDs.

Having something closer to the CPU die means less latency; it's as simple as that.

If it's hanging off the PCIe bus then the latency will be similar to that of a full-height PCIe card on the motherboard, regardless of it being close to the die; the physical distance isn't actually that big an issue.

Take this as a good example: it takes more time for a Skylake-SP core to talk to another core on the other side of the die than for a Coffee Lake CPU to talk to main memory.
The same goes for AMD's IF die-to-die latency: talking to DRAM is faster, even though the DRAM is physically much further away than the other EPYC die.
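
not claiming exact numbers, but the core-to-core part is easy to measure yourself: a rough C sketch (Linux-only, sched_setaffinity) that bounces a flag between two pinned cores and prints the round trip. run it with the two cores close together and then far apart, and compare against your DRAM latency:

[code]
/* pingpong.c -- rough sketch: core-to-core round-trip latency via a
   shared flag bounced between two pinned threads (Linux only).
   Build: gcc -O2 -pthread pingpong.c   Run: ./a.out <coreA> <coreB> */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdatomic.h>
#include <pthread.h>
#include <sched.h>
#include <time.h>

#define ITERS 1000000

static _Atomic int flag = 0;                   /* 0: ping's turn, 1: pong's turn */

static void pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    sched_setaffinity(0, sizeof(set), &set);   /* pid 0 = calling thread */
}

static void *pong(void *arg) {
    pin_to_core(*(int *)arg);
    for (int i = 0; i < ITERS; i++) {
        while (atomic_load_explicit(&flag, memory_order_acquire) != 1)
            ;                                  /* spin until pinged */
        atomic_store_explicit(&flag, 0, memory_order_release);
    }
    return NULL;
}

int main(int argc, char **argv) {
    int core_a = argc > 1 ? atoi(argv[1]) : 0;
    int core_b = argc > 2 ? atoi(argv[2]) : 1;
    pthread_t t;
    pthread_create(&t, NULL, pong, &core_b);
    pin_to_core(core_a);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERS; i++) {
        atomic_store_explicit(&flag, 1, memory_order_release);
        while (atomic_load_explicit(&flag, memory_order_acquire) != 0)
            ;                                  /* wait for the other core's reply */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    pthread_join(t, NULL);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("cores %d<->%d: %.1f ns per round trip\n", core_a, core_b, ns / ITERS);
    return 0;
}
[/code]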

>If it's hanging off the PCIe bus then the latency will be similar to that of a full-height PCIe card on the motherboard, regardless of it being close to the die; the physical distance isn't actually that big an issue.
would love to see a source on that

We're talking about latency on the order of single CPU cycles.


For skylake, L1 cache latency is ~5 cycles
L2 latency is ~12 cycles
L3 latency is ~42 cycles (kabylake brought this down to ~38)

RAM latency is ~42 cycles + ~50ns.

PCIe latency is ~80,000ns (0.08ms)

If you've got a high end HPC setup with NVlink, that brings the GPU access latency down to around 30,000ns (0.03ms)


By bringing the FPGA into the package, you're lowering the latency from ~50,000-100,000ns down to ~10,000ns or less. I haven't taken a hard look at intel's approach, but I can't imagine the latency is anywhere near what PCIe latency would be.
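
if you want to see the cache/DRAM side of those numbers on your own box, here's a rough pointer-chase sketch in C (POSIX clock_gettime, gcc assumed); it won't show you PCIe latency, but the L1/L2/L3/DRAM steps fall straight out of it:

[code]
/* chase.c -- rough sketch: measure load-to-use latency with a random
   pointer chase over increasing working-set sizes.
   Build: gcc -O2 chase.c   (entries are 8 bytes, so several share a
   cache line; padding each entry to 64 bytes sharpens the steps) */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static volatile size_t sink;                   /* keep the loop from being optimized out */

static double chase_ns(size_t entries, size_t iters) {
    size_t *buf = malloc(entries * sizeof *buf);
    for (size_t i = 0; i < entries; i++) buf[i] = i;
    /* Sattolo's algorithm: a single random cycle, so every load depends
       on the previous one and the prefetcher can't help */
    for (size_t i = entries - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = buf[i]; buf[i] = buf[j]; buf[j] = tmp;
    }
    size_t idx = 0;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < iters; i++)
        idx = buf[idx];                        /* serially dependent loads */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    sink = idx;
    free(buf);
    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    return ns / iters;
}

int main(void) {
    /* working sets from 16 KB (L1) up to 256 MB (DRAM) */
    for (size_t kb = 16; kb <= 256 * 1024; kb *= 4) {
        size_t entries = kb * 1024 / sizeof(size_t);
        printf("%8zu KB: %6.2f ns per load\n", kb, chase_ns(entries, 10000000));
    }
    return 0;
}
[/code]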

What's slow is the PCIe protocol and handshake, not the physical distance; being in the same package will lower it somewhat.
Regardless, it's all close to the same shit if the FPGA is hanging off the PCIe bus and not some custom interconnect.

I wonder what *coin mining would be like on this. I wouldn't have much use for it otherwise.

Intel also makes dedicated chips for coin mining; no need to get this at all

>FPGA is hanging off the PCIe bus and not some custom interconnect
Well what does this mean then?

>Intel is connecting the Xeon processor to the FPGA with 160 Gbps of bandwidth per socket (doesn’t state if this is bi-directional) using a cache coherent interconnect. From the way that we know that the Intel OmniPath Fabric connects in-package to a Xeon, this connection likely implements a different protocol over the PCIe x16 interface reserved for in-package components, but also takes advantage of Intel’s Ultra-Path Interconnect (UPI) for cache coherency and access to data across the platform. This may mean that this reduces Xeon+FPGA setups to dual socket at best, if one UPI link from the processor is in use for the FPGA, however Intel did not provide briefings on the new parts to confirm this. We can confirm from an old Intel slide that the platform should be using a High Speed Serial Interface (HSSI) for connectivity; this slide also states that the new processors have different power specifications to standard Skylake-SP sockets, and as such the Xeon Gold 6138P is probably unlikely to be a drop in processor to current systems.

Then it's not using the generic PCIe protocol; its latency will most likely be much better.

Also, they're working on standalone FPGA add-on cards that use a custom interconnect besides PCIe, though no official details have been announced yet.

Alphabet can finally build Skynet now.

Most ASIC-resistant coins (the way things are going now) will mine very inefficiently due to poor memory bandwidth, but Bitcoin, maybe. Write some Verilog and start infecting enterprise systems and tell me what you get.
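
for context, the whole Bitcoin inner loop is just double SHA-256 over an 80-byte header while sweeping a 32-bit nonce, which is exactly the kind of thing you'd unroll and pipeline in fabric. here's a toy C sketch of it (OpenSSL's SHA256 for the hashing, dummy header contents, simplified target check; real mining compares the hash against the full 256-bit target):

[code]
/* toy_miner.c -- toy sketch of the Bitcoin inner loop: double SHA-256
   over an 80-byte block header while sweeping the 32-bit nonce.
   Dummy header data and a simplified "leading zero bytes" target check.
   Build: gcc -O2 toy_miner.c -lcrypto */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <openssl/sha.h>

static void sha256d(const uint8_t *data, size_t len, uint8_t out[32]) {
    uint8_t tmp[32];
    SHA256(data, len, tmp);        /* first hash  */
    SHA256(tmp, sizeof(tmp), out); /* second hash */
}

int main(void) {
    uint8_t header[80] = {0};      /* version | prev hash | merkle root | time | bits | nonce */
    memset(header, 0xab, 76);      /* dummy contents for illustration */

    const int difficulty = 3;      /* toy target: 3 leading zero bytes */
    uint8_t hash[32];

    for (uint32_t nonce = 0; nonce != UINT32_MAX; nonce++) {
        memcpy(header + 76, &nonce, 4);   /* nonce is the last 4 header bytes */
        sha256d(header, sizeof(header), hash);

        /* hash is interpreted as a little-endian number, so "leading"
           zeros are the trailing bytes of the array */
        int ok = 1;
        for (int i = 0; i < difficulty; i++)
            if (hash[31 - i] != 0) { ok = 0; break; }
        if (ok) {
            printf("found nonce %u\n", nonce);
            return 0;
        }
    }
    printf("no nonce found in 2^32 range\n");
    return 0;
}
[/code]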

Botnet Inside™

>tfw backdoored flip-flops

>you can get cyclone ii or cyclone v
just get a zyklon b, m8

>t. |>
>᠌ ᠌ ᠌ |3

Oh god, please don't make this mainstream... The last thing we need is this new culture of *software engineers* coming into the hardware space and shitting it up.
I like VHDL and Verilog and see no need for higher abstractions (think C) or "safe" handling of critical paths or whatnot, which just leads to bugs and backdoors.

>15k monthly
You're delusional lmao

You two are full of shit. It's going to be priced around $2000-3000, roughly the price of an Intel CPU plus a decent Altera FPGA plus overhead; otherwise no one would buy them. The point is to make them accessible; these aren't for smoke-testing your enterprise SoCs.

Normal "software engineers" will get reamed thoroughly by the complexity of actual hardware programming, don't worry bud

>OmniPath
Is this Intel's infinity fabric?

Custom interconnect, not to mention potentially more logic elements

The cheapest FPGA from this series is $4k...

Attached: FPGASkyl.png (1543x801, 164K)