Code a GPU program

Question

Code a GPU program

Evan Cox

>code a GPU program
>it's slower than the corresponding CPU program

Attached: Nvidia_CUDA_Logo.jpg (300x182, 11K)

January 15, 2019 - 23:29

Benjamin Hall

>using logic with complicated branching on something that will run in a GPU
shiggy diggy

January 15, 2019 - 23:33

Kayden Green

>Want control over individual pixels for say 256 color modes or stuff like palette techniques
>have to use (((shaders)))

January 15, 2019 - 23:34

Grayson Torres

>cpu
>5 million tripcodes per second

>gpu
>25 million tripcodes per second

Why?

January 15, 2019 - 23:37

Andrew James

transfering data to and from the gpu is slow. only good use case for gpus is sending the data once and doing a lot of computations with it while it's in the gpus memory.

January 15, 2019 - 23:37

Luis Peterson

It's not that simple you mong. You have to code it in a way that instructions don't run in series. Instead you have to make a majority of it run in parallel which is batshit insanely hard.

Even making programes run on more than 2-4 cores efficiently is already a daunting task (see gaymen), good luck getting shit to run on 2,000+ cores efficiently.

Attached: cpu-vs-gpu-presentation-14-638.jpg (638x479, 103K)

January 15, 2019 - 23:41

Aaron Reed

>make program that runs on 6000 processors with a super low clock speed
>slower than a program that runs on 1 processor clocked at least 10000 times higher
WHEW LAD TELL ME MORE

January 15, 2019 - 23:45

Oliver Powell

absolute state, etc
also kys tripfag

January 15, 2019 - 23:49

Jacob Ortiz

Memory is slower and each thread is slower. A GPU is only better when your problem is extremely parallel in nature.

These

January 15, 2019 - 23:52

James King

It should also be noted that at the end of the day despite having an assload of coars consumer GPUs are not that much more powerful than consumer CPUs in terms of FP64 math. While they have magnitudes more FP32 compute performance it makes it even harder to write efficient software for them(see rounding errors). AMD GPUs generally have more FP64 but coding for CUDA is easier. Once you get to 8-16 core modern AMD/intel stuff GPU acceleration becomes less attractive to devs especially given intel's 512-bit AVX i9s.

General AVX FP64 compute of haswell 4-core processor: 112 GFLOPS

Theoretical FP64 compute of a gtx 2080ti : 420 GFLOPS

Theoretical FP64 compute of a Quadro GV100: 8,330 GFLOPS

That's why those fuckers go for ~$10,000

Attached: CPU-Floating-Point-Test-AMD-vs-Intel.png (568x499, 10K)

January 16, 2019 - 00:28

1 2 ... 7 Next

Code a GPU program

Last threads