Why is gcc so shit

why is gcc so shit
godbolt.org/z/SXxvqp

Attached: what the fuck.png (1920x969, 168K)

Other urls found in this thread:

godbolt.org/z/N3bGWz
blog.regehr.org/archives/330
intel.com/content/www/us/en/embedded/training/ia-32-ia-64-benchmark-code-execution-paper.html
godbolt.org/z/XY6jpq
godbolt.org/z/DvxgYq
godbolt.org/z/hfr7Lu

> while (count) {
>     result += counter;
>     counter++;
>     if (counter > count) {
>         break;
>     }
> }

What utter cringe. Are Cniles really this clueless in general?

Yeah, it's the disassembly translated back to C. Thing is, I then wrote an improved version (shown in the pic) and it still comes out of GCC looking braindead.

B-But anon MUH COMPUHLER OPTIMUZATIONS!!!

Can Cniles compete?
godbolt.org/z/N3bGWz

Attached: Screenshot from 2019-08-23 19-09-59.png (3840x2160, 350K)

More like, why are all compilers licensed instead of public domain? They all have disadvantages. Instead of "freedom fights" and competitor-gimping, we could have had one perfect multiplatform compiler with all the optimisations.

Competition in itself is a good thing. It drives compiler developers to actually deliver in order to stay relevant.

But now GCC is only relevant because
>muhh freedom use my restrictive license

Cniles BTFO (again)

Attached: 1535031366052.jpg (1679x586, 164K)

That alone is reason enough for it to exist. Copyleft licenses are actually less restrictive.
I already know you pretend not to understand this topic and you will never be convinced, so bear in mind that I won't bother replying to any subsequent responses about it.
Explaining things to someone who will refuse to listen is pointless anyway.

>more instructions = slower
The code generated by gcc uses SSE instructions. It's most likely way faster than the naive loop clang generated.

Attached: 1504821964271.png (609x714, 99K)

>optimized routine is bad
how much soi do you consume on a daily basis?

>the average Clang shill

bonus kek

Fucking brutal

Call me when clang implements hardening flags nearly as well as gcc does. It's not even faster ao.

pretty sure I typed lmao

NOOOO

DEALLOCATE THIS

Attached: 1544729016111.png (1098x1126, 492K)

For some reason the gcc assembly works and is almost twice as fast as the clang code for me on Windows, but crashes on Linux if you call the function with n >= 40. I'm probably doing something wrong, though.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// CLOCKS_PER_SEC is 1000000 (microseconds) on POSIX and 1000 (milliseconds) on MinGW
#define TIME_UNIT "us"

// object file linked from assembler output
unsigned int disassembled_sequence_gcc(unsigned int);
unsigned int disassembled_sequence_clang(unsigned int);

int main(int argc, char *argv[]) {
    long int i;
    long int count = 1000;
    clock_t t;

    if (argc > 1) {
        char *endptr;
        long int usercount = strtol(argv[1], &endptr, 10);
        if (endptr != argv[1]) {
            count = usercount;
        }
    }
    printf("Timing %ld calls...\n", count);

    t = clock();
    for (i = 0; i < count; i++) {
        /* volatile so the result is observable and the call can't be elided */
        volatile unsigned int result = disassembled_sequence_clang((unsigned int) i);
    }
    t = clock() - t;
    printf("clang time: %lu " TIME_UNIT "\n", (unsigned long) t);

    t = clock();
    for (i = 0; i < count; i++) {
        volatile unsigned int result = disassembled_sequence_gcc((unsigned int) i);
    }
    t = clock() - t;
    printf("gcc time: %lu " TIME_UNIT "\n", (unsigned long) t);

    return 0;
}

(gdb) run
Timing 1000 calls...
clang time: 199 us

Program received signal SIGSEGV, Segmentation fault.
0x00000000004005ec in disassembled_sequence_gcc ()
(gdb) disassemble
=> 0x00000000004005ec : movdqa 0x79(%rip),%xmm1 # 0x40066d
0x00000000004005f4 : lea 0x1(%rdi),%ecx
(gdb) frame 1
(gdb) print i
$1 = 40

>movdqa 0x79(%rip),%xmm1
It's loading data RIP-relative, and movdqa faults unless its memory operand is 16-byte aligned; note the target address 0x40066d isn't. I'm guessing the constant pool in the assembler output lost its alignment somewhere.
Maybe try adding an .align directive above the data in the asm.

I got it to work with .align 8 above the .LC0 label
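Worth noting: movdqa wants 16 bytes, and on x86 ELF .align 8 only guarantees 8, so this may be working because .LC0 happens to land on a 16-byte boundary anyway; .align 16 (or .balign 16) states the requirement exactly. In C terms the invariant looks like this (a sketch; the array contents and names are hypothetical stand-ins for GCC's actual .LC0 data):

#include <emmintrin.h>

/* movdqa requires its memory operand to be 16-byte aligned; _Alignas(16)
 * (C11) states the same invariant at the C level. */
static _Alignas(16) const unsigned int lc0[4] = {0, 1, 2, 3};

__m128i load_pool(void) {
    return _mm_load_si128((const __m128i *) lc0); /* the faulting aligned load */
}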

Thanks.
Timing 100000 calls...
gcc time: 908727 us
clang time: 3621275 us
Even with the harness overhead included, that's a good 4x speedup (3621275 / 908727 ≈ 4.0). Not bad.

Not the anon you were talking to, but isn't GCC cheating a bit by guessing it can use vector extensions to speed things up? If anything, wouldn't this make C faster due to black compiler magic?

Attached: darkmech.png (476x497, 148K)

based Cthroughs win again

Kek, based

>benchmarking based off clock
yikes

blog.regehr.org/archives/330
intel.com/content/www/us/en/embedded/training/ia-32-ia-64-benchmark-code-execution-paper.html
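If you want steadier numbers, the usual fix is a monotonic wall clock plus a volatile sink so the compiler can't treat the calls as dead. A minimal sketch, assuming POSIX clock_gettime and the same linked-in object file as the harness above:

#include <stdio.h>
#include <time.h>

/* function under test, linked from the godbolt assembler output */
extern unsigned int disassembled_sequence_gcc(unsigned int);

volatile unsigned int sink; /* keeps the calls from being optimized away */

int main(void) {
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned int i = 0; i < 100000; i++)
        sink = disassembled_sequence_gcc(i);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    long long ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL
                 + (t1.tv_nsec - t0.tv_nsec);
    printf("%lld ns total, %.1f ns/call\n", ns, ns / 100000.0);
    return 0;
}

Pinning the process to one core and repeating the run helps too, per the links above.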

The point is that gcc's output is clearly much faster.

run clang first then gcc

???

Attached: dilate.png (1917x235, 24K)

Well, SSE2 is baseline on x86_64, so it's fair game to use SIMD wherever it's allowed and useful, in this case to add four numbers (that don't even have to be fetched from memory, just calculated) at once.
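Roughly what the vectorized loop is doing, written out with SSE2 intrinsics. A sketch of the technique, not GCC's exact codegen; the function name is mine:

#include <emmintrin.h> /* SSE2 intrinsics, baseline on x86_64 */

/* Sum 1..n four lanes at a time; the addends are generated in
 * registers, never fetched from memory. */
unsigned int sum_1_to_n_sse2(unsigned int n) {
    __m128i acc  = _mm_setzero_si128();
    __m128i vals = _mm_set_epi32(4, 3, 2, 1);      /* lanes: i, i+1, i+2, i+3 */
    const __m128i step = _mm_set1_epi32(4);
    unsigned int i = 1;
    for (unsigned int k = n / 4; k; k--, i += 4) { /* n/4 full vector rounds */
        acc  = _mm_add_epi32(acc, vals);
        vals = _mm_add_epi32(vals, step);
    }
    unsigned int lane[4];
    _mm_storeu_si128((__m128i *) lane, acc);       /* spill for horizontal sum */
    unsigned int result = lane[0] + lane[1] + lane[2] + lane[3];
    for (unsigned int r = n % 4; r; r--, i++)      /* scalar tail, at most 3 adds */
        result += i;
    return result;
}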

>cniles learn programming from local rustbro
more news at 11

>LLVM-based language (Rust) produces same results as LLVM-based C compiler (clang)
wow
what a surprise
really had no reason to see this coming
second coming of christ
mind = blown
10/10

That's not it. The Cnile wrote an O(n) solution to an O(1) problem.
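For reference, the closed form the compilers reduce it to. A sketch; OP's loop sums 0 through n inclusive, and unsigned wrap-around means the formula and the loop agree for every input:

/* Gauss: 0 + 1 + ... + n == n * (n + 1) / 2. Widen to 64 bits so the
 * product doesn't drop the bit the division by 2 needs; truncating back
 * to 32 bits matches the loop's mod-2^32 wrap-around. */
unsigned int sum_0_to_n(unsigned int n) {
    return (unsigned int) (((unsigned long long) n * (n + 1ULL)) / 2);
}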

The point that GCC will use SIMD in more situations still stands if you don't just sum 1..n, but also the contents of an array, modifying OP's example to
#include <stddef.h>

unsigned int sum_array(unsigned int *arr, size_t arrlen) {
    unsigned int result = 0;
    size_t counter = 0;
    while (arrlen) {
        result += arr[counter];
        counter++;
        if (counter >= arrlen) { /* >=, otherwise the last pass reads arr[arrlen] */
            break;
        }
    }
    return result;
}
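For comparison, the idiomatic version of the same reduction (a sketch), which GCC typically auto-vectorizes at -O3 and clang already at -O2:

#include <stddef.h>

/* The straightforward reduction loop; same result, no manual index games. */
unsigned int sum_array_simple(const unsigned int *arr, size_t arrlen) {
    unsigned int result = 0;
    for (size_t i = 0; i < arrlen; i++)
        result += arr[i];
    return result;
}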

BUT NOOO
More LoC == fast code

t. OP

Hating GCC is now a fashion statement.

Funnily enough, in the equivalent Sepples program GCC will not use SIMD at a lower optimization level, but LLVM will.
godbolt.org/z/XY6jpq

>not using a range-based for loop
absolutely disgusting

>more instructions = slower

Interesting what the iterator version's output is:
godbolt.org/z/DvxgYq
Not exactly zero-cost, but it still uses imul and shr (the closed form) instead of the naive loop.

that's exactly what it means.

>reddit
>udemy
>gmail
>outlook
>rust
Is this a false flag or are you actually that much of a shill?

Even the peanut-brain approach in Rust comes out as the optimized O(1) form:
godbolt.org/z/hfr7Lu

Rust iterators are heavily optimized. For a uni project I wrote a simple matrix-operations library that leaned hard on iterators, since the existing Rust libraries were clumsy for my needs, and I was afraid the compiler wouldn't optimize the iterator chains well. It actually made the project noticeably faster.

Far from true.
> what is loop unrolling + the Tomasulo algorithm
> what is SIMD
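Since those got name-dropped: a 4x manual unroll of OP's 0..n sum (a sketch), split into four independent accumulators so the out-of-order core (Tomasulo-style scheduling) has parallel dependency chains to work on:

unsigned int sum_unrolled(unsigned int n) {
    unsigned int s0 = 0, s1 = 0, s2 = 0, s3 = 0; /* independent chains */
    unsigned int i = 0;
    for (unsigned int k = n / 4; k; k--, i += 4) {
        s0 += i;
        s1 += i + 1;
        s2 += i + 2;
        s3 += i + 3;
    }
    unsigned int result = s0 + s1 + s2 + s3;
    for (unsigned int r = n % 4 + 1; r; r--, i++) /* tail: remaining terms of 0..n */
        result += i;
    return result;
}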

Clang understands your code.

Attached: shitpiler.png (2557x438, 75K)

A few instructions to set up a serial loop that runs for 1000 iterations, or a few dozen instructions to set up a vectorized loop that runs about 30 times.