Will compilers optimize away temporary variables like this...

will compilers optimize away temporary variables like this? does it make the function less efficient if not (since they have to be created in the first place)?

Attached: intermediastuff.png (355x132, 3K)


This is a good question for /dpt/.
For normal temporary variable use you're fine. The cases where this doesn't work would be cases you wouldn't expect the compiler to deal with.

godbolt.org
If you enter your code here you get the assembly output in a nice readable form. You'll see that there's no additional stack space allocated, so there are essentially no temporaries.
Compilers don't even represent things internally in a way that would make this a problem for simple operations. They use static single assignment (SSA) form, which is essentially what you're doing here already: giving every calculation its own unique identifier. It has a bunch of benefits and is easy to optimize. There's value in writing in this style yourself sometimes when optimizing (for your own sake, not the compiler's). But you're not there yet so don't worry about it.

Nice site, I don't understand the difference here though. But the red color makes me think NOT storing any temporary variables is faster

Attached: resultscode.png (1210x456, 54K)

interestingly, keeping the parentheses is faster

Attached: resultscode.png (744x430, 47K)

are you using -O3?

why the fuck doesn't gcc optimize this when compiling? All I've been hearing forever is that compilers are better than humans and premature optimization is evil, yet there's a difference even in simple fucking arithmetic

Attached: ash.png (670x478, 341K)

turned it on now:
all I am getting is this

Attached: resultscode.png (567x326, 16K)

OP is a faggot and doesn't know how to use gcc. Of course the compiler optimizes it, he just didn't turn it on kek

Attached: resultscode.png (436x326, 17K)

And? Of course it's going to be different without any opt flags, are you retarded?

Without optimization the compiler basically translates each line one by one to its assembly. So differing programs will have different outputs even if they're functionally equivalent.

Attached: EdMbRYc.png (935x223, 24K)

you writing a latex report bro?
So is it impossible to write the equivalent of those two assembly lines using c?

Attached: resultscode.png (361x326, 16K)

oops, meant to say (a+=b)*a, but the result is still 11 lines even though it compiles to the same thing with -O3

wtf are you doing user

It's probably not possible to produce only
lea eax, [rdi+rsi]
imul eax, eax

without enabling optimization flags, because without them the compiler does a lot of manual register loading when you declare variables, e.g. moving function arguments to the stack, which you can see in the other screenshots (mov DWORD PTR [rbp-4], edi).

Sorry I should have been more clear. You should specify that the compiler should optimize. Otherwise you won't know how it can or how it would optimize.
godbolt.org/z/eOFLFP
If you don't tell it to optimize it won't optimize at all. So you don't really know how your code is doing.

But there's also the case to consider where more complexity in the code might keep the compiler from managing optimizations. But for these simple temporaries that's not a concern.

Attached: file.png (955x671, 86K)

is this the compiler doing n(n+1)/2?
why does clang use three instructions when gcc only does 2?

He hasn't specified -O2 in the compiler settings. With -O2
// Type your code here, or load an example.
int foo(int a, int b) {
    const int x = a+b;
    const int y = x*x;
    return y;
}
becomes
foo(int, int):
    lea eax, [rdi+rsi]
    imul eax, eax
    ret
I believed GCC was -O2 by default, so it was easy to assume the website's GCC would be too, but it turns out GCC defaults to -O0, no optimization at all.

OK so what's going on in the compiler? Is this where real programming happens?
If C doesn't even allow you to write it in two lines how is it low-level?

So VS doesn't optimize tail recursion?

Attached: fac.png (1628x573, 45K)

A compiler translates C to machine language. You can't say that's where the programming happens; the things you write are the programming.
The compiler just works in a certain way, which is documented. It does pretty much what you would expect when you turn off optimization: it translates your code step by step.

but someone had to program the compiler to do those optimizations for you, right? I assume compilers are still being updated even for languages as old as C, so aren't the people working on the compiler programmers?

or in this case, do any optimization at all, since apparently there's no way to get it to compile to 2 lines without using -O flags

good thread op!

btw
>thread about choosing programming lang
>300 replies

>thread about actual programming
>10 replies

fucking normalfags. mods still do nothing.

Attached: 0ef649789736e59e9321c2db3faa8b39349d6eb519a6374f228c5a3edc243018.jpg (1087x1080, 108K)

Ah yes, the people who work on compilers are programmers; compilers are programs like any other program.

I thought what you meant is that writing C is not real programming, because the compiler does some additional work.

>why gcc does 2 instructions when clang does 3 instructions
Because they came to different conclusions about how best to do it. I don't know who's right; you'd have to benchmark it.
>is this the compiler doing n(n+1)/2?
Exactly. I'm not sure I've seen code where this actually applies, but it could.
>premature optimization is evil
People who say this tend to misquote Knuth. I suggest you read the original. It's not like it matters what he thinks anyway, but the people who misquote him are appealing to an authority that was never there.

But pulling out finicky tricks like avoiding temporary variables for ints before you've even turned on optimizations in your compiler is a mistake.

>Is this where real programming happens?
Depends on the definition of real programming.
>OK so what's going on in the compiler?
A lot of things.
youtube.com/watch?v=UHv_Jog9Xuc
This is a good talk that takes you from very little prerequisite knowledge to understanding the optimizer stages of a compiler. Not all of them, but you get a feel.

C is lower level than other languages like Python, but it's by definition still a higher level language.

If you want total control of the generated machine code you have to write machine code or assembly language.

I did actually mean that, but I will try to reduce the vitriol.

What I am saying is that, if the argument for not doing optimizations yourself is that the compiler is better than humans, then what you have done is just shift the work onto the people who write the compiler. So these are the "real" programmers, because they are the last stop before it hits the hardware, so if it is not good here it's not good anywhere.

As a side note, assembly programs are merely assembled, correct? Or do assembly "compilers" exist that optimize assembly code?

I don't know assembly, but I'm pretty sure what it's doing is computing rdi + rsi and putting the result in eax, then multiplying eax by eax. It's the equivalent of writing

int a = c + b;
a = a * a;

All the other stuff has been recognized as unnecessary and optimized away. C isn't meant to be a close representation of assembly; if you want to learn something like that, learn LLVM IR. Low level really just means direct access to memory.
Isn't VS regarded as absolute dogshit by everyone?

>If C doesn't even allow you to write it in two lines how is it low-level?
What user here was saying wasn't that it's not possible, period; it's that the compilers don't do it by default. I'm not an expert on the C standard but I'm pretty sure it doesn't mandate that a compiler have different optimization settings.
As for whether C is a low-level language or not, I would say it's not. It was born as a high-level language and it still is; it's just that "high" has gotten higher with other languages. Some people mischaracterize C as portable assembly. I think that might have come from people misleading others because they wanted to stop writing assembly and move to C. That said, it's now the case that you're unlikely to outperform the compiler once we take development time into account. Writing a big program in asm will most likely have you writing worse code than the same program in C. And where it's needed you can still drop to asm, or use intrinsics as an intermediate step.
Of course I'm sure there's some expert asm programmer that would easily outperform most C programmers in a big program but those are few and far between.

The general approach you should have to C/C++ is not that they're low-level languages, but that they expose low-level control. You get to fiddle the bits all you want (though there are caveats to that in C++).
That's why they have such utility. If you're just a carefree programmer looking for something that goes fast on its own, I'd recommend Java; their virtual machine is really good. You'll still have to keep some things in mind to avoid being horribly slow, but it really does a great job for long-running programs (which really is the normal case).
It's not an old thread. But yeah of course Jow Forums sucks now with the consumerist threads or political threads.

>that
>tail recursion

>What I am saying is that, if the argument for not doing optimizations yourself is that the compiler is better than humans, then what you have done is just shift the work onto the people who write the compiler. So these are the "real" programmers, because they are the last stop before it hits the hardware, so if it is not good here it's not good anywhere.
It's a sensible division of labor. Who do you think will do the best job of these kinds of optimizations: people spending their whole lives working on optimization, or people writing code oriented at an end goal? Source code being comprehensible is also often more important than it being perfectly optimized.

>int a = c + b;
a = a * a;

eax is also where you put your return value. So there's no moving values about unless that happens outside the function. The declaration you've made here is misleading.

I wouldn't say you are being lazy. A compiler is just a tool like any other tool. Also, optimization is only one part of a compiler; even without optimization a compiler is a complex piece of software, but that is normal. In programming there is always some division of labour: just like you would not write your own text editor, you don't have to write your own compiler.

Another thing to consider is that those optimizations are fairly limited. They almost always give you some increase in speed, but the algorithm itself is generally more important.

As an example, consider sorting a list. With selection sort you have quadratic complexity, O(n^2), meaning that for a list with 1,000,000 elements you need on the order of 1,000,000 * 1,000,000 comparisons.
With merge sort you need O(n * log(n)) comparisons, so roughly 1,000,000 * 20; merge sort will be about 50,000 times faster. So the algorithm you write in C code matters a lot.

Thanks for the information, in all fairness I did state I don't know assembly.

I agree with your points, obviously; it just feels like since you have "more" control at the last checkpoint, more skill is required. As has been pointed out in this thread, slapping on -O3 is hardly programming, even if the code is now objectively better.

That said, apart from inlining assembly with GCC, which probably has its own headaches, the only language I have seen that lets you write assembly as easily as the rest of the language is HolyC, but that language is pretty goddamn obscure. Are there any better-known languages that let you essentially write assembly and "low-level" code like C at the same time?

It is tail recursion and gcc optimizes it away.
I have read some blog or something that said VS produces the fastest code. Maybe I have to turn some more flags on?

MSVC could do inline assembly where it automatically deduced the inputs, outputs and clobbers, unlike gcc and clang, but it only worked for x86, not x64.

I agree, the algorithm matters. But is writing down an algorithm on a piece of paper programming? That seems more like being an "architect" or "designer" of code, i.e. you specify the problem and how you want it solved, but don't actually implement it in any sense.

Is the future then that you just write "pseudo-code" and hand it to a compiler?

Yeah. I was thinking of how to write a 'C equivalent' of the asm there, but I couldn't figure out a neat way to express that the return register doesn't belong to the function body. I can't. You made a valiant effort; saying return c + b * (cached(c + b)) would just be confusing.

>It is tail recursion and gcc optimizes it away.
The last thing you are doing is not a call
int fac(int cur, int ret)
{
    if (cur <= 1)
        return ret;
    return fac(cur - 1, cur * ret);
}

i tried (a+=b)*a
which would work if the thing in the parentheses is done first
actually gets me the correct number in gcc, i.e.
for a=2, b=3 -> foo(a,b) = 25

That is programming. I see what you are getting at, but programming is not defined as creating machine code; it's defined as writing a program in some programming language. The compiler does not program, it just creates machine code from C code; it only follows rules.

Giving pseudo-code to a compiler would require some sort of AI or something. Programming languages at this point must follow a certain structure, called a grammar. Grammars are formally defined structures; a compiler cannot interpret anything else.

>Or do assembly "compilers" exist that optimize assembly code?
Of course there are, any assembler will do that for you.

blessed thread

Attached: bubna.png (587x334, 25K)

I see, you are right. Somehow I remembered the definition of tail recursion incorrectly.
I would have expected VS to optimize it in that form, and gcc does, but OK, at least I have learnt something.

You multiplied the return value.
int fac(int n)
{
    if (n == 1)
        return 1;
    int ret = fac(n - 1);
    return ret * n;
}

The examples you posted do it right.

The color coding is not based on performance; in fact, you can change it. It confused me at first too.

checkd

True, I misremembered and then didn't read the whole post because I was pretty sure I was right. I deleted my post.

>Or do assembly "compilers" exist that optimize assembly code?
Actually asm is quite high level as well. Aside from assemblers optimizing, the CPU translates the instructions in your binary into microcode, which is then magicked into something even more efficient, possibly runtime-dependent but mostly just hardware-specific. That way you can write your x86 asm blissfully ignorant of what specific hardware you're going to run on, as long as it supports the instructions you use.

Even if you're writing machine code, you're not actually dictating how the CPU physically performs the individual instructions either; you're always stating what you want done, not how to do it.

what lang is this nigga?

its C(uck).

Thanks!

>assignment mixed with expression
disgusting

I'm not talking about manually pushing electrons around by shooting photons at them. I just mean directly using the instruction set, with mnemonics plus address calculation, no optimization.

why do you care lol are you going to use C from now on?

Then writing asm without an optimizing assembler is your best bet, but it doesn't seem wise unless you're doing it for the enjoyment of it. Optimizers always abide by some limited "as-if" rule, so they're not going to fuck you up unless they're buggy or you've broken the contract between you and the optimizer/compiler (in C we call this undefined behavior).

Compilers and assemblers are translators, any of them can perform optimizations if you tell them to.

If you want to directly use the instruction set, you're obviously going to have to directly use the instruction set. Languages can mostly compile or run on multiple instruction sets, therefore they can't be 1 for 1 matches of whatever instruction sets they're running on. If you want something that looks and feels like assembly learn LLVM IR.

watching the video he said
"const" is meaningless and never helps.
when you put it on a local variable it doesn't do shit.
in global scope it might help?

The reason you should think the one on the far right is better is that there's less asm. The color coding is just trying to associate C lines with the asm. Hence the move into eax, which the right version avoided (see how it did an imul eax, ecx last, leaving the result in eax, versus the move out of the stack address in the other two examples), is colored red because that's where you might feasibly say a move into the return register is made.
Though there's obviously no strict correspondence here; 'ret' might as well be red, but it isn't.

Yes. const globals are a different breed: they don't need to be stored in a way that lets them be modified. This also allows the compiler to bake them into the code directly (they're 'constant folded'), since it never has to read the variable back in case you had modified it. I believe the reason it doesn't help at local scope is that const doesn't actually mean "constant" to the compiler; it's there to issue errors to the programmer. That's how I remember it.
But there's also this:
nullprogram.com/blog/2016/07/25/
The bit about how modifying a const object through a non-const reference is undefined makes me think code could gain quite a bit from using const. But I don't remember the video well; maybe it was just about const reference parameters.

When he says const is useless, it's almost certainly from the compiler's perspective. If marking things const helps you avoid writing bugs, it obviously has value.

I store my tilemaps as arrays in global scope, and they can be constant. But since I want destructible tiles that refresh once they leave the screen, I need to implement a buffer of the visible tiles.

The problem is I have no idea how to keep track of which tiles should be refreshed in an efficient manner.

Do I, like, store the indices from the tilemap that each tile was fetched from in the buffer?

>optimize the variables away
There are no such things as variables on a computer. Read your H&P.

So a few questions first.
>I store my tilemaps as arrays in global scope, and they can be constant.
Are you doing things like collision detection against these directly or do you take them and translate them into something that's a 'running version' of the level?
>tiles that refresh once they leave screen
Refresh = respawn? Or do we want the tiles to remain destroyed.
>I need to implement a buffer of the visible tiles.
You could instead just define a 'window'. Could be as simple as a min and a max for each axis. When you update the window you could have a procedure that brings new/old tiles into being in whatever way is appropriate (respawn dead tiles when they return, maybe).
>Do i like store the indices from the tilemap that the tile was gotten at in the buffer?
Not clear on what the buffer is. Is this the running version of the level?
>entire post
Is this Mario and you have goombas that have walked off screen/died and you want to respawn them if they're not alive right now?

I am doing collision detection using hotspots, i.e. sampling points from a rectangle, dividing by tile_size etc., then fetching the type from the index in the array. It's basically exactly the same as the NES.

Refresh as in re-load the tile. Destroying it only destroys it in the buffer, which holds the tiles that are drawn.

It's not Mario but close; it's a 2D platformer that's NES-era, but it will be for PC (I can't do assembly yet). Maybe port it to NES later if it's good.

By buffer I just mean essentially a copy of all the tile IDs which are visible. Then whenever I move I read from the constant tilemap and put the tiles into the buffer.
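A rough sketch of that buffer idea, with made-up names and sizes (MAP_W, VIEW_W etc. are assumptions, not your actual values): copy the visible window from the const tilemap into a writable buffer whenever the camera moves. Destroying a tile then just writes into view[][]; the next refresh brings it back from the const map.

```c
/* Hypothetical sizes; substitute your real ones. */
#define MAP_W  256
#define MAP_H   64
#define VIEW_W  17
#define VIEW_H  15

static const unsigned char tilemap[MAP_H][MAP_W]; /* level data, read-only */
static unsigned char view[VIEW_H][VIEW_W];        /* destructible copy, drawn each frame */

/* cam_x/cam_y are the camera position in tile units; assumes the
   caller keeps them within the map bounds. */
void refresh_view(int cam_x, int cam_y) {
    for (int y = 0; y < VIEW_H; y++)
        for (int x = 0; x < VIEW_W; x++)
            view[y][x] = tilemap[cam_y + y][cam_x + x];
}
```

If a full re-copy per scroll step is too much, you only need to fill in the one new row or column the camera exposed.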

mov reg, reg -> can be done in the renamer on a few recent cores
lea reg, [reg+reg] -> fast path integer operation on everything since the pentium pro

the lea should be faster even with the renamer trick

actually, at the current level, using the library that I am for graphics and input etc., I have to draw all the visible tiles with a double-nested for loop anyway, so I could potentially store where each tile came from (i.e. its index in the tilemap) and compare that to what's in the buffer, and if it's the same, not refresh the tile. However, I really only need to check one column and one row or something like that, so this isn't efficient and also requires more memory.

here's a graphic that explains it better than my English

Attached: till.png (475x330, 7K)

Compilers can do some crazy inlining, check this out: godbolt.org/g/26viuZ

looks like there is something called dirty rects; that should work

Jeez, that's impressive.

what the fuck does all that c++ code even do

[0,1,2,3,4,5,6,7,8,9] * 2 + 110 ?

Why do I feel like I'm trying to read fucking Mandarin whenever I see C++ code?

because it's that ugly
5000+ lines of assembly without -O2 btw

I ran this:
quick-bench.com/LErz0MwLFpVP0T2QQLOyMPKy-b4
On gcc 8.2 and clang 6.0 at -O3
gcc->5.05
clang->5.17

This was very finicky. Before I did the noinline attribute and asm(""); trick, gcc turned out to be about 1/6th slower according to this (clang: 4.93, gcc: 5.95).
quick-bench.com/JgdHmOQEYREJmqYwW_y8eGCC-Bc
Though looking at the disasm clang decided to do something different.

I don't know why this code is supposedly faster now that I made it call this incredibly small function, makes no sense at all to me. I'd really like a second opinion.

Not quite, more like:
sum([0*2+110,1*2+110,2*2+110,3*2+110,4*2+110,5*2+110,6*2+110,7*2+110,8*2+110,9*2+110]) = 1190

But with coroutines and infinite sequences :).

Yes I am, user.

yeah yeah, that's obviously what I mean, I just find broadcasting so convenient from back in my Python days

I'm actually starting to feel this is more alright than most code I see. The usage code is readable, and the concepts involved make the library code sort of OK.
But I'd say it's not good enough.
godbolt.org/z/lAY2hu
Way too fragile. Just look at that explosion though, it's huge. Nobody who writes this should expect fast code. Obviously it starts making more sense with actual operations.

what the fuck is it with c++?
it's completely unreadable
but yeah, definitely something fucked with clang vs gcc here

wew, yeah, just adding one more of those makes it really bad

It's perfectly readable, this is just an odd use case. It's using Google Benchmark.
static void justcall(benchmark::State& state) {
    // Code before the loop is not measured
    int x = 2;
    for (auto _ : state) {
        x = noop(x, x + 1);
        benchmark::DoNotOptimize(x);
    }
}
BENCHMARK(justcall);

static void justcall(benchmark::State& state) {
Just the declaration of a function that fits the Google Benchmark framework; it automatically takes enough samples to reach a specific certainty of measurement.
for (auto _ : state) {
I believe they use the iterator to take the rdtsc or something, whatever they actually do to measure.
benchmark::DoNotOptimize(x);
This just ensures the compiler doesn't optimize away the entire result. x is 'clobbered' in a way that tells the compiler the value could be read or written by something it can't see, so it can't assume x holds what it expects on the next iteration of the loop. But it's done in a way that doesn't impact optimizations beyond that. That's the idea, anyway.
BENCHMARK(justcall);
Not sure what this does exactly, but it registers the benchmark. I think it's some macro magic that inserts a definition so the library can grab the function and call it.

I'm sorry, it's just that I gave up on C++ a long time ago, went through all the zoomer languages, and now I'm back to C.

D is so beautiful

I want dlang in compiler explorer. Can someone else bother Godbolt about it?

Is there something like quick-bench.com that works offline and is more featureful?
I know you can use quick-bench offline too, but ehh...

How about just use a debugger you gorilla nigger

It's not a good tool for the job.

Nice digits

You could just make a project using Google benchmark and write your code in there. Quick-bench is just a wrapper.

I was hoping for something more practical to be honest. A tool like valgrind or something.

you can try hiring chinese pajeets to do it for u

Are you sure you're not asking for a profiler? Because quick-bench is for microbenchmarking: you take a piece of code and isolate it, you find a comparable piece of code (ideally a replacement for the first), and you compare the two.
When the results come back, you've been enlightened by the experiment.

Now quick-bench also does a little more than Google Benchmark. It runs perf on the executable (it seems) and parses out the cost of the instructions (see the disasm).
I'm thinking maybe you want perf.

You couldn't get these results reliably if you just ran a production-ready program and profiled it, though. It'd give you some idea of what's going on, but it's not that effective overall. At best you find that one of your zero-cost abstractions has some cost you don't much care for.

dunno but compilers are so good at optimization, they can optimize away your memory clean up.
it is a big problem in linux

Sounds like UB.

I think you're right. Learning perf is definitely something that will pay off in the long run.
Those command-line tools are often painful to learn, but it might be worth it in this case (like it is for gdb).

>big problem linux
i agree linux is one big problem