Is “IF” expensive? | 易学教程

问题

I can\'t, for the life of me, remember what exactly our teacher said that day and I\'m hoping you would probably know.

The module is \"Data Structures and Algorithms\" and he told us something along the lines of:

The if statement is the most expensive [something]. [something] registers [something].

Yes, I do have a horrible memory and I\'m really really sorry, but I\'ve been googling for hours and nothing has come up. Any ideas?

回答1:

At the very lowest level (in the hardware), yes, ifs are expensive. In order to understand why, you have to understand how pipelines work.

The current instruction to be executed is stored in something typically called the instruction pointer (IP) or program counter (PC); these terms are synonymous, but different terms are used with different architectures. For most instructions, the PC of the next instruction is just the current PC plus the length of the current instruction. For most RISC architectures, instructions are all a constant length, so the PC can be incremented by a constant amount. For CISC architectures such as x86, instructions can be variable-length, so the logic that decodes the instruction has to figure out how long the current instruction is to find the location of the next instruction.

For branch instructions, however, the next instruction to be executed is not the next location after the current instruction. Branches are gotos - they tell the processor where the next instruction is. Branches can either be conditional or unconditional, and the target location can be either fixed or computed.

Conditional vs. unconditional is easy to understand - a conditional branch is only taken if a certain condition holds (such as whether one number equals another); if the branch is not taken, control proceeds to the next instruction after the branch like normal. For unconditional branches, the branch is always taken. Conditional branches show up in if statements and the control tests of for and while loops. Unconditional branches show up in infinite loops, function calls, function returns, break and continue statements, the infamous goto statement, and many more (these lists are far from exhaustive).

The branch target is another important issue. Most branches have a fixed branch target - they go to a specific location in code that is fixed at compile time. This includes if statements, loops of all sorts, regular function calls, and many more. Computed branches compute the target of the branch at runtime. This includes switch statements (sometimes), returning from a function, virtual function calls, and function pointer calls.

So what does this all mean for performance? When the processor sees a branch instruction appear in its pipeline, it needs to figure out how to continue to fill up its pipeline. In order to figure out what instructions come after the branch in the program stream, it needs to know two things: (1) if the branch will be taken and (2) the target of the branch. Figuring this out is called branch prediction, and it's a challenging problem. If the processor guesses correctly, the program continues at full speed. If instead the processor guesses incorrectly, it just spent some time computing the wrong thing. It now has to flush its pipeline and reload it with instructions from the correct execution path. Bottom line: a big performance hit.

Thus, the reason why if statements are expensive is due to branch mispredictions. This is only at the lowest level. If you're writing high-level code, you don't need to worry about these details at all. You should only care about this if you're writing extremely performance-critical code in C or assembly. If that is the case, writing branch-free code can often be superior to code that branches, even if several more instructions are needed. There are some cool bit-twiddling tricks you can do to compute things such as abs(), min(), and max() without branching.

回答2:

"Expensive" is a very relative term, especially with relationship to an "if" statement since you also have to take into the account the cost of the condition. That could range anywhere from a few short cpu instructions to testing the result of a function that calls out to a remote database.

I wouldn't worry about it. Unless you're doing embedded programming you probably shouldn't be concerned about the cost of "if" at all. For most programmers it's just not going to ever be the driving factor in your app's performance.

回答3:

Branches, especially on RISC architecture microprocessors, are some of the most expensive instructions. This is because on many architectures, the compiler predicts which path of execution will be taken most likely and puts those instructions next in the executable, so they'll already be in the CPU cache when the branch happens. If the branch goes the other way, it has to go back out to main memory and fetch the new instructions -- that's fairly expensive. On many RISC architectures, all instructions are one cycle except for branch (which is often 2 cycles). We're not talking about a major cost here, so don't worry about it. Also, the compiler will optimize better than you do 99% of the time :) One of the really awesome things about the EPIC architecture (Itanium is an example) is that it caches (and begins processing) instructions from both sides of the branch, then discards the set it doesn't need once the outcome of the branch is known. This saves the extra memory access of a typical architecture in the event that it branches along the unpredicted path.

回答4:

Check out the article Better Performance Through Branch Elimination on Cell Performance. Another fun one is this post about branchless selections on the Real Time Collision Detection Blog.

In addition to the excellent answers already posted in response to this question, I'd like to put in a reminder that although "if" statements are considered expensive low-level operations, trying to utilize branch-free programming techniques in a higher level environment, such as a scripting language or a business logic layer (regardless of language), may be ridiculously inappropriate.

The vast majority of the time, programs should be written for clarity first and optimized for performance second. There are numerous problem domains where performance is paramount, but the simple fact is that most developers are not writing modules for use deep in the core of a rendering engine or a high performance fluid dynamics simulation that runs for weeks on end. When the top priority is for your solution to "just work" the last thing on your mind should be whether or not you can save on the overhead of a conditional statement in your code.

回答5:

On the lowest possible level if consists of (after computing all the app-specific prerequisites for particular if):

some test instruction
jump to some place in the code if test succeeds, proceed forwards otherwise.

Costs associated with that:

a low level comparison -- usually 1 cpu operation, super cheap
potential jump -- which can be expensive

Reson why jumps are expensive:

you can jump to arbirary code that lives anywhere in memory, if it turns out that it is not cached by the cpu -- we have a problem, because we need to access main memory, which is slower
modern CPUs do branch predition. They try to guess whether if will succeed or not and execute code ahead in the pipeline, so speed things up. If the prediction fails all computation done ahead by pipeline has to be invalidated. That also is an expensive operation

So to sum up:

If can be expesive, if you really, really, relly care about performance.
You should care about it if and only if you are writing real time raytracer or biological simulation or something similar. There is no reason to care about it in most of the real world.

回答6:

if in itself is not slow. Slowness is always relative i bet for my life that you haven't ever felt the "overhead" of an if-statement. If you are going to make a high-performance code, you migh want to avoid branches anyway. What makes if slow is that the processor is preloading code from after the if based on some heuristic and whatnot. It will also stop pipelines from executing code directly after the if branch instruction in the machine code, since the processor doesn't know yet what path will be taken (in a pipelined processor, multiple instructions are interleaved and executed). Code executed could have to be executed in reverse (if the other branch was taken. it's called branch misprediction), or noop's be filled at those places so that this doesn't happen.

If if is evil, then switch is evil too, and &&, || too. Don't worry about it.

回答7:

Maybe the branching kills the CPU instruction prefetching?

回答8:

Modern processors have long execution pipelines which means that several instructions are executed in various stages at the same time. They may not always know the outcome of one instruction when the next one begins to run. When they run into a conditional jump (if) they sometimes have to wait until the pipeline is empty before they can know which way the instruction pointer should go.

I think of it as a long freight train. It can carry a lot of cargo fast in a straight line, but it corners badly.

Pentium 4 (Prescott) had a famously long pipeline of 31 stages.

回答9:

The only thing I can imagine this might be referring to is the fact that an if statement generally can result in a branch. Depending on the specifics of the processor architecture, branches can cause pipeline stalls or other less than optimal situations.

However, this is extremely situation specific - most modern processors have branch prediction capabilities that attempt to minimize the negative effects of branching. Another example would be how the ARM architecture (and probably others) can handle conditional logic - the ARM has instruction level conditional execution, so simple conditional logic results in no branching - the instructions simply execute as NOPs if the conditions are not met.

All that said - get your logic correct before worrying about this stuff. Incorrect code is as unoptimized as you can get.

回答10:

As pointed out by many, conditional branches can be very slow on a modern computer.

That being said, there are a whole lot of conditional branches that don't live in if statements, you can't always tell what the compiler will come up with, and worrying about how long basic statements will take is virtually always the wrong thing to do. (If you can tell what the compiler will generate reliably, you may not have a good optimizing compiler.)

回答11:

Also note that inside a loop is not necessarily very expensive.

Modern CPU assumes upon first visit of an if-statement, that the "if-body" is to be taken (or said the other way: it also assumes a loop-body to be taken multiple times) (*). Upon second and further visits, it (the CPU) can maybe look into the Branch History Table, and see how the condition was the last time (was it true? was it false?). If it was false the last time, then speculative execution will proceed to the "else" of the if, or beyond the loop.

(*) The rule is actually "forward branch not taken, backward branch taken". In an if-statement, there is only a [forward] jump (to the point after the if-body) if the condition evaluates to false (remember: the CPU anyways assumes to not take a branch/jump), but in a loop, there is maybe a forward branch to the position after the loop (not to be taken), and a backward branch upon repetetion (to be taken).

This is also one of the reasons why a call to a virtual function or a function-pointer-call is not that worse as many assume (http://phresnel.org/blog/)

回答12:

CPUs are deeply pipelined. Any branch instruction (if/for/while/switch/etc) means that the CPU doesn't really know what instruction to load and run next.

The CPU either stalls while waiting to know what to do, or the CPU takes a guess. In the case of an older CPU, or if the guess is wrong, you'll have to suffer a pipeline stall while it goes and loads the correct instruction. Depending on the CPU this can be as high as 10-20 instructions worth of stall.

Modern CPUs try to avoid this by doing good branch prediction, and by executing multiple paths at the same time, and only keeping the actual one. This helps out a lot, but can only go so far.

Good luck in the class.

Also, if you have to worry about this in real life, you're probably doing OS design, realtime graphics, scientific computing, or something similarly CPU-bound. Profile before worrying.

回答13:

Write your programs the clearest, simplest, cleanest way that isn't obviously inefficient. That makes the best use of the most expensive resource, you. Be it writing or later debugging (requires understanding) the program. If the performance isn't enough, measure where the bottlenecks are, and see how to mitigate them. Only on extremely rare occasions will you have to worry about individual (source) instructions when doing so. Performance is about selecting the right algorithms and data structures in the first line, careful programing, getting a fast enough machine. Use a good compiler, you'd be surprised when seeing the kind of code restructuring a modern compiler does. Restructuring code for performance is a sort of last resort measure, the code gets more complex (thus buggier), harder to modify, and thus all-around more expensive.

回答14:

I had this argument with a friend of mine once. He was using a very naive circle algorithm, but claimed his to be faster than mine (The kind that only calculates 1/8th of the circle) because mine used if. In the end, the if statement was replaced with sqrt and somehow that was faster. Perhaps because the FPU has sqrt built in?

回答15:

Some CPU's(like X86) provides branch prediction to programming level to avoid such a branch prediction latency.

Some compiler exposes(like GCC) these as a extension to higher level programming languages(like C/C++).

Refer likely()/unlikely() macros in the Linux kernel - how do they work? What's their benefit?.

回答16:

The most expensive in terms of ALU usage? It uses up CPU registers to store the values to be compared and takes up time to fetch and compare the values each time the if statement is run.

Therefore an optimization of that is to do one comparison and store the result as a variable before the loop is run.

Just trying to interpret your missing words.

来源：https://stackoverflow.com/questions/315306/is-if-expensive

标签

if-statement

language-agnostic

branch-prediction