After reading this article on thedailywtf.com, I\'m not sure that I really got the joke.
It says there that some guy changed the code from
int function()
I can't get different behavior out of this. I tried it with LLVM: I had to add just a little bit of cruft at the return value so that LLVM doesn't optimize anything away, but the generated code for wtf
and wtf2
are totally identical. This wtf is BAAAAAD
Input
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int wtf(int X) {
int x;
char data_string[15];
x = 2;
strcpy(data_string,"data data data");
return 5*X+x+ data_string[X];
}
int wtf2(int X) {
int x = 2;
char data_string[15]="data data data";
return 5*X+x+ data_string[X];
}
int main(int argc, char **argv) {
printf("%d\n", wtf(atoi(argv[1]))+wtf2(atoi(argv[1])));
}
Output:
; ModuleID = '/tmp/webcompile/_3856_0.bc'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i386-pc-linux-gnu"
@.str = internal constant [15 x i8] c"data data data\00" ; <[15 x i8]*> [#uses=3]
@.str1 = internal constant [4 x i8] c"%d\0A\00" ; <[4 x i8]*> [#uses=1]
define i32 @wtf(i32 %X) nounwind readnone {
entry:
%0 = mul i32 %X, 5 ; <i32> [#uses=1]
%1 = getelementptr [15 x i8]* @.str, i32 0, i32 %X ; <i8*> [#uses=1]
%2 = load i8* %1, align 1 ; <i8> [#uses=1]
%3 = sext i8 %2 to i32 ; <i32> [#uses=1]
%4 = add i32 %0, 2 ; <i32> [#uses=1]
%5 = add i32 %4, %3 ; <i32> [#uses=1]
ret i32 %5
}
define i32 @wtf2(i32 %X) nounwind readnone {
entry:
%0 = mul i32 %X, 5 ; <i32> [#uses=1]
%1 = getelementptr [15 x i8]* @.str, i32 0, i32 %X ; <i8*> [#uses=1]
%2 = load i8* %1, align 1 ; <i8> [#uses=1]
%3 = sext i8 %2 to i32 ; <i32> [#uses=1]
%4 = add i32 %0, 2 ; <i32> [#uses=1]
%5 = add i32 %4, %3 ; <i32> [#uses=1]
ret i32 %5
}
define i32 @main(i32 %argc, i8** nocapture %argv) nounwind {
entry:
%0 = getelementptr i8** %argv, i32 1 ; <i8**> [#uses=1]
%1 = load i8** %0, align 4 ; <i8*> [#uses=1]
%2 = tail call i32 @atoi(i8* %1) nounwind readonly ; <i32> [#uses=2]
%3 = getelementptr [15 x i8]* @.str, i32 0, i32 %2 ; <i8*> [#uses=1]
%4 = load i8* %3, align 1 ; <i8> [#uses=1]
%5 = sext i8 %4 to i32 ; <i32> [#uses=1]
%tmp2 = mul i32 %2, 10 ; <i32> [#uses=1]
%6 = shl i32 %5, 1 ; <i32> [#uses=1]
%7 = add i32 %6, 4 ; <i32> [#uses=1]
%8 = add i32 %7, %tmp2 ; <i32> [#uses=1]
%9 = tail call i32 (i8*, ...)* @printf(i8* noalias getelementptr ([4 x i8]* @.str1, i32 0, i32 0), i32 %8) nounwind ; <i32> [#uses=0]
ret i32 undef
}
declare i32 @atoi(i8*) nounwind readonly
declare i32 @printf(i8*, ...) nounwind
Um, re read the article :)
The real WTF was that he touched the entire solution with these kinds of changes when he was supposed to fix a memory leak.
Also doing a change like this wouldn't matter much except potentially breaking/introducing bugs in other, maybe more complicated files, than the example one.
Depending on the compiler and compiler options, initialization like this
char data_string[15] = "data data data";
results in a lot of move instructions to copy the literal data to stack.
Calling strcpy
requires less instructions.
Doing this kind of thing all over a large codebase can increase the binary size significantly.
And of course, he was not spending his time on adding any value.
At first glance the two forms are equivalent. The second one does look nicer but they do the same thing.
But then I read the cited page.
The problem is that the new guy churned the source tree, lots of it. It's bad form to troll through a giant source tree and make a meaningless change. Sure, perhaps one style is slightly better than another, but in practice, it should be a whole lot better before putting 1,000 deltas into a source code control system for people to wade through for eternity is justified.
I suspect that this was a source release, or some other unmentioned complexity caused the editing of that many files to expand their distribution. The contributions to that site are edited quite a bit, but basically the issue is understandable without specifics.
One of the problems with editing a zillion files for a style change is that the chance of an inadvertent error increases. This chance is greatly multiplied when a junior developer does it. Even for the experienced, there is Murphy's law to consider. If it happens right before a release it really is a hanging offense.
Yes, the second code is cleaner, but depending on the compiler it can lead to emitting more machine code. This is entirely compiler-dependent, but the point of the WTF article is that in the second case the compiler would allocate a copy of string/integer value for each code snippet like that and in the first case it would do so only once per program.
2nd code is indeed "cleaner", but with a project of the size that the article is about, it is ridiculous to think refactoring like that is at best useless, at worst error prone.
However this kind of refactoring doesn't inflate an .Exe size form 1 to 2 cds