Why does this LLVM IR code produce unexpected results?

Question

I'm getting really frustrated, since this problem has been bugging me for days, so I'd appreciate any help.

I'm making my own programming language and am trying to implement enums and match statements, which match a value against an enum case and run the corresponding statement. However, I'm getting unexpected results and segfaults here and there.

Here's one piece of code in my language that runs (with lli) but sometimes produces unexpected results (it prints 1 instead of 3 for some reason):

class Node {
    fld value: int;
    fld next: OptionalNode;

    new(_value: int, _next: OptionalNode) {
        value = _value;
        next = _next;
    }
}

enum OptionalNode {
    val nil;
    val some(Node);
}

fun main(): int {
    var s: OptionalNode = OptionalNode.some(new Node(3, OptionalNode.nil));

    match s {
        OptionalNode.some(n) => print n.value;
    }

    var r: int = 0;

    ret r;
}

This is the corresponding LLVM IR that my compiler generates:

; ModuleID = 'test.bc'
source_filename = "test"

%test.Node = type { i32, %test.OptionalNode }
%test.OptionalNode = type { i8, [8 x i8] }
%test.OptionalNode.nil = type { i8 }
%test.OptionalNode.some = type { i8, %test.Node* }

@str = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1

declare i32 @printf(i8*, ...)

define void @"test.Node.!ctor$[test.Node]i[test.OptionalNode]"(%test.Node* %this, i32 %_value, %test.OptionalNode %_next) {
entry:
  %arg0 = alloca %test.Node*, align 8
  store %test.Node* %this, %test.Node** %arg0
  %arg1 = alloca i32, align 4
  store i32 %_value, i32* %arg1
  %arg2 = alloca %test.OptionalNode, align 16
  store %test.OptionalNode %_next, %test.OptionalNode* %arg2
  %ldarg1 = load i32, i32* %arg1
  %tmpld_cls = load %test.Node*, %test.Node** %arg0
  %tmpfld = getelementptr inbounds %test.Node, %test.Node* %tmpld_cls, i32 0, i32 0
  store i32 %ldarg1, i32* %tmpfld
  %ldarg2 = load %test.OptionalNode, %test.OptionalNode* %arg2
  %tmpld_cls1 = load %test.Node*, %test.Node** %arg0
  %tmpfld2 = getelementptr inbounds %test.Node, %test.Node* %tmpld_cls1, i32 0, i32 1
  store %test.OptionalNode %ldarg2, %test.OptionalNode* %tmpfld2
  ret void
}

define i32 @"test.main$v"() {
entry:
  %s = alloca %test.OptionalNode, align 16
  %enm = alloca %test.OptionalNode
  %0 = bitcast %test.OptionalNode* %enm to %test.OptionalNode.nil*
  %1 = getelementptr inbounds %test.OptionalNode.nil, %test.OptionalNode.nil* %0, i32 0, i32 0
  store i8 0, i8* %1
  %2 = load %test.OptionalNode, %test.OptionalNode* %enm
  %tmpalloc = alloca %test.Node
  call void @"test.Node.!ctor$[test.Node]i[test.OptionalNode]"(%test.Node* %tmpalloc, i32 3, %test.OptionalNode %2)
  %enm1 = alloca %test.OptionalNode
  %3 = bitcast %test.OptionalNode* %enm1 to %test.OptionalNode.some*
  %4 = getelementptr inbounds %test.OptionalNode.some, %test.OptionalNode.some* %3, i32 0, i32 0
  store i8 1, i8* %4
  %5 = getelementptr inbounds %test.OptionalNode.some, %test.OptionalNode.some* %3, i32 0, i32 1
  store %test.Node* %tmpalloc, %test.Node** %5
  %6 = load %test.OptionalNode, %test.OptionalNode* %enm1
  store %test.OptionalNode %6, %test.OptionalNode* %s
  %7 = getelementptr inbounds %test.OptionalNode, %test.OptionalNode* %s, i32 0, i32 0
  %8 = load i8, i8* %7
  switch i8 %8, label %match_end [
    i8 1, label %case1
  ]

case1:                                            ; preds = %entry
  %n = alloca %test.Node*, align 8
  %9 = bitcast %test.OptionalNode* %s to %test.OptionalNode.some*
  %10 = getelementptr inbounds %test.OptionalNode.some, %test.OptionalNode.some* %9, i32 0, i32 1
  %11 = load %test.Node*, %test.Node** %10
  store %test.Node* %11, %test.Node** %n
  %tmpld_cls = load %test.Node*, %test.Node** %n
  %tmpgetfldgep = getelementptr inbounds %test.Node, %test.Node* %tmpld_cls, i32 0, i32 0
  %tmpgetfldld = load i32, i32* %tmpgetfldgep
  %print_i = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @str, i32 0, i32 0), i32 %tmpgetfldld)
  br label %match_end

match_end:                                        ; preds = %case1, %entry
  %r = alloca i32, align 4
  store i32 0, i32* %r
  %tmploadlocal = load i32, i32* %r
  ret i32 %tmploadlocal
}

define i32 @main() {
entry:
  %call = tail call i32 @"test.main$v"()
  ret i32 %call
}

Now, as I said, this compiles and runs to completion, but for some reason it sometimes prints 1 instead of 3, which I don't understand at all. I have no idea how to debug LLVM IR code, and applying the debugify pass with opt produces wrong source lines (all with varying offsets), which also makes NO SENSE. (I'm using LLVM 8, by the way, but LLVM 6.0.1, which I was using before, showed the same results.)

Then, if I move the definition of the r variable up before the match statement, I suddenly get a segfault whose position I cannot pinpoint because of the offset IR source lines that I mentioned before.

Here's the corresponding code and IR for that:

class Node {
    fld value: int;
    fld next: OptionalNode;

    new(_value: int, _next: OptionalNode) {
        value = _value;
        next = _next;
    }
}

enum OptionalNode {
    val nil;
    val some(Node);
}

fun main(): int {
    var s: OptionalNode = OptionalNode.some(new Node(3, OptionalNode.nil));

    var r: int = 0;

    match s {
        OptionalNode.some(n) => print n.value;
    }

    ret r;
}

; ModuleID = 'test.bc'
source_filename = "test"

%test.Node = type { i32, %test.OptionalNode }
%test.OptionalNode = type { i8, [8 x i8] }
%test.OptionalNode.nil = type { i8 }
%test.OptionalNode.some = type { i8, %test.Node* }

@str = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1

declare i32 @printf(i8*, ...)

define void @"test.Node.!ctor$[test.Node]i[test.OptionalNode]"(%test.Node* %this, i32 %_value, %test.OptionalNode %_next) {
entry:
  %arg0 = alloca %test.Node*, align 8
  store %test.Node* %this, %test.Node** %arg0
  %arg1 = alloca i32, align 4
  store i32 %_value, i32* %arg1
  %arg2 = alloca %test.OptionalNode, align 16
  store %test.OptionalNode %_next, %test.OptionalNode* %arg2
  %ldarg1 = load i32, i32* %arg1
  %tmpld_cls = load %test.Node*, %test.Node** %arg0
  %tmpfld = getelementptr inbounds %test.Node, %test.Node* %tmpld_cls, i32 0, i32 0
  store i32 %ldarg1, i32* %tmpfld
  %ldarg2 = load %test.OptionalNode, %test.OptionalNode* %arg2
  %tmpld_cls1 = load %test.Node*, %test.Node** %arg0
  %tmpfld2 = getelementptr inbounds %test.Node, %test.Node* %tmpld_cls1, i32 0, i32 1
  store %test.OptionalNode %ldarg2, %test.OptionalNode* %tmpfld2
  ret void
}

define i32 @"test.main$v"() {
entry:
  %s = alloca %test.OptionalNode, align 16
  %enm = alloca %test.OptionalNode
  %0 = bitcast %test.OptionalNode* %enm to %test.OptionalNode.nil*
  %1 = getelementptr inbounds %test.OptionalNode.nil, %test.OptionalNode.nil* %0, i32 0, i32 0
  store i8 0, i8* %1
  %2 = load %test.OptionalNode, %test.OptionalNode* %enm
  %tmpalloc = alloca %test.Node
  call void @"test.Node.!ctor$[test.Node]i[test.OptionalNode]"(%test.Node* %tmpalloc, i32 3, %test.OptionalNode %2)
  %enm1 = alloca %test.OptionalNode
  %3 = bitcast %test.OptionalNode* %enm1 to %test.OptionalNode.some*
  %4 = getelementptr inbounds %test.OptionalNode.some, %test.OptionalNode.some* %3, i32 0, i32 0
  store i8 1, i8* %4
  %5 = getelementptr inbounds %test.OptionalNode.some, %test.OptionalNode.some* %3, i32 0, i32 1
  store %test.Node* %tmpalloc, %test.Node** %5
  %6 = load %test.OptionalNode, %test.OptionalNode* %enm1
  store %test.OptionalNode %6, %test.OptionalNode* %s
  %r = alloca i32, align 4
  store i32 0, i32* %r
  %7 = getelementptr inbounds %test.OptionalNode, %test.OptionalNode* %s, i32 0, i32 0
  %8 = load i8, i8* %7
  switch i8 %8, label %match_end [
    i8 1, label %case1
  ]

case1:                                            ; preds = %entry
  %n = alloca %test.Node*, align 8
  %9 = bitcast %test.OptionalNode* %s to %test.OptionalNode.some*
  %10 = getelementptr inbounds %test.OptionalNode.some, %test.OptionalNode.some* %9, i32 0, i32 1
  %11 = load %test.Node*, %test.Node** %10
  store %test.Node* %11, %test.Node** %n
  %tmpld_cls = load %test.Node*, %test.Node** %n
  %tmpgetfldgep = getelementptr inbounds %test.Node, %test.Node* %tmpld_cls, i32 0, i32 0
  %tmpgetfldld = load i32, i32* %tmpgetfldgep
  %print_i = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @str, i32 0, i32 0), i32 %tmpgetfldld)
  br label %match_end

match_end:                                        ; preds = %case1, %entry
  %tmploadlocal = load i32, i32* %r
  ret i32 %tmploadlocal
}

define i32 @main() {
entry:
  %call = tail call i32 @"test.main$v"()
  ret i32 %call
}

I know that these kinds of questions are really bad and I'm probably breaking some rules by just throwing my code in here, but if someone would sacrifice some of their time to help a really frustrated guy who is close to giving up, I'd be really grateful.

Answer 1:

It certainly looks tricky. I think I have the answer to your question.

The segfault occurs at the load that produces %tmpgetfldld, the value you then print. If you compile with clang and execute the result, you will not get a segfault. That is not to say that it is lli's fault, because even this way you will get wrong output: you are accessing invalid memory. Let me explain how this happens.

%tmpgetfldld (which is invalid) is an i32 that is loaded through the pointer stored in %n, a few lines earlier:

%tmpld_cls = load %test.Node*, %test.Node** %n

If the value of %tmpgetfldld is invalid, then %11, which was stored to %n, is invalid. The reason is this instruction:

%9 = bitcast %test.OptionalNode* %s to %test.OptionalNode.some*

At the beginning of your program you allocate %s with the size of a %test.OptionalNode object, which is 9 bytes (1 byte for the tag and another 8 bytes for the array). Then you assign to register %9 the bitcast of %s to type %test.OptionalNode.some*. That type is larger: on a typical 64-bit target the %test.Node* pointer is 8 bytes and must be 8-byte aligned, so it sits at offset 8 after 7 bytes of padding, for a total of 1 + 7 + 8 = 16 bytes. At this point nothing is wrong yet: %9 points to the same address as %s, and you merely treat it as a %test.OptionalNode.some. However, only 9 bytes were allocated at that address, and through 'getelementptr' you now have access to 16 bytes. Reading past the 9th byte is invalid and may return garbage or cause a segfault. The same applies to the copy into %s: loading and storing a %test.OptionalNode moves only 9 bytes, so most of the pointer written through %5 never reaches %s. Indeed, through these instructions:

%10 = getelementptr inbounds %test.OptionalNode.some, %test.OptionalNode.some* %9, i32 0, i32 1
%11 = load %test.Node*, %test.Node** %10

You get a pointer to the second field, which covers bytes 8 to 15 (counting from index 0), and load a %test.Node* from it. That value is then stored and loaded again, and the segfault only shows up when you dereference it, which happens at the load of %tmpgetfldld.
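The size mismatch can be checked outside LLVM. Here is a minimal sketch in Python using ctypes, which lays out structures according to the platform C ABI. It assumes that LLVM's layout for these non-packed structs matches that ABI, which should hold on common 64-bit targets. The class names OptionalNode and Some are illustrative stand-ins for %test.OptionalNode and %test.OptionalNode.some, and the pointer value is an arbitrary placeholder.

```python
import ctypes


class OptionalNode(ctypes.Structure):
    # Mirrors { i8, [8 x i8] }: every member has alignment 1, so there is no padding.
    _fields_ = [("tag", ctypes.c_uint8), ("payload", ctypes.c_uint8 * 8)]


class Some(ctypes.Structure):
    # Mirrors { i8, %test.Node* }: the pointer is placed at its own alignment.
    _fields_ = [("tag", ctypes.c_uint8), ("node", ctypes.c_void_p)]


generic_size = ctypes.sizeof(OptionalNode)
some_size = ctypes.sizeof(Some)
node_offset = Some.node.offset

print("sizeof(OptionalNode) =", generic_size)
print("sizeof(Some)         =", some_size)
print("offset of Some.node  =", node_offset)

# Simulate "load %test.OptionalNode" followed by "store" into %s:
# only generic_size bytes are moved, the rest of the destination keeps old data.
src = Some(tag=1, node=0x12345678)  # placeholder pointer value
raw = ctypes.string_at(ctypes.addressof(src), some_size)
dst = bytearray(b"\xAA" * some_size)  # 0xAA stands in for stale stack contents
dst[:generic_size] = raw[:generic_size]
copied = Some.from_buffer_copy(bytes(dst))

print("pointer survived the copy:", copied.node == src.node)
```

On a typical 64-bit machine this should report 9, 16 and 8, and the pointer should not survive the copy, which matches the garbage value that ends up in %n.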

To solve the segfault, the generic %test.OptionalNode type has to be at least as large as the largest struct you bitcast it to, so that every alloca, load and store of the generic type covers the whole payload. One way to get there is to append a pointer member of the kind that appears in the IR Clang emits for classes with virtual functions, where subclasses of different sizes still have to be bitcast to the parent class. So if you change your %test.OptionalNode struct declaration to this, you solve the segfault:

%test.OptionalNode = type { i8, [8 x i8], i8(...)** }

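The last type is a pointer to a pointer to a variadic function returning i8. What matters here is not the function signature but that the member is pointer sized and pointer aligned, which enlarges the struct. Check also here: LLVM what does i32 (...)** mean in type define?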
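As far as layout is concerned, what the extra member contributes is its size and alignment. A rough sketch of that effect, again modelling the structs with Python's ctypes under the assumption that LLVM follows the platform C ABI here; Before, After and layout are illustrative names, and c_void_p stands in for any pointer type, including i8(...)**:

```python
import ctypes


class Before(ctypes.Structure):
    # Mirrors { i8, [8 x i8] }.
    _fields_ = [("tag", ctypes.c_uint8), ("payload", ctypes.c_uint8 * 8)]


class After(ctypes.Structure):
    # Mirrors { i8, [8 x i8], i8(...)** }; all pointers share one size and alignment.
    _fields_ = [
        ("tag", ctypes.c_uint8),
        ("payload", ctypes.c_uint8 * 8),
        ("extra", ctypes.c_void_p),
    ]


def layout(struct_type):
    """Return (size, alignment) of a ctypes structure type."""
    return ctypes.sizeof(struct_type), ctypes.alignment(struct_type)


before = layout(Before)
after = layout(After)
extra_offset = After.extra.offset

print("before: size=%d align=%d" % before)
print("after:  size=%d align=%d" % after)
print("offset of the extra member =", extra_offset)
```

The struct grows and takes on pointer alignment, so allocas and copies of the generic type now span more bytes than before.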

If you make this change, you get rid of the segfault, but you will notice that you haven't fully solved your problem. Sometimes you might get 3 as the output and sometimes something else, which is undefined behavior. This is because, even though the i8(...)** member accounts for the extra bytes of the bitcast struct type, the members that share memory in the two struct types still do not line up. They diverge right after the first byte: in %test.OptionalNode the i8 array occupies bytes 1 to 8, whereas in %test.OptionalNode.some the first byte is followed by padding and the %test.Node* pointer occupies bytes 8 to 15 (on a typical 64-bit target). Most of the pointer therefore lands in what the generic type treats as padding, and LLVM does not guarantee that padding bytes are copied when an aggregate is loaded and stored. To solve this you have to change your struct definitions to either this:

%test.OptionalNode = type { i8, [8 x i8], i8(...)** }
%test.OptionalNode.some = type { i8, [8 x i8], %test.Node* }

Or that:

%test.OptionalNode = type { i8, i8(...)** }
%test.OptionalNode.some = type { i8, %test.Node* }

Which one you choose depends on whether your design needs that [8 x i8] array. Now your output is consistently 3 and your problem is gone. I think this solution covers your previous question as well (How to fix segmentation fault in genrated llvm bytecode?).
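To see why matching layouts help, here is a small sketch of the second variant, modelled with Python's ctypes (assuming, as a simplification, that LLVM follows the platform C ABI for these structs). Node is reduced to its value field, and the names OptionalNode, Some, slot and matched are illustrative only.

```python
import ctypes


class Node(ctypes.Structure):
    # Simplified stand-in for %test.Node; the `next` field is left out.
    _fields_ = [("value", ctypes.c_int32)]


class OptionalNode(ctypes.Structure):
    # Mirrors { i8, i8(...)** }: a tag plus one opaque pointer-sized slot.
    _fields_ = [("tag", ctypes.c_uint8), ("slot", ctypes.c_void_p)]


class Some(ctypes.Structure):
    # Mirrors { i8, %test.Node* }: same shape, but with a typed pointer.
    _fields_ = [("tag", ctypes.c_uint8), ("node", ctypes.POINTER(Node))]


node = Node(3)
enm = Some(tag=1, node=ctypes.pointer(node))

# Copy through the generic type, as the load/store pair into %s does.
s = OptionalNode()
ctypes.memmove(ctypes.addressof(s), ctypes.addressof(enm), ctypes.sizeof(OptionalNode))

# The match statement: test the tag, then view the same bytes as the `some` case.
matched = None
if s.tag == 1:
    view = ctypes.cast(ctypes.pointer(s), ctypes.POINTER(Some)).contents
    matched = view.node.contents.value

print(matched)
```

Because both types have the same size and keep the pointer at the same offset, the copy through the generic type carries the whole pointer, and the value read back is the 3 that was stored in the node.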

Sorry for the long answer. I hope it helps.

Source: https://stackoverflow.com/questions/55328463/why-does-this-llvm-ir-code-produce-unexpected-results

Tags

compiler-construction

llvm

llvm-ir