How to prevent the arrowhead anti-pattern

风流意气都作罢 提交于 2019-11-27 19:35:20
Gerrie Schenck

I'd go for the multiple return statements. This makes the code easy to read and understand.

Don't use goto for obvious reasons.

Don't use exceptions because the check you are doing isn't exceptional, it's something you can expect so you should just take that into account. Programming against exceptions is also an anti-pattern.

Consider inverting the null checks to:

var foo = getfoo();
if (foo == null)
{
    return;
}
var bar = getbar(foo);
if (bar == null)
{
    return;
}
...etc

You can chain expressions. An assignment returns the assigned value, so you can check its result. Also, you can use the assigned variable in the next expression.

As soon as an expression returns false, the others are not longer executed, because the entire expression would return false already (because of the and operation).

So, something like this should work:

Foo foo; Bar bar; Moo moo; Cow cow;

if( (foo = getfoo()) != null &&
    (bar = getbar(foo)) != null &&
    (moo = getmoo(bar)) != null &&
    (cow = getcow(moo)) != null )
{
  ..
}

This is one of few scenarios where it is perfectly acceptable (if not desirable) to use goto.

In functions like this, there will often be resources that are allocated or state changes that are made mid-way which need to be undone before the function exits.

The usual problem with return-based solutions (e.g., rexcfnghk's and Gerrie Schenck's) is that you need to remember to undo those state changes before every return. This leads to code duplication and opens the door to subtle errors, especially in larger functions. Do not do this.


CERT actually recommends a structural approach based on goto.

In particular, note their example code taken from copy_process in kernel/fork.c of the Linux kernel. A simplified version of the concept is as follows:

    if (!modify_state1(true))
        goto cleanup_none;
    if (!modify_state2(true))
        goto cleanup_state1;
    if (!modify_state3(true))
        goto cleanup_state2;

    // ...

cleanup_state3:
    modify_state3(false);
cleanup_state2:
    modify_state2(false);
cleanup_state1:
    modify_state1(false);
cleanup_none:
    return;

Essentially, this is just a more readable version of the "arrowhead" code that doesn't use unnecessary indentation or duplicate code. This concept can easily be extended to whatever best suits your situation.


As a final note, especially regarding CERT's first compliant example, I just want to add that, whenever possible, it is simpler to design your code so that the cleanup can be handled all at once. That way, you can write code like this:

    FILE *f1 = null;
    FILE *f2 = null;
    void *mem = null;

    if ((f1 = fopen(FILE1, "r")) == null)
        goto cleanup;
    if ((f2 = fopen(FILE2, "r")) == null)
        goto cleanup;
    if ((mem = malloc(OBJSIZE)) == null)
        goto cleanup;

    // ...

cleanup:
    free(mem); // These functions gracefully exit given null input
    close(f2);
    close(f1);
    return;

First your suggestion (return after each if clause) is quite a good way out:

  // Contract (first check all the input)
  var foo = getfoo();

  if (Object.ReferenceEquals(null, foo))
    return; // <- Or throw exception, put assert etc.

  var bar = getbar(foo);

  if (Object.ReferenceEquals(null, bar))
    return; // <- Or throw exception, put assert etc.

  var moo = getmoo(bar);

  if (Object.ReferenceEquals(null, moo))
    return; // <- Or throw exception, put assert etc.

  // Routine: all instances (foo, bar, moo) are correct (not null) and we can work with them
  ...

The second possibility (in your very case) is to slightly modify your getbar() as well as getmoo() functions such that they return null on null input, so you'll have

  var foo = getfoo();
  var bar = getbar(foo); // return null if foo is null
  var moo = getmoo(bar); // return null if bar is null

  if ((foo == null) || (bar == null) || (moo == null))
    return; // <- Or throw exception, put assert(s) etc.    

  // Routine: all instances (foo, bar, moo) are correct (not null)
  ...

The third possibility is that in complicated cases you can use Null Object Desing Patteren

http://en.wikipedia.org/wiki/Null_Object_pattern

Do it old-school:

var foo;
var bar;
var moo;
var cow;
var failed = false;
failed = failed || (foo = getfoo()) == null;
failed = failed || (bar = getbar(foo)) == null;
failed = failed || (moo = getmoo(bar)) == null;
failed = failed || (cow = getcow(moo)) == null;

Much clearer - no arrow - and extendable forever.

Please do not go over to the Dark Side and use goto or return.

var foo =                        getFoo();
var bar = (foo == null) ? null : getBar(foo);
var moo = (bar == null) ? null : getMoo(bar);
var cow = (moo == null) ? null : getCow(moo);
if (cow != null) {
  ...
}
Jens Schauder

If you can change the stuff you are calling, you can change it to never ever return null, but a NULL-Object instead.

This would allow you to lose all the ifs completely.

An alternative is to use a "fake" single loop for controlling program flow. I can't say I'd recommend it, but it's definitely better looking and more readable than arrowhead.

Adding a "stage", "phase" or sth like that variable may simplify debugging and/or error handling.

int stage = 0;
do { // for break only, possibly with no indent

var foo = getfoo();
if(foo==null) break;

stage = 1;
var bar = getbar(foo);
if(bar==null) break;

stage = 2;
var moo = getmoo(bar);
if(moo==null) break;

stage = 3;
var cow = getcow(moo);

return 0; // end of non-erroreous program flow
}  while (0); // make sure to leave an appropriate comment about the "fake" while

// free resources if necessary
// leave an error message
ERR("error during stage %d", stage);

//return a proper error (based on stage?)
return ERROR;
try
{
  if (getcow(getmoo(getbar(getfoo()))) == null)
  {
    throw new NullPointerException();
  }
catch(NullPointerException ex)
{
  return; //or whatever you want to do when something is null
}

//... rest of the method

This keeps the main logic of the method uncluttered, and has just one exceptional return. Its disadvantages are that it can be slow if the get* methods are slow, and that it is difficult to tell in a debugger which method returned the null value.

Rex Kerr's answer is indeed very nice.
If you can change the code though,Jens Schauder's answer is probably better (Null Object pattern)

If you can make the example more specific you can probably get even more answers
For example ,depending on the "location" of the methods you can have something like:

namespace ConsoleApplication8
{
    using MyLibrary;
    using static MyLibrary.MyHelpers;

    class Foo { }
    class Bar { }
    class Moo { }
    class Cow { }


    internal class Program
    {
        private static void Main(string[] args)
        {
            var cow = getfoo()?.getbar()?.getmoo()?.getcow();
        }
    }
}

namespace MyLibrary
{
    using ConsoleApplication8;
    static class MyExtensions
    {
        public static Cow getcow(this Moo moo) => null;
        public static Moo getmoo(this Bar bar) => null;
        public static Bar getbar(this Foo foo) => null;
    }

    static class MyHelpers
    {
        public static Foo getfoo() => null;
    }
}

Strange, nobody mentioned method chaining.

If you create once a method chaining class

Public Class Chainer(Of R)

    Public ReadOnly Result As R

    Private Sub New(Result As R)
        Me.Result = Result
    End Sub

    Public Shared Function Create() As Chainer(Of R)
        Return New Chainer(Of R)(Nothing)
    End Function

    Public Function Chain(Of S)(Method As Func(Of S)) As Chainer(Of S)
        Return New Chainer(Of S)(Method())
    End Function

    Public Function Chain(Of S)(Method As Func(Of R, S)) As Chainer(Of S)
        Return New Chainer(Of S)(If(Result Is Nothing, Nothing, Method(Result)))
    End Function
End Class

You can use it everywhere to compose any number of functions into an execution sequence to produce a Result or a Nothing (Null)

Dim Cow = Chainer(Of Object).Create.
    Chain(Function() GetFoo()).
    Chain(Function(Foo) GetBar(Foo)).
    Chain(Function(Bar) GetMoo(Bar)).
    Chain(Function(Moo) GetCow(Moo)).
    Result

This is the one case where I'd use goto.

Your example might not be quite enough to push me over the edge, and multiple returns are better if your method is simple enough. But this pattern can get rather extensive, and you often need some cleanup code at the end. While using most of the other answers here if I can, often the only legible solution is to use goto's.

(When you do, be sure to put all references to the label inside one block so anyone looking at the code knows both the goto's and the the variables are confined to that part of the code.)

In Javascript and Java you can do this:

bigIf:  {

    if (!something)      break bigIf;
    if (!somethingelse)  break bigIf;
    if (!otherthing)     break bigIf;

    // Conditionally do something...
}
// Always do something else...
return;

Javascript and Java have no goto's, which leads me to believe other people have noticed that in this situation you do need them.

An exception would work for me too, except for the try/catch mess you force on the calling code. Also, C# puts in a stack trace on the throw, which will slow your code way down, particularly if it usually kicks out on the first check.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!