Is it a good idea to use varargs in a C API to set key value pairs?

前端 未结 7 2633
情书的邮戳
情书的邮戳 2021-02-20 11:47

I am writing an API that updates a LOT of different fields in a structure.

I could help the addition of future fields by making the update function variadic:

<         


        
相关标签:
7条回答
  • 2021-02-20 12:23

    I'd take a long hard look at any "update" functionality designed to be used externally (or even internally) which uses the same function to update many different fields in a structure. Is there a specific reason why you can't have discrete functionality for updating the fields?

    0 讨论(0)
  • 2021-02-20 12:23

    The reasons given thus far for avoiding varargs are all good. Let me add another one that was not given yet, as it is less important, but can be encountered. The vararg mandates that the parameter be passed on the stack, thus slowing the function call. On some architecture the difference can be significative. On x86 it's not very important because of its lack of register, on SPARC, for example, it can be important. Up to 5 parameters are passed on registers and if your function uses few locals, no stack adjustment is made. If your function is a leaf function (i.e. doesn't call another function), there is no window adjustment also. The cost of the call is therefor very small. With a vararg, the normal sequence of passing parameters on the stack, stack adjustement and window management is made or your function wouldn't be able to get at the parameters. This increases the cost of the call significantly.

    0 讨论(0)
  • 2021-02-20 12:25

    Generally, no.

    Varargs throws out a lot of type-safety - you could pass pointers, floats, etc., instead of ints and it will compile without issue. Misuse of varargs, such as omitting arguments, can introduce odd crashes due to stack corruption, or reading invalid pointers.

    For instance, the following call will compile and result in crashes or other odd behavior:

    UpdateField(6, "Field1", 7, "Field2", "Foo");
    

    The initial 6 is how many parameters to expect. It will convert the string pointer "Foo" to an int to put in Field2, and it will try to read and interpret two other parameters that aren't present, which will probably cause a crash here from dereferencing stack noise.

    I believe the implementation of varargs in C is a mistake (given today's environment - it probably made perfect sense in 1972.) The implementation is you pass a bunch of values on the stack and then the callee will walk the stack picking up parameters, based on its interpretation of some initial control parameter. This type of implementation basically screams for you to make a mistake in what might be a very difficult to diagnose way. C#'s implementation of this, passing an array of objects with an attribute on the method, is just must saner, albeit not directly mappable into the C language.

    0 讨论(0)
  • 2021-02-20 12:28

    One problem with varargs in C is that you don't know how many arguments are passed, so you need that as another parameter:

    update(2, FIELD_NAME1, 10, FIELD_NAME2, 20);
    
    update(3, FIELD_NAME1, 10, FIELD_NAME2, 20, FIELD_NAME3, 30);
    
    0 讨论(0)
  • 2021-02-20 12:29

    Why not have one arg, an array. Even better, a pointer to an array.

    struct field {
      int val;
      char* name;
    };
    

    or even...

    union datatype {
      int a;
      char b;
      double c;
      float f;
    // etc;
    }; 
    

    then

    struct field {
      datatype val;
      char* name;
    };
    
    union (struct* field_name_val_pairs, int len);
    

    ok 2 args. I lied, and thought a length param would be good.

    0 讨论(0)
  • I tend to avoid varargs except in one specific circumstance where it's very useful. Variable arguments don't really deliver all that much benefit over and above what can be done by individual function calls, especially in your case.

    In terms of readability (and that's usually what I prefer over raw speed except for very specific cases), there's no real difference between the following two options (I've added a count to the varargs versions since you need either a count or sentinel to detect the end of the data):

    update(2, FIELD_NAME1, 10, FIELD_NAME2, 20);
    update(3, FIELD_NAME3, 10, FIELD_NAME4, 20, FIELD_NAME5, 30);
    /* ========== */
    update(FIELD_NAME1, 10);
    update(FIELD_NAME2, 20);
    update(FIELD_NAME3, 10);
    update(FIELD_NAME4, 20);
    update(FIELD_NAME5, 30);
    

    In fact, as the varargs version gets longer, you'll need to split it up anyway, for formatting:

    update(5,
        FIELD_NAME1, 10,
        FIELD_NAME2, 20,
        FIELD_NAME3, 10,
        FIELD_NAME4, 20,
        FIELD_NAME5, 30);
    

    Doing it the "one call per field name" way results in simpler code in the function itself, and does not degrade the readability of the calls. In addition, it allows the compiler to properly detect certain errors that it cannot do for varargs, such as incorrect types or a mismatch between the user-supplied count and the actual count.

    If you really must be able to call a single function to do it, I'd opt for:

    void update (char *k1, int v1) {
        ...
    }
    void update2 (char *k1, int v1, char *k2, int v2) {
        update (k1, v1);
        update (k2, v2);
    }
    void update3 (char *k1, int v1, char *k2, int v2, char *k3, int v3) {
        update (k1, v1); /* or these two could be a single */
        update (k2, v2); /*   update2 (k1, v1, k2, v2);    */
        update (k3, v3);
    }
    /* and so on. */
    

    You could even do the higher-level functions as macros if you prefer, without losing type safety.

    The only place I tend to use varargs functions is when providing the same functionality as printf() - for example, I've occasionally had to write logging libraries with functions such as logPrintf() providing the same functionality. I can't think of any other time in my long (and I mean, long :-) time at the coalface that I've needed to use it.

    As an aside, if you do decide to use varargs, I tend to go for sentinels rather than counts since this prevents mismatches when adding fields. You could easily forget to adjust the count and end up with:

    update (2, k1, v1, k2, v2, k3, v3);
    

    when adding, which is insidious because it silently skips k3/v3, or:

    update (3, k1, v1, k2, v2);
    

    when deleting, which is almost certainly fatal to the successful running of your program.

    Having a sentinel prevents this (as long as you don't forget the sentinel, of course):

    update (k1, v1, k2, v2, k3, v3, NULL);
    
    0 讨论(0)
提交回复
热议问题