Anonymous functions return dynamically allocated values

北恋 2021-01-29 06:12

The question is based on a design pattern solution easily doable in other languages but difficult to implement in C. The narrowed down code is below.

Building on this an

4 answers
  • 2021-01-29 06:13

    (Quoting your accepted answer to yourself)

    Secondly, a pointer to a parent struct can't receive a pointer to its derived type (embedded parent struct), so I can't do much there. I tried using void *, but perhaps a solution might exist using the memory address and then accessing some member of the struct without casting to specific types. I'll ask that in another question.

    This is yet another sign that one should learn the basics first. What you are missing is called a 'forward declaration':

    struct chicken; // here we tell the compiler that 'struct chicken' is a thing
    struct egg{
      struct chicken *laidby; // while the compiler knows no details about 'struct chicken',
                              // its existence is enough to have pointers for it
    };
    struct chicken{           // and later it has to be defined properly
      struct egg *myeggs;
    };
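
A minimal usage sketch of the forward declaration above. The helper `lay_egg()` is hypothetical and not part of the original answer; it only shows that both structures can point at each other once both are fully defined.

```c
#include <stddef.h>

struct chicken;                 /* forward declaration: only the name is known */

struct egg {
    struct chicken *laidby;     /* a pointer to an incomplete type is allowed */
};

struct chicken {                /* the full definition follows later */
    struct egg *myeggs;
};

/* Hypothetical helper (an assumption for illustration):
   link one egg and one chicken to each other. */
void lay_egg(struct chicken *c, struct egg *e)
{
    e->laidby = c;
    c->myeggs = e;
}
```

Only pointers to `struct chicken` may be used before its definition; accessing its members or taking `sizeof` requires the complete type.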
    

    What I'm missing is the ability to call the super method from the overridden run method in some way?

    These are not methods and there is no override. In your code no OOP happens; C is a procedural programming language. While there are OOP extensions for C, you really should not go for them without knowing C basics.

  • 2021-01-29 06:15

    The correct order is:

    1. learn C
    2. do magic

    It just will not work the other way around. ({}) does not bend the semantics for you. If your add expects a function which returns struct Super*, it will not work with struct Sub, not even if you put the missing * there.

    This just works on TutorialsPoint:

    #include <stdio.h>
    #include <stdlib.h>
    
    int max(int a,int b){
        if(a>b)
            return a;
        return b;
    }
    
    struct Super{};
    
    void add(struct Super *(*superRef)()) {
        struct Super *(*secretStorage)()=superRef;
        /* ... */
        struct Super *super = secretStorage();
        /* ... */
        free(super);
        printf("Stillalive\n");
    }
    
    int main()
    {
        printf("Hello, World!\n");
    
        int (*myMax)(int,int); // <-- that is a function pointer
    
        myMax=max;             // <-- set with oldschool function
        printf("%d\n",myMax(1,2));
    
        myMax = ({             // <-- set with fancy magic
            int __fn__ (int x, int y) { return x < y ? x : y; }
            __fn__;
        });    
        printf("%d - intentionally wrong\n",myMax(1,2));
    
        add(
            ({
                struct Super* fn(){
                    printf("Iamhere\n");
                    return malloc(sizeof(struct Super));
                }
                fn;}));
        printf("Byfornow\n");
        return 0;
    }
    

    Created a small library project with anonymous magic embedded in anonymous magic and heap allocation. It does not make much sense, but it works:

    testlib.h

    #ifndef TESTLIB_H_
    #define TESTLIB_H_
    
    struct Testruct{
        const char *message;
        void (*printmessage)(const char *message);
    };
    
    extern struct Testruct *(*nonsense())();
    
    #endif
    

    testlib.c

    #include "testlib.h"
    #include <stdio.h>
    #include <stdlib.h>
    
    const char *HELLO="Hello World\n";
    
    struct Testruct *(*nonsense())(){
        return ({
            struct Testruct *magic(){
                struct Testruct *retval=malloc(sizeof(struct Testruct));
                retval->message=HELLO;
                retval->printmessage=({
                    void magic(const char *message){
                        printf("%s", message); /* never pass data as the format string */
                    }
                    magic;
                });
                return retval;
            }
            magic;
        });
    }
    

    test.c

    #include "testlib.h"
    #include <stdio.h>
    #include <stdlib.h>
    
    int main(){
        struct Testruct *(*factory)()=nonsense();
        printf("Alive\n");
        struct Testruct *stuff=factory();
        printf("Alive\n");
        stuff->printmessage(stuff->message);
        printf("Alive\n");
        free(stuff);
        printf("Alive\n");
        return 0;
    }
    

    I followed the steps in https://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html for building and running it (practically 3 gcc calls: gcc -c -Wall -Werror -fpic testlib.c, gcc -shared -o libtestlib.so testlib.o, gcc -L. -Wall -o test test.c -ltestlib, and a bit of a fight with LD_LIBRARY_PATH).

  • 2021-01-29 06:24

    First, the community told me that anonymous functions are not part of C, so the suggested alternative is to use named functions and pointers to them.

    Secondly, a pointer to a parent struct can't receive a pointer to its derived type (embedded parent struct), so I can't do much there. I tried using void *, but perhaps a solution might exist using the memory address and then accessing some member of the struct without casting to specific types. I'll ask that in another question.

    What I'm missing is the ability to call the super method from the overridden run method in some way?

    src/super.h

    struct Super {
        void (*run)();
    };
    
    struct Super *newSuper();
    

    src/super.c

    #include <stdio.h>
    #include <stdlib.h>
    #include "super.h"
    
    static void run() {
        printf("Running super struct\n");
    }
    
    struct Super *newSuper() {
        struct Super *super = malloc(sizeof(struct Super));
        super->run = run;
        return super;
    }
    

    src/Runner.h

    struct Runner {
    
        void (*addFactoryMethod)(struct Super *(*ref)());
    
        void (*execute)();
    };
    
    struct Runner *newRunner();
    

    src/runner.c

    #include <stdlib.h>
    #include "super.h"
    #include "Runner.h"
    
    struct Super *(*superFactory)();
    
    void addFactoryMethod(struct Super *(*ref)()) {
        superFactory = ref;
    }
    
    static void execute() {
        struct Super *sup = superFactory(); // calling cached factory method
        sup->run();
    }
    
    struct Runner *newRunner() {
        struct Runner *runner = malloc(sizeof(struct Runner));
        runner->addFactoryMethod = addFactoryMethod;
        runner->execute = execute;
        return runner;
    }
    

    test/runner_test.c

    #include <stdio.h>
    #include <stdlib.h>
    #include "../src/super.h"
    #include "../src/Runner.h"
    
    void anotherRunMethod() {
        printf("polymorphism working\n");
        // how can I call the overridden super method from here?
    }
    
    struct Super *newAnotherSuper() {
        struct Super *super = malloc(sizeof(struct Super));
        super->run = anotherRunMethod;
        return super;
    }
    
    void testSuper() {
        struct Runner *runner = newRunner();
        runner->addFactoryMethod(&newAnotherSuper);
        runner->execute();
    }
    
    int main() {
        testSuper();
        return 0;
    }
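
One possible way to reach the parent implementation from an overriding function, sketched under assumptions that are not in the code above: `run()` receives the object as an argument, and a hypothetical `struct Sub` embeds `struct Super` as its first member and remembers the function pointer it replaced. The counter `super_runs` exists only so the behaviour can be checked.

```c
#include <stdio.h>
#include <stdlib.h>

struct Super {
    void (*run)(struct Super *self);   /* assumption: run gets its object */
};

int super_runs = 0;                    /* counts calls to the parent run */

static void super_run(struct Super *self)
{
    (void)self;
    super_runs++;
    printf("Running super struct\n");
}

/* Hypothetical derived type. The parent is the FIRST member, so a pointer
   to struct Sub can be converted to a pointer to struct Super and back. */
struct Sub {
    struct Super super;
    void (*parent_run)(struct Super *self);  /* the saved "super method" */
};

static void sub_run(struct Super *self)
{
    struct Sub *sub = (struct Sub *)self;    /* valid: super is first member */

    printf("polymorphism working\n");
    sub->parent_run(self);                   /* explicit call to the parent */
}

/* Factory with the same return type a Super factory would have. */
struct Super *newSub(void)
{
    struct Sub *sub = malloc(sizeof *sub);

    if (!sub)
        return NULL;
    sub->super.run  = super_run;        /* what the parent would install */
    sub->parent_run = sub->super.run;   /* remember it before overriding */
    sub->super.run  = sub_run;
    return &sub->super;
}
```

Because the factory returns `struct Super *`, it has the same pointer type as a factory for the parent, and the caller invokes it as `obj->run(obj)`.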
    
  • 2021-01-29 06:32

    The code shown in the question is not standard C, but the GNU C variant that GCC supports. Unfortunately, there does not seem to be a gnu-c tag, to correctly specify the variant of C involved.

    Furthermore, the use case seems to rely on shoehorning a specific type of object-oriented paradigm into a C library interface. This is horrible, because it involves assumptions and features C simply does not have. There is a reason why C (and GNU-C) and C++ and Objective-C are different programming languages.

    The simple answer to "functions returning dynamically allocated values", where the type of the value is opaque to the library, is to use void * for the values and void *(*)() for the function pointers. Note that in POSIX C, void * can also hold a function pointer.
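
A minimal sketch of that simple answer. All names here (`factory_fn`, `register_factory`, `make_object`, `struct point`, `new_point`) are hypothetical and chosen for illustration; the library side never learns the real type of what the factory returns.

```c
#include <stdlib.h>

/* Hypothetical library side: stores a factory without knowing
   what kind of object the factory produces. */
typedef void *(*factory_fn)(void);

static factory_fn stored_factory = NULL;

void register_factory(factory_fn fn)
{
    stored_factory = fn;
}

void *make_object(void)
{
    return stored_factory ? stored_factory() : NULL;
}

/* Hypothetical application side: only the application knows the type. */
struct point {
    int x, y;
};

void *new_point(void)
{
    struct point *p = malloc(sizeof *p);

    if (p) {
        p->x = 3;
        p->y = 4;
    }
    return p;   /* struct point * converts to void * implicitly */
}
```

The application converts the result back with a plain assignment, for example `struct point *p = make_object();`, and remains responsible for calling `free()`.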

    The more complex answer would describe how libraries like GObject support object-oriented paradigms in C.

    In practice, especially in POSIX C, using a type tag (usually int, but it can be any other type) and a union, one can implement polymorphic structures, based on a union of structures that all have that type tag as the same first element. The most common example of such functionality is struct sockaddr.

    Basically, your header file defines one or more structures with the same initial member, for example

    enum {
        MYOBJECT_TYPE_DOUBLE,
        MYOBJECT_TYPE_VOID_FUNCTION,
    };
    
    struct myobject_double {
        int     type;  /* MYOBJECT_TYPE_DOUBLE */
        double  value;
    };
    
    struct myobject_void_function {
        int     type;  /* MYOBJECT_TYPE_VOID_FUNCTION */
        void  (*value)();
    };
    

    and at the end, a union type, or a structure type with an anonymous union (as provided by C11 or GNU-C), of all the structure types,

    struct myobject {
        union {
            struct { int type; };          /* for direct 'type' member access */ 
            struct myobject_double         as_double;
            struct myobject_void_function  as_void_function;
        };
    };
    

    Note that technically, wherever that union is visible, it is valid to cast any pointer of any of those structure types to another of those structure types, and access the type member (see C11 6.5.2.3p6). It is not necessary to use the union at all, it suffices for the union to be defined and visible.

    Still, for ease of maintenance (and to avoid arguments with language lawyer wannabes who did not read that paragraph in the C standard), I do recommend using the structure containing the anonymous union as the "base" type in the library interface.

    For example, the library might provide a function to return the actual size of some object:

    size_t myobject_size(struct myobject *obj)
    {
        if (obj) 
            switch (obj->type) {
            case MYOBJECT_TYPE_DOUBLE:        return sizeof (struct myobject_double);
            case MYOBJECT_TYPE_VOID_FUNCTION: return sizeof (struct myobject_void_function);
            }
        errno = EINVAL;
        return 0;
    }
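
A self-contained sketch of how such a tagged structure could be consumed. The declarations repeat the ones above (with `void (*value)(void)` spelled out); the dispatcher `myobject_use()`, the callback `bump()` and the counter `bump_calls` are hypothetical additions for illustration.

```c
#include <stddef.h>

enum {
    MYOBJECT_TYPE_DOUBLE,
    MYOBJECT_TYPE_VOID_FUNCTION,
};

struct myobject_double {
    int     type;  /* MYOBJECT_TYPE_DOUBLE */
    double  value;
};

struct myobject_void_function {
    int     type;  /* MYOBJECT_TYPE_VOID_FUNCTION */
    void  (*value)(void);
};

struct myobject {
    union {
        struct { int type; };          /* direct 'type' member access */
        struct myobject_double         as_double;
        struct myobject_void_function  as_void_function;
    };
};

int bump_calls = 0;                    /* hypothetical: counts bump() calls */

void bump(void)
{
    bump_calls++;
}

/* Hypothetical dispatcher: returns the stored double, or calls the stored
   function and returns 0.0; returns -1.0 for an unknown type tag. */
double myobject_use(struct myobject *obj)
{
    switch (obj->type) {
    case MYOBJECT_TYPE_DOUBLE:
        return obj->as_double.value;
    case MYOBJECT_TYPE_VOID_FUNCTION:
        obj->as_void_function.value();
        return 0.0;
    default:
        return -1.0;
    }
}
```

The switch reads the tag through the anonymous struct, whichever variant was stored, which is the common-initial-sequence access described above.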
    

    It seems to me OP is trying to implement a factory pattern, where the library function provides the specification (class in OOP) for the object created, and a method to produce those objects later.

    The only way in C to implement dynamic typing is via the kind of polymorphism I show above. This means that the specification for the future objects (again, class in OOP) must be an ordinary object itself.
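
To illustrate what "the specification is an ordinary object" can look like, here is a rough sketch. The names `struct class_spec`, `spec_new()`, `struct counter` and `counter_class` are assumptions made up for this example, not part of any library.

```c
#include <stdlib.h>

/* Hypothetical "class" descriptor: plain data describing future objects. */
struct class_spec {
    const char *name;
    size_t      size;                 /* bytes to allocate per instance */
    void      (*init)(void *object);  /* optional initializer, may be NULL */
};

/* Create one zero-initialized instance of the described "class". */
void *spec_new(const struct class_spec *spec)
{
    void *object;

    if (!spec || spec->size == 0)
        return NULL;

    object = calloc(1, spec->size);
    if (object && spec->init)
        spec->init(object);
    return object;
}

/* An example "class", also hypothetical. */
struct counter {
    int value;
};

static void counter_init(void *object)
{
    ((struct counter *)object)->value = 10;
}

const struct class_spec counter_class = {
    "counter", sizeof (struct counter), counter_init
};
```

A caller would write `struct counter *c = spec_new(&counter_class);` and release the instance with `free(c)`. The descriptor can be stored, passed around and selected at run time like any other value.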

    The factory pattern itself is pretty easy to implement in standard C. The library header file contains for example

    #include <stdlib.h>
    #include <errno.h>
    
    /*
     * Generic, application-visible stuff
    */
    
    struct any_factory {
    
        /* Function to create an object */
        void *(*produce)(struct any_factory *);
    
        /* Function to discard this factory */
        void  (*retire)(struct any_factory *);
    
        /* Flexible array member; the actual
           size of this structure varies. */
        unsigned long  payload[];
    };
    
    static inline void *factory_produce(struct any_factory *factory)
    {
        if (factory && factory->produce)
            return factory->produce(factory);
    
        /* C has no exceptions, but does have thread-local 'errno'.
           The error codes do vary from system to system. */
        errno = EINVAL;
        return NULL;
    }
    
    static inline void factory_retire(struct any_factory *factory)
    {
        if (factory) {
            if (factory->retire) {
                factory->retire(factory);
            } else {
                /* Optional: Poison function pointers, to easily
                             detect use-after-free bugs. */
                factory->produce = NULL;
                factory->retire = NULL; /* Already NULL, too. */
                /* Free the factory object. */
                free(factory);
            }
        }
    }
    
    /*
     * Library function.
     *
     * This one takes a pointer and size in chars, and returns
     * a factory object that produces dynamically allocated
     * copies of the data.
    */
    
    struct any_factory *mem_factory(const void *, const size_t);
    

    where factory_produce() is a helper function which invokes the factory to produce one object, and factory_retire() retires (discards/frees) the factory itself. Aside from the extra error checking, factory_produce(factory) is equivalent to (factory)->produce(factory), and factory_retire(factory) to (factory)->retire(factory).

    The mem_factory(ptr, len) function is an example of a factory function provided by a library. It creates a factory, that produces dynamically allocated copies of the data seen at the time of the mem_factory() call.

    The library implementation itself would be something along the lines of

    #include <stdlib.h>
    #include <string.h>
    #include <errno.h>
    /* Also include the library header shown above; it declares struct any_factory. */
    
    struct mem_factory {
        void *(*produce)(struct any_factory *);
        void  (*retire)(struct any_factory *);
        size_t         size;
        unsigned char  data[];
    };
    
    /* The visibility of this union ensures the initial sequences
       in the structures are compatible; see C11 6.5.2.3p6.
       Essentially, this causes the casts between these structure
       types, for accessing their initial common members, valid. */
    union factory_union {
        struct any_factory  any;
        struct mem_factory  mem;
    };
    
    static void *mem_producer(struct any_factory *any)
    {
        if (any) {
            struct mem_factory *mem = (struct mem_factory *)any;
    
            /* We return a dynamically allocated copy of the data,
               padded with 9 to 16 zeros.. for no reason. */
            const size_t  size = (mem->size | 7) + 9;
            char         *result;
    
            result = malloc(size);
            if (!result) {
                errno = ENOMEM;
                return NULL;
            }
    
            /* Clear the padding. */
            memset(result + size - 16, 0, 16);
    
            /* Copy the data, if any. */
            if (mem->size)
                memcpy(result, mem->data, mem->size);
    
            /* Done. */
            return result;
        }
    
        errno = EINVAL;
        return NULL;
    }
    
    static void mem_retirer(struct any_factory *any)
    {
        if (any) {
            struct mem_factory *mem = (struct mem_factory *)any;
    
            mem->produce = NULL;
            mem->retire  = NULL;
            mem->size    = 0;
            free(mem);
        }
    }
    
    /* The only exported function:
    */
    struct any_factory *mem_factory(const void *src, const size_t len)
    {
        struct mem_factory *mem;
    
        if (len && !src) {
            errno = EINVAL;
            return NULL;
        }
    
        mem = malloc(len + sizeof (struct mem_factory));
        if (!mem) {
            errno = ENOMEM;
            return NULL;
        }
    
        mem->produce = mem_producer;
        mem->retire  = mem_retirer;
        mem->size    = len;
    
        if (len > 0)
            memcpy(mem->data, src, len);
    
        return (struct any_factory *)mem;
    }
    

    Essentially, the struct any_factory type is actually polymorphic (not in the application, but within the library only). All its variants (struct mem_factory here) have the two initial function pointers in common.

    Now, if we examine the code above, and consider the factory pattern, you should realize that the function pointers provide very little of value: you could just use the polymorphic type I showed earlier in this answer, and have the inline producer and consumer functions call subtype-specific internal functions based on the type of the factory. factory.h:

    #ifndef   FACTORY_H
    #define   FACTORY_H
    #include <stdlib.h>
    
    struct factory {
        /* Common member across all factory types */
        const int  type;
    
        /* Flexible array member to stop applications
           from declaring static factories. */
        const unsigned long  data[];
    };
    
    /* Generic producer function */
    void *produce(const struct factory *);
    
    /* Generic factory discard function */
    void retire(struct factory *);
    
    /*
     * Library functions that return factories.
    */
    
    struct factory  *mem_factory(const void *, const size_t);
    
    #endif /* FACTORY_H */
    

    and factory.c:

    #include <stdlib.h>
    #include <string.h>
    #include <errno.h>
    #include "factory.h"
    
    enum {
        INVALID_FACTORY = 0,
    
        /* List of known factory types */
        MEM_FACTORY,
    
        /* 1+(the highest known factory type) */
        NUM_FACTORY_TYPES
    };
    
    struct mem_factory {
        int     type;
        size_t  size;
        char    data[];
    };
    
    /* The visibility of this union ensures the initial sequences
       in the structures are compatible; see C11 6.5.2.3p6.
       Essentially, this causes the casts between these structure
       types, for accessing their initial common members, valid. */
    union all_factories {
        struct factory      factory;
        struct mem_factory  mem_factory;
    };
    
    /* All factories thus far implemented
       are a single structure dynamically
       allocated, which makes retiring simple.
    */
    void retire(struct factory *factory)
    {
        if (factory &&
            factory->type > INVALID_FACTORY &&
            factory->type < NUM_FACTORY_TYPES) {
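            /* Poison factory type, to make it easier
               to detect use-after-free bugs. The member is
               const in the public header, so write through a
               pointer to the first member instead. */
            *(int *)factory = INVALID_FACTORY;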
            free(factory);
        }
    }
    
    static char *mem_producer(const struct mem_factory *mem)
    {
        /* As a courtesy for users, return the memory
           padded with 9 to 16 zeroes, to a length that is
           a multiple of 8 chars. No real reason to do this. */
        const size_t  size = (mem->size | 7) + 9;
        char         *result;   
    
        result = malloc(size);
        if (!result) {
            errno = ENOMEM;
            return NULL;
        }
    
        /* Clear padding. */
        memset(result + size - 16, 0, 16);
    
        /* Copy data, if any. */
        if (mem->size)
            memcpy(result, mem->data, mem->size);
    
        return result;
    }
    
    /* Generic producer function.
       Calls the proper individual producers.
    */
    void *produce(const struct factory *factory)
    {
        if (!factory) {
            errno = EINVAL;
            return NULL;
        }
    
        switch (factory->type) {
    
        case MEM_FACTORY:
            return mem_producer((struct mem_factory *)factory);
    
        default:
            errno = EINVAL;
            return NULL;
        }
    }
    
    /* Library functions that return factories.
    */
    struct factory *mem_factory(const void *ptr, const size_t len)
    {
        struct mem_factory *mem;
    
        if (!ptr && len > 0) {
            errno = EINVAL;
            return NULL;
        }
    
        mem = malloc(len + sizeof (struct mem_factory));
        if (!mem) {
            errno = ENOMEM;
            return NULL;
        }
    
        mem->type = MEM_FACTORY;
        mem->size = len;
        if (len > 0)
            memcpy(mem->data, ptr, len);
    
        return (struct factory *)mem;
    }
    

    If we look at standard C and POSIX C library implementations, we'll see that both of these approaches are used.

    The standard I/O FILE structure often contains function pointers, and the fopen(), fread(), fwrite(), etc. functions are just wrappers around these. This is especially the case if the C library supports an interface similar to GNU fopencookie().

    POSIX.1 sockets, especially the struct sockaddr type, are the original prototype for the polymorphic structure shown first in this answer. Because the socket interface does not support anything similar to fopencookie() (that is, overriding the implementation of e.g. send(), recv(), read(), write(), close()), there is no need for function pointers.
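
As a small illustration of the struct sockaddr pattern, the sketch below dispatches on the shared `sa_family` tag. The helper name `sockaddr_port()` is hypothetical; the structure types and `ntohs()` are the standard POSIX ones.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: return the port stored in an IPv4 or IPv6 address,
   in host byte order, or -1 for any other address family. */
int sockaddr_port(const struct sockaddr *sa)
{
    switch (sa->sa_family) {           /* the shared "type tag" */
    case AF_INET:
        return ntohs(((const struct sockaddr_in *)sa)->sin_port);
    case AF_INET6:
        return ntohs(((const struct sockaddr_in6 *)sa)->sin6_port);
    default:
        return -1;
    }
}
```

Callers pass any concrete address cast to `struct sockaddr *`, the same way `bind()` and `connect()` expect it.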

    So, please do not ask which one is more suitable, as both are very commonly used, and it very much depends on minute details. In general, I prefer the one that yields a simpler implementation providing all the necessary functionality.

    I have personally found that it is not that useful to worry about future use cases without practical experience and feedback first. Rather than trying to create the end-all, best-ever framework that solves all future problems, the KISS principle and the Unix philosophy seem to yield much better results.
