Awk doesn't match all my entries

爱一瞬间的悲伤 2021-01-22 19:40

I'm trying to make "a script" (essentially an awk command) that extracts the function prototypes from the C code in a .c file, in order to automatically generate a .h header. I'm new wit

2 answers
  •  慢半拍i (original poster)
    2021-01-22 20:06

    Note: the question has changed substantially since I wrote this answer.

    Replace [:space:] with [[:space:]]:

    $ awk '/^[a-zA-Z*_]+[[:space:]]+[a-zA-Z*_]+[[:space:]]*[(].*?[)]/{ print $0 }' dict3.c
    dictent_t* dictentcreate(const char * key, const char * val)  
    dict_t* dictcreate() 
    void dictdestroy(*dict_t d) 
    void dictdump(dict_t *d) 
    int dictlook(dict_t *d, const char * key) 
    int dictget(char* s, dict_t *d, const char *key)
    dict_t* dictadd(dict_t* d, const char * key, const char * val)
    dict_t dictup(dict_t d, const char * key, const char *newval) 
    dict_t* dictrm(dict_t* d, const char * key)
    

    The reason is that a bare [:space:] is an ordinary bracket expression: it matches any one of the characters :, s, p, a, c, or e. This is not what you want.

    You want [[:space:]], which matches any single whitespace character.
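The difference is easy to see on a few made-up lines. This is only a sketch: the sample input and the variable names `sample`, `wrong`, and `right` are invented for illustration, and it assumes an awk that supports POSIX character classes.

```shell
# Three sample lines: one containing a space, one made only of the
# letters a/c/e, and one with neither.
sample=$(printf 'a b\nace\nxyz\n')

# Bare [:space:] is a bracket expression over the characters : s p a c e,
# so the only line made up entirely of those characters is "ace".
# (gawk may print a warning about this pattern on stderr; it is discarded here.)
wrong=$(printf '%s\n' "$sample" | awk '/^[:space:]+$/' 2>/dev/null)

# [[:space:]] is the POSIX whitespace class, so it matches the line "a b".
right=$(printf '%s\n' "$sample" | awk '/[[:space:]]/')

printf 'bare [:space:] matched:  %s\n' "$wrong"
printf '[[:space:]] matched:     %s\n' "$right"
```

The first pattern never looks at whitespace at all, which is why a regex built on it can match some prototypes and silently skip others.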

    Sun/Solaris

    The native Sun/Solaris awk is notoriously buggy. If you are on that platform, try nawk, /usr/xpg4/bin/awk, or /usr/xpg6/bin/awk instead.
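If the same script has to run on several platforms, one possible approach is to probe for the alternatives named above and fall back to plain awk. This is a rough sketch: the function name `pick_awk` and the variable `AWK` are hypothetical, and the probing order is an assumption, not something the platform mandates.

```shell
# Print the first awk implementation found, preferring the
# standards-conforming Solaris ones over the native awk.
pick_awk() {
    for candidate in /usr/xpg6/bin/awk /usr/xpg4/bin/awk nawk awk; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

AWK=$(pick_awk)

# Use the chosen implementation exactly like awk.
printf 'void f(void)\n' | "$AWK" '/[[:space:]]/ { print $0 }'
```

On a system without the /usr/xpg* directories the loop simply falls through to whatever nawk or awk is on the PATH.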

    Using sed

    A very similar approach can be used with sed. This uses a regex based on yours:

    $ sed -n '/^[a-zA-Z_*]\+[ \t]\+[a-zA-Z*]\+ *[(]/p' dict3.c
    dictent_t* dictentcreate(const char * key, const char * val)  
    dict_t* dictcreate() 
    void dictdestroy(*dict_t d) 
    void dictdump(dict_t *d) 
    int dictlook(dict_t *d, const char * key) 
    int dictget(char* s, dict_t *d, const char *key)
    dict_t* dictadd(dict_t* d, const char * key, const char * val)
    dict_t dictup(dict_t d, const char * key, const char *newval) 
    dict_t* dictrm(dict_t* d, const char * key)
    

    The -n option tells sed not to print unless we explicitly ask it to. The construct /.../p tells sed to print the line if the regex inside the slashes is matched.
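The effect of -n is easiest to see by running the same /.../p command with and without it. A small sketch on invented input follows; the variable names and the simplified regex are illustrative only.

```shell
# A tiny stand-in for a C file.
src=$(printf 'int main(void)\n{\n    return 0;\n}\n')

# With -n, only the line selected by /.../p is printed.
with_n=$(printf '%s\n' "$src" | sed -n '/^[a-zA-Z_]/p')

# Without -n, sed prints every line once by default, and p prints
# the matching line a second time.
without_n=$(printf '%s\n' "$src" | sed '/^[a-zA-Z_]/p')

printf 'with -n:\n%s\n\n' "$with_n"
printf 'without -n:\n%s\n' "$without_n"
```

Without -n the output has five lines instead of one, with the matching line duplicated.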

    All the improvements to the regex suggested by Ed Morton apply here also.

    Using perl

    The above can also be adapted to perl:

    perl -ne 'print if /^[a-zA-Z_*]+[ \t]+[a-zA-Z*]+ *[(]/' dict3.c
    
