Awk doesn't match all my entries

爱一瞬间的悲伤 2021-01-22 19:40

I'm trying to make "a script" - essentially an awk command - to extract the function prototypes from the C code in a .c file and automatically generate a .h header. I'm new wit

2 answers

慢半拍i (original poster)

2021-01-22 20:06

Note: the question has changed substantially since I wrote this answer.

Replace [:space:] with [[:space:]]:

$ awk '/^[a-zA-Z*_]+[[:space:]]+[a-zA-Z*_]+[[:space:]]*[(].*?[)]/{ print $0 }' dict3.c
dictent_t* dictentcreate(const char * key, const char * val)  
dict_t* dictcreate() 
void dictdestroy(*dict_t d) 
void dictdump(dict_t *d) 
int dictlook(dict_t *d, const char * key) 
int dictget(char* s, dict_t *d, const char *key)
dict_t* dictadd(dict_t* d, const char * key, const char * val)
dict_t dictup(dict_t d, const char * key, const char *newval) 
dict_t* dictrm(dict_t* d, const char * key)

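The reason is that [:space:] on its own is an ordinary bracket expression: it matches any one of the characters :, s, p, a, c, or e. This is not what you want.

You want [[:space:]], where the character class name sits inside a bracket expression and matches any whitespace character.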
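
A quick way to see the difference is to run both forms against two made-up lines, one separated by a space and one by a colon. This is only a sketch: the sample lines and the `sample` helper are hypothetical and not taken from dict3.c, and some awk implementations may print a warning for the first pattern, which is discarded here.

```shell
# Hypothetical sample input: the same prototype, separated once by a space
# and once by a colon.
sample() {
    printf '%s\n' 'int main(void)' 'int:main(void)'
}

# [:space:] alone is a bracket expression over the characters : s p a c e,
# so only the colon-separated line matches.
wrong=$(sample | awk '/^int[:space:]+main/' 2>/dev/null)

# [[:space:]] is the POSIX whitespace class inside a bracket expression,
# so only the space-separated line matches.
right=$(sample | awk '/^int[[:space:]]+main/')

printf 'with [:space:]   -> %s\n' "$wrong"
printf 'with [[:space:]] -> %s\n' "$right"
```

The first pattern selects the colon-separated line and the second selects the space-separated one, which is why the original regex skipped ordinary prototypes.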

Sun/Solaris

The native awk on Sun/Solaris is notoriously buggy. If you are on that platform, try nawk, /usr/xpg4/bin/awk, or /usr/xpg6/bin/awk instead.
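
If the same command has to run on several systems, one option is to pick the first of these alternatives that exists and fall back to plain awk. A minimal sketch, assuming a POSIX shell; the variable name `AWK` and the search order are choices made for this illustration only.

```shell
# Try the XPG awks found on Solaris first, then nawk, then whatever
# "awk" resolves to on PATH. AWK is a variable name chosen for this sketch.
AWK=
for candidate in /usr/xpg6/bin/awk /usr/xpg4/bin/awk nawk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
        AWK=$candidate
        break
    fi
done

if [ -n "$AWK" ]; then
    echo "using: $AWK"
else
    echo "no awk found" >&2
fi
```

The awk command can then be invoked as "$AWK" in place of awk.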

Using sed

A very similar approach can be used with sed. This uses a regex based on yours (the \+ and \t used here are GNU sed extensions):

$ sed -n '/^[a-zA-Z_*]\+[ \t]\+[a-zA-Z*]\+ *[(]/p' dict3.c
dictent_t* dictentcreate(const char * key, const char * val)  
dict_t* dictcreate() 
void dictdestroy(*dict_t d) 
void dictdump(dict_t *d) 
int dictlook(dict_t *d, const char * key) 
int dictget(char* s, dict_t *d, const char *key)
dict_t* dictadd(dict_t* d, const char * key, const char * val)
dict_t dictup(dict_t d, const char * key, const char *newval) 
dict_t* dictrm(dict_t* d, const char * key)

The -n option tells sed not to print unless we explicitly ask it to. The construct /.../p tells sed to print the line if the regex inside the slashes is matched.
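
The effect of -n is easiest to see by running the same /.../p command with and without it. A small sketch on a hypothetical three-line input (the `input` helper and its lines are made up for illustration, not taken from dict3.c):

```shell
# Hypothetical three-line input; only the middle line contains "(".
input() {
    printf '%s\n' '#include <stdio.h>' 'int f(void)' '}'
}

# Without -n, sed prints every line once automatically, and /[(]/p prints
# the matching line a second time.
without_n=$(input | sed '/[(]/p')

# With -n, automatic printing is off, so only the explicit p produces output.
with_n=$(input | sed -n '/[(]/p')

printf 'without -n:\n%s\n' "$without_n"
printf 'with -n:\n%s\n' "$with_n"
```

Without -n the matching line appears twice among all the input lines; with -n it is the only line printed.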

All the improvements to the regex suggested by Ed Morton apply here also.

Using perl

The above can also be adapted to perl:

$ perl -ne 'print if /^[a-zA-Z_*]+[ \t]+[a-zA-Z*]+ *[(]/' dict3.c
