How do I parse a token from a string in C?

Submitted by 坚强是说给别人听的谎言 on 2019-12-10 15:46:53

Question


How do I parse tokens from an input string? For example:

char *aString = "Hello world";

I want the output to be:

"Hello" "world"


Answer 1:


You are going to want to use strtok - here is a good example.




Answer 2:


Take a look at strtok, part of the standard library.
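As a rough illustration of the strtok approach for the question's "Hello world" input, here is a minimal sketch. The helper name split_on_spaces and the fixed-size output array are assumptions made for this example, not part of the standard library. The buffer must be writable, because strtok writes null terminators into it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Splits buf in place on spaces and stores up to max token pointers in out.
   Returns the number of tokens stored. split_on_spaces is a hypothetical
   helper name used only for this sketch. */
size_t split_on_spaces(char *buf, char *out[], size_t max)
{
    assert(buf != NULL && out != NULL);

    size_t count = 0;
    /* The first call passes the buffer; later calls pass NULL to continue
       scanning the same buffer. */
    for (char *tok = strtok(buf, " "); tok != NULL && count < max;
         tok = strtok(NULL, " ")) {
        out[count++] = tok;
    }
    return count;
}
```

To use it, copy the text into an array first (for example char buf[] = "Hello world";) rather than passing a string literal, since modifying a literal is undefined behavior.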




Answer 3:


strtok is the easy answer, but what you really need is a lexer that does it properly. Consider the following:

  • are there one or two spaces between "hello" and "world"?
  • could that in fact be any amount of whitespace?
  • could that include vertical whitespace (\n, \f, \v) or just horizontal (space, \t, \r)?
  • could that include any Unicode whitespace characters?
  • if there were punctuation between the words ("hello, world"), would the punctuation be a separate token, part of "hello,", or ignored?

As you can see, writing a proper lexer is not straightforward, and strtok is not a proper lexer.

Other solutions could be a character-by-character state machine that does precisely what you need, or a regex-based solution that makes locating words versus gaps more general. There are many ways.
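The state machine idea can be sketched roughly as follows, assuming that "whitespace" means whatever isspace accepts in the current locale (so no Unicode handling) and that punctuation stays attached to the word. The names struct token and scan_words are hypothetical. Tokens are reported as pointer-plus-length pairs, so the input is never modified.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* A token is a pointer into the input plus a length (not null-terminated). */
struct token {
    const char *start;
    size_t len;
};

enum state { IN_GAP, IN_WORD };

/* Scans input one character at a time and records up to max words in out.
   Returns the number of complete words recorded. */
size_t scan_words(const char *input, struct token out[], size_t max)
{
    assert(input != NULL && out != NULL);

    enum state st = IN_GAP;
    size_t count = 0;

    for (const char *p = input; ; p++) {
        /* The terminating '\0' counts as a gap so the last word gets closed. */
        int is_gap = (*p == '\0') || isspace((unsigned char)*p);

        if (st == IN_GAP && !is_gap) {
            if (count == max)
                break;                      /* no room for another word */
            out[count].start = p;
            st = IN_WORD;
        } else if (st == IN_WORD && is_gap) {
            out[count].len = (size_t)(p - out[count].start);
            count++;
            st = IN_GAP;
        }
        if (*p == '\0')
            break;
    }
    return count;
}
```

Changing the is_gap test is where decisions such as "is punctuation a separator?" would go in this sketch.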

And of course, all of this depends on what your actual requirements are, and I don't know them, so start with strtok. But it's good to be aware of the various limitations.




Answer 4:


For re-entrant versions, you can use either strtok_s with Visual Studio or strtok_r on Unix.
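A case where re-entrancy matters is nested tokenizing, which plain strtok cannot do because it keeps a single hidden state. The sketch below uses POSIX strtok_r; the helper name count_words_in_records and the ';' record separator are assumptions chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L /* exposes strtok_r on POSIX systems */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts space-separated words across ';'-separated records, in place.
   count_words_in_records is a hypothetical helper for this sketch. */
size_t count_words_in_records(char *buf)
{
    assert(buf != NULL);

    size_t words = 0;
    char *outer_state = NULL;
    char *inner_state = NULL;

    for (char *rec = strtok_r(buf, ";", &outer_state); rec != NULL;
         rec = strtok_r(NULL, ";", &outer_state)) {
        /* The inner loop keeps its own state, so it cannot disturb the
           outer loop's position. */
        for (char *word = strtok_r(rec, " ", &inner_state); word != NULL;
             word = strtok_r(NULL, " ", &inner_state)) {
            words++;
        }
    }
    return words;
}
```

Microsoft's strtok_s takes the same three arguments (string, delimiters, context pointer). The optional C11 Annex K strtok_s has a different, four-argument signature, so check which one your toolchain provides.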




Answer 5:


Keep in mind that strtok is very hard to use correctly, because:

  • It modifies the input.
  • It replaces each delimiter with a null terminator.
  • It merges adjacent delimiters, so empty fields are silently skipped.
  • And, of course, it is not thread safe.

You can read about this alternative.
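One possible way to avoid the issues listed above (not necessarily the alternative linked there) is to scan with strcspn and keep the position in a caller-owned cursor. This is only a sketch: next_field is a hypothetical helper, and it assumes you want empty fields between adjacent delimiters to be reported rather than merged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Finds the next field starting at *cursor without writing to the string.
   Returns a pointer to the field and stores its length in *len, or returns
   NULL once the input is exhausted. The field is not null-terminated.
   next_field is a hypothetical helper for this sketch. */
const char *next_field(const char **cursor, const char *delims, size_t *len)
{
    assert(cursor != NULL && delims != NULL && len != NULL);

    const char *start = *cursor;
    if (start == NULL)
        return NULL;

    /* Length of the run up to the next delimiter or the end of the string. */
    *len = strcspn(start, delims);

    const char *end = start + *len;
    /* Skip exactly one delimiter, so "a,,b" yields an empty middle field. */
    *cursor = (*end == '\0') ? NULL : end + 1;
    return start;
}
```

Because all state lives in the caller's cursor and the input is only read, separate threads can tokenize separate strings (or even the same constant string) without interfering with each other.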



Source: https://stackoverflow.com/questions/558368/how-do-i-parse-a-token-from-a-string-in-c
