Question
How do I parse tokens from an input string? For example:
char *aString = "Hello world";
I want the output to be:
"Hello" "world"
Answer 1:
You are going to want to use strtok; here is a good example.
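The example linked in this answer is not reproduced here. As a stand-in, the following is a minimal sketch of how strtok is typically used for the question's input. The helper name split_on_spaces and its parameters are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `text` in place on spaces and stores up to
 * `max_tokens` token pointers in `tokens`. Returns the number stored.
 * `text` must be writable, because strtok overwrites each delimiter it
 * finds with '\0'. */
size_t split_on_spaces(char *text, char *tokens[], size_t max_tokens)
{
    assert(text != NULL && tokens != NULL);

    size_t count = 0;
    /* The first call passes the string; later calls pass NULL so that
     * strtok continues from where it stopped. */
    for (char *tok = strtok(text, " ");
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, " ")) {
        tokens[count++] = tok;
    }
    return count;
}
```

Note that the question's aString points at a string literal, which must not be modified. Copy the text into an array first, for example char buf[] = "Hello world";, and pass buf to the helper.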
Answer 2:
Take a look at strtok, part of the standard library.
Answer 3:
strtok is the easy answer, but what you really need is a lexer that does it properly. Consider the following:
- are there one or two spaces between "hello" and "world"?
- could that in fact be any amount of whitespace?
- could that include vertical whitespace (\n, \f, \v) or just horizontal (space, \t, \r)?
- could that include any Unicode whitespace characters?
- if there were punctuation between the words, ("hello, world"), would the punctuation be a separate token, part of "hello,", or ignored?
As you can see, writing a proper lexer is not straightforward, and strtok is not a proper lexer.
Other solutions could be a single-character state machine that does precisely what you need, or a regex-based solution that makes locating words versus gaps more generalized. There are many ways.
And of course, all of this depends on what your actual requirements are, and I don't know them, so start with strtok. But it's good to be aware of the various limitations.
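To make the state-machine idea concrete, here is a rough sketch under one arbitrary set of answers to the questions above: any run of ASCII whitespace separates tokens, and each punctuation character is its own token. The names struct token, lex_words, IN_GAP and IN_WORD are invented for this illustration, and Unicode whitespace is not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* A token is a slice of the input; the input itself is never modified. */
struct token {
    const char *start;
    size_t      length;
};

enum lex_state { IN_GAP, IN_WORD };

/* Scans `text` one character at a time and writes at most `max_tokens`
 * tokens to `out`. Returns the number of tokens written. */
size_t lex_words(const char *text, struct token out[], size_t max_tokens)
{
    assert(text != NULL && out != NULL);

    enum lex_state state = IN_GAP;
    size_t count = 0;

    for (const char *p = text; *p != '\0'; ++p) {
        /* The <ctype.h> functions require an unsigned char value. */
        unsigned char c = (unsigned char)*p;

        if (isspace(c)) {               /* whitespace always ends a word */
            state = IN_GAP;
            continue;
        }
        if (state == IN_WORD && !ispunct(c)) {
            ++out[count - 1].length;    /* extend the current word */
            continue;
        }
        if (count == max_tokens) {      /* no room for another token */
            break;
        }
        /* Start a new token: a word, or a single punctuation character. */
        out[count].start = p;
        out[count].length = 1;
        ++count;
        state = ispunct(c) ? IN_GAP : IN_WORD;
    }
    return count;
}
```

With these rules, "hello, world" yields three tokens: hello, the comma, and world. Changing the policy (for example, ignoring punctuation) means changing only the branch that starts a new token.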
Answer 4:
For re-entrant versions, you can use either strtok_s with Visual Studio or strtok_r on Unix.
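As an illustration of why the re-entrant form matters, the sketch below runs two tokenizations at once: an outer one over lines and an inner one over words. It assumes a POSIX system that provides strtok_r; the function name count_words_by_line is hypothetical. Microsoft's strtok_s takes the same three arguments, while the optional C11 Annex K strtok_s has a different, four-argument signature.

```c
#define _POSIX_C_SOURCE 200809L  /* asks POSIX headers to declare strtok_r */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: treats '\n' as the line separator and ' ' as the
 * word separator, and stores the word count of each line in `per_line`.
 * Returns the number of lines processed. `text` is modified in place. */
size_t count_words_by_line(char *text, size_t per_line[], size_t max_lines)
{
    assert(text != NULL && per_line != NULL);

    char *line_ctx = NULL;   /* saved position for the outer tokenization */
    size_t lines = 0;

    for (char *line = strtok_r(text, "\n", &line_ctx);
         line != NULL && lines < max_lines;
         line = strtok_r(NULL, "\n", &line_ctx)) {
        char *word_ctx = NULL;   /* separate position for the inner one */
        size_t words = 0;

        for (char *word = strtok_r(line, " ", &word_ctx);
             word != NULL;
             word = strtok_r(NULL, " ", &word_ctx)) {
            ++words;
        }
        per_line[lines++] = words;
    }
    return lines;
}
```

Plain strtok keeps a single hidden position, so the inner loop would overwrite the outer loop's state; with strtok_r each loop carries its own context pointer.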
Answer 5:
Keep in mind that strtok is very hard to get right, because:
- It modifies the input.
- It replaces each delimiter with a null terminator.
- It merges adjacent delimiters.
- And of course, it is not thread safe.
You can read about this alternative.
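The alternative linked in this answer is not reproduced here. One possible approach that avoids all four problems, offered only as a sketch, is to walk a read-only string with strcspn and return each field as a pointer plus a length. The name next_field and its cursor parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the start of the next field and stores its
 * length in *length, or returns NULL once the input is exhausted.
 * The input is never written to, empty fields between adjacent delimiters
 * are reported rather than merged, and all state lives in the caller's
 * `cursor`, so there is no hidden global position. */
const char *next_field(const char **cursor, const char *delims, size_t *length)
{
    assert(cursor != NULL && delims != NULL && length != NULL);

    if (*cursor == NULL) {
        return NULL;                     /* previous call saw the last field */
    }

    const char *start = *cursor;
    *length = strcspn(start, delims);    /* characters before next delimiter */

    const char *end = start + *length;
    /* Step over one delimiter, or mark the input as finished. */
    *cursor = (*end != '\0') ? end + 1 : NULL;
    return start;
}
```

Starting with a cursor that points at "a,,b" and the delimiter set ",", successive calls report a, an empty field, and b, then NULL. Fields are not null-terminated, so print them with a precision, as in printf("%.*s", (int)len, field).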
Source: https://stackoverflow.com/questions/558368/how-do-i-parse-a-token-from-a-string-in-c