Question
I'm in the process of writing a string tokenizer without using strtok(). This is mainly for my own betterment and for a greater understanding of pointers. I think I almost have it, but I've been receiving the following errors:
myToc.c:25 warning: assignment makes integer from pointer without a cast
myToc.c:35 (same as above)
myToc.c:44 error: invalid type argument of 'unary *' (have 'int')
What I'm doing is looping through the string sent to the method, finding each delimiter, and replacing it with '\0'. The "ptr" array is supposed to have pointers to the separated substrings. This is what I have so far.
#include <string.h>
void myToc(char * str){
int spcCount = 0;
int ptrIndex = 0;
int n = strlen(str);
for(int i = 0; i < n; i++){
if(i != 0 && str[i] == ' ' && str[i-1] != ' '){
spcCount++;
}
}
//Pointer array; +1 for \0 character, +1 for one word more than number of spaces
int *ptr = (int *) calloc(spcCount+2, sizeof(char));
ptr[spcCount+1] = '\0';
//Used to differentiate separating spaces from unnecessary ones
char temp;
for(int j = 0; j < n; j++){
if(j == 0){
/*Line 25*/ ptr[ptrIndex] = &str[j];
temp = str[j];
ptrIndex++;
}
else{
if(str[j] == ' '){
temp = str[j];
str[j] = '\0';
}
else if(str[j] != ' ' && str[j] != '\0' && temp == ' '){
/*Line 35*/ ptr[ptrIndex] = &str[j];
temp = str[j];
ptrIndex++;
}
}
}
int k = 0;
while(ptr[k] != '\0'){
/*Line 44*/ printf("%s \n", *ptr[k]);
k++;
}
}
I can see where the errors are occurring but I'm not sure how to correct them. What should I do? Am I allocating memory correctly or is it just an issue with how I'm specifying the addresses?
Answer 1:
Your pointer array is wrong. It looks like you want:
char **ptr = calloc(spcCount+2, sizeof(char*));
Also, if I am reading your code correctly, there is no need for the null byte as this array is not a string.
In addition, you'll need to fix:
while(ptr[k] != '\0'){
/*Line 44*/ printf("%s \n", *ptr[k]);
k++;
}
The dereference is not required and if you remove the null ptr, this should work:
for ( k = 0; k < ptrIndex; k++ ){
/*Line 44*/ printf("%s \n", ptr[k]);
}
Answer 2:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void myToc(char * str){
int spcCount = 0;
int ptrIndex = 0;
int n = strlen(str);
for(int i = 0; i < n; i++){
if(i != 0 && str[i] == ' ' && str[i-1] != ' '){
spcCount++;
}
}
char **ptr = calloc(spcCount+2, sizeof(char*));
//ptr[spcCount+1] = '\0';//0 initialized by calloc
char temp = ' ';//can simplify the code
for(int j = 0; j < n; j++){
if(str[j] == ' '){
temp = str[j];
str[j] = '\0';
} else if(str[j] != '\0' && temp == ' '){//can omit `str[j] != ' ' &&`
ptr[ptrIndex++] = &str[j];
temp = str[j];
}
}
int k = 0;
while(ptr[k] != NULL){//better use NULL
printf("%s \n", ptr[k++]);
}
free(ptr);
}
int main(){
char test1[] = "a b c";
myToc(test1);
char test2[] = "hello world";
myToc(test2);
return 0;
}
Answer 3:
Update: I tried this at http://www.compileonline.com/compile_c99_online.php with the fixes for lines 25, 35, and 44, and with a main function that called myToc() twice. I initially encountered segfaults when trying to write null characters to str[], but that was only because the strings I was passing were (apparently non-modifiable) literals. The code below worked as desired when I allocated a text buffer and wrote the strings there before passing them in. This version could also be modified to return the array of pointers, which would then point to the tokens.
(The code below also works even when the string parameter is non-modifiable, as long as myToc() makes a local copy of the string; but that would not have the desired effect if the purpose of the function is to return the list of tokens rather than just print them.)
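The point about string literals can be sketched as follows. Writing through a pointer to a string literal is undefined behavior in C, so the text has to be copied into writable storage before any '\0' is written into it. This is only a minimal sketch: dup_string and blank_out_spaces are hypothetical helper names that do not appear in the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated, modifiable copy of src.
 * The caller must free() the result. Returns NULL if allocation fails. */
char *dup_string(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, src, len);
    }
    return copy;
}

/* Hypothetical stand-in for the in-place tokenizing step: overwrites every
 * space in a writable buffer with '\0' and returns how many were replaced. */
size_t blank_out_spaces(char *buf)
{
    assert(buf != NULL);
    size_t replaced = 0;
    size_t n = strlen(buf);         /* measured once, before any overwrite */
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == ' ') {
            buf[i] = '\0';
            replaced++;
        }
    }
    return replaced;
}
```

A caller would write something like char *copy = dup_string("a b c"); pass copy to the tokenizer, and free(copy) once the tokens are no longer needed. Declaring char text[] = "a b c"; has the same effect without heap allocation, because the array is a writable copy initialized from the literal.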
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void myToc(char * str){
int spcCount = 0;
int ptrIndex = 0;
int n = strlen(str);
for(int i = 0; i < n; i++){
if(i != 0 && str[i] == ' ' && str[i-1] != ' '){
spcCount++;
}
}
//Pointer array; +1 for one word more than number of spaces
char** ptr = (char**) calloc(spcCount+2, sizeof(char*));
//Used to differentiate separating spaces from unnecessary ones
char temp;
for(int j = 0; j < n; j++){
if(j == 0){
ptr[ptrIndex] = &str[j];
temp = str[j];
ptrIndex++;
}
else{
if(str[j] == ' '){
temp = str[j];
str[j] = '\0';
}
else if(str[j] != ' ' && str[j] != '\0' && temp == ' '){
ptr[ptrIndex] = &str[j];
temp = str[j];
ptrIndex++;
}
}
}
for (int k = 0; k < ptrIndex; ++k){
printf("%s \n", ptr[k]);
}
free(ptr); //release the pointer array; the tokens themselves live in str
}
int main (int n, char** v)
{
char text[256];
strcpy(text, "a b c");
myToc(text);
printf("-----\n");
strcpy(text, "hello world");
myToc(text);
}
I would prefer simpler code, however. Basically you want a pointer to the first non-blank character in str[], then a pointer to each non-blank (other than the first) that is preceded by a blank. Your first loop almost gets this idea, except that it looks for blanks preceded by non-blanks. (You could also start that loop at i = 1 and avoid having to test i != 0 on each iteration.)
I might just allocate an array of char* of size sizeof(char*) * (n + 1)/2 to hold the pointers rather than looping over the string twice (that is, I'd omit the first loop, which only figures out the size of the array). In any case, if str[0] is non-blank I would write its address to the array; then, looping for (int j = 1; j < n; ++j), write the address of str[j] to the array if str[j] is non-blank and str[j - 1] is blank. That is basically what you are doing, but with fewer ifs and fewer auxiliary variables.
Less code means less opportunity to introduce a bug, as long as the code is clean and makes sense.
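The single-loop idea described above might look roughly like this. It is a sketch under a few assumptions rather than a definitive version: the function name split_spaces and the count out-parameter are hypothetical, the function returns the pointer array instead of printing, and one extra slot is allocated so the request is never zero-sized.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: splits str in place on spaces in a single pass.
 * Returns a heap-allocated array of pointers to the token starts and stores
 * the number of tokens in *count. The caller frees the array only; the
 * tokens themselves live inside str. Returns NULL if allocation fails. */
char **split_spaces(char *str, size_t *count)
{
    assert(str != NULL && count != NULL);
    size_t n = strlen(str);
    *count = 0;

    /* n characters hold at most (n + 1) / 2 tokens, because two tokens need
     * at least one space between them. The extra slot avoids a zero-size
     * request when n is 0. */
    char **tokens = calloc((n + 1) / 2 + 1, sizeof *tokens);
    if (tokens == NULL) {
        return NULL;
    }

    /* The first character starts a token if it is non-blank. */
    if (n > 0 && str[0] != ' ') {
        tokens[(*count)++] = &str[0];
    }

    for (size_t j = 1; j < n; ++j) {
        if (str[j - 1] == ' ') {
            /* A non-blank preceded by a blank starts a new token. */
            if (str[j] != ' ') {
                tokens[(*count)++] = &str[j];
            }
            /* str[j - 1] is never inspected again, so it can now be
             * overwritten to terminate the previous token. */
            str[j - 1] = '\0';
        }
    }

    /* The loop lags one character behind, so handle a trailing space. */
    if (n > 0 && str[n - 1] == ' ') {
        str[n - 1] = '\0';
    }
    return tokens;
}
```

Overwriting str[j - 1] only after it has been tested keeps the "preceded by a blank" check valid without an auxiliary variable such as temp. The trade-off is that the array may be larger than strictly needed, in exchange for skipping the counting loop.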
Previous remarks:
int *ptr = declares a pointer to int, which you are using as an array of int. For an array of pointers to char, you want
char** ptr = (char**) calloc(spcCount+2, sizeof(char*));
The comment prior to that line also seems to indicate some confusion. There is no terminating null in your array of pointers, and you don't need to allocate space for one, so possibly spcCount+2 could be spcCount + 1.
This also is suspect:
while(ptr[k] != '\0')
It looks like it would work, given the way you used calloc (you do need spcCount+2 to make this work), but I would feel more secure writing something like this:
for (k = 0; k < ptrIndex; ++k)
I do not think that is what caused the segfault; it just makes me a little uneasy to compare a pointer (ptr[k]) with '\0' (which you would normally compare against a char).
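If a sentinel-terminated pointer array is preferred over carrying a count, the comparison reads more clearly against NULL. The comparison with '\0' compiles only because a character constant with value 0 is an integer constant expression equal to zero, which C accepts as a null pointer constant. The sketch below uses a hypothetical helper name, count_tokens, and assumes the caller has placed an explicit NULL after the last token.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the entries in an array of char pointers
 * whose last element is a NULL pointer. */
size_t count_tokens(char *const *tokens)
{
    assert(tokens != NULL);
    size_t k = 0;
    while (tokens[k] != NULL) {   /* compare a pointer with NULL, not '\0' */
        k++;
    }
    return k;
}
```

With this convention the array needs one slot more than the number of tokens, which is the role the extra element in spcCount+2 plays in the original code.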
Source: https://stackoverflow.com/questions/25752442/string-tokenizer-without-using-strtok