I have read a few lines of text into an array of C-strings. The lines have an arbitrary number of tab or space-delimited columns, and I am trying to figure out how to remove
If I may voice the "you're doing it wrong" opinion: why not eliminate the whitespace while reading? Use fscanf(fp, "%s", string); (where fp is your input FILE *)
to read a "word" (a run of non-whitespace characters), then read the whitespace that follows it yourself. If it's spaces or tabs, keep appending to the current "line" of data. If it's a newline, start a new entry. In C it's probably easiest to get the data into a format you can work with as soon as possible, rather than trying to do heavy-duty text manipulation afterwards.
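A minimal sketch of that idea, under some assumptions of mine: the helper name squeeze_stream, the 255-character word limit, and writing the result to a second stream are all illustrative and not part of the answer above. Because %s silently skips leading whitespace (newlines included), the sketch consumes the whitespace with getc() first so that line breaks are still noticed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: copies `in` to `out`, joining the words of each
   line with a single space. Words longer than 255 characters are read
   in pieces, so they end up split by a space. */
void squeeze_stream(FILE *in, FILE *out)
{
    char word[256];
    int c;
    int first = 1;  /* no word written on the current line yet */

    assert(in != NULL && out != NULL);
    for (;;) {
        /* Consume whitespace by hand so that newlines are seen. */
        while ((c = getc(in)) != EOF && isspace(c)) {
            if (c == '\n') {
                putc('\n', out);
                first = 1;
            }
        }
        if (c == EOF)
            break;
        ungetc(c, in);  /* first character of the next word */

        if (fscanf(in, "%255s", word) != 1)
            break;
        if (!first)
            putc(' ', out);
        fputs(word, out);
        first = 0;
    }
}
```

Calling squeeze_stream(stdin, stdout) would filter standard input this way. Leading and trailing whitespace on each line is dropped, and blank lines are kept as empty lines.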
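Why not use strtok()
directly? There is no need to squeeze the input first.
All you need to do is call strtok() repeatedly, with space and tab as the delimiters,
until you get 3 non-space tokens, and then you are done! Keep in mind that strtok() itself overwrites the delimiters in the buffer it scans with '\0'.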
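As a rough sketch of the strtok() approach: the helper name split_columns and its signature are my own, chosen for illustration. It stops after `max` columns, which matches the "3 tokens" case above when called with max set to 3.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits `line` in place on whitespace and stores
   up to `max` column pointers in `cols`. Returns the number of columns
   stored. strtok() writes '\0' over delimiters inside `line`, and it
   keeps hidden state, so it is not safe to interleave two scans. */
size_t split_columns(char *line, char *cols[], size_t max)
{
    const char *delims = " \t\r\n";
    size_t n = 0;
    char *tok;

    assert(line != NULL && cols != NULL);
    for (tok = strtok(line, delims);
         tok != NULL && n < max;
         tok = strtok(NULL, delims))
        cols[n++] = tok;
    return n;
}
```

For example, with char line[] = "  alpha\t\tbeta   gamma\tdelta\n"; and a three-element cols array, split_columns(line, cols, 3) returns 3 and leaves cols pointing at "alpha", "beta" and "gamma". Consecutive delimiters are treated as one, so no empty columns are produced.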
The following code modifies the string in place; if you don't want to destroy your original input, you can pass a second buffer to receive the modified string. Should be fairly self-explanatory:
#include <ctype.h>   /* isspace, iscntrl */
#include <stdio.h>
#include <string.h>

char *squeeze(char *str)
{
    int r; /* next character to be read */
    int w; /* next character to be written */

    r=w=0;
    while (str[r])
    {
        /* cast: passing a negative char to the <ctype.h> functions is undefined */
        if (isspace((unsigned char)str[r]) || iscntrl((unsigned char)str[r]))
        {
            /* emit one space per run, and none at the start of the string */
            if (w > 0 && !isspace((unsigned char)str[w-1]))
                str[w++] = ' ';
        }
        else
            str[w++] = str[r];
        r++;
    }
    str[w] = 0;
    return str;
}

int main(void)
{
    char test[] = "\t\nThis\nis\ta\b test.";
    printf("test = %s\n", test);
    printf("squeeze(test) = %s\n", squeeze(test));
    return 0;
}
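The second-buffer variant mentioned above could look roughly like this. The name squeeze_copy, the explicit size parameter, and the choice to truncate when the destination is too small are assumptions of mine rather than part of the original answer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical non-destructive variant: writes a squeezed copy of `src`
   into `dst`, which holds `dstsize` bytes including the terminator.
   The output is truncated if `dst` is too small. Returns `dst`. */
char *squeeze_copy(char *dst, size_t dstsize, const char *src)
{
    size_t w = 0;  /* next position to be written in dst */

    assert(dst != NULL && src != NULL && dstsize > 0);
    for (; *src != '\0' && w + 1 < dstsize; src++) {
        unsigned char c = (unsigned char)*src;

        if (isspace(c) || iscntrl(c)) {
            /* one space per run, and none at the start */
            if (w > 0 && dst[w - 1] != ' ')
                dst[w++] = ' ';
        } else {
            dst[w++] = (char)c;
        }
    }
    dst[w] = '\0';
    return dst;
}
```

With a 32-byte buffer out, squeeze_copy(out, sizeof out, "\t\nThis\nis\ta\b test.") leaves "This is a test." in out, and the input string is left untouched. As in the in-place version, a trailing run of whitespace still leaves one space at the end.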
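#include <string.h>

/* Removes every space from str_base in place (it does not squeeze runs down to one). */
char* trimwhitespace(char *str_base) {
    char* buffer = str_base;
    while((buffer = strchr(buffer, ' '))) {
        /* memmove, not strcpy: source and destination overlap */
        memmove(buffer, buffer+1, strlen(buffer+1)+1);
    }
    return str_base;
}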
I made a small improvement over John Bode's version so that it removes trailing whitespace as well:
#include <ctype.h>

char *squeeze(char *str)
{
    char* r; /* next character to be read */
    char* w; /* next character to be written */
    char c;
    int sp, sp_old = 0;

    r=w=str;
    do {
        c=*r;
        /* cast: passing a negative char to isspace() is undefined */
        sp = isspace((unsigned char)c);
        if (!sp) {
            if (sp_old && c) {
                // don't add a space at end of string
                *w++ = ' ';
            }
            *w++ = c;
        }
        if (str < w) {
            // don't add space at start of line
            sp_old = sp;
        }
        r++;
    }
    while (c);
    return str;
}

#include <stdio.h>

int main(void)
{
    char test[] = "\t\nThis\nis\ta\f test.\n\t\n";
    //printf("test = %s\n", test);
    printf("squeeze(test) = '%s'\n", squeeze(test));
    return 0;
}
Edit: I originally had a malloc'ed workspace, which I thought might be clearer. However, doing it without extra memory is almost as simple, and I'm being pushed that way in comments and personal IMs, so here it comes... :-)
void squeezespaces(char* row, char separator) {
    char *current = row;
    int spacing = 0;
    int i;

    for(i=0; row[i]; ++i) {
        if(row[i]==' ') {
            if (!spacing) {
                /* start of a run of spaces -> separator */
                *current++ = separator;
                spacing = 1;
            }
        } else {
            *current++ = row[i];
            spacing = 0;
        }
    }
    *current = 0;
}