Since the Standard C committee did not standardize a simple replacement for gets(), what should it be?

…衆ロ難τιáo~ submitted on 2019-12-18 18:47:32

Question


The gets function was first deprecated in C99 and finally removed in C11. Yet there is no direct replacement for it in the C library.

fgets() is not a drop-in replacement because it does not strip the final '\n', which may be absent at the end of file. Many programmers get it wrong too.

There is a one-liner to remove the linefeed: buf[strcspn(buf, "\n")] = '\0';, but it is non-trivial and often calls for an explanation. It may be inefficient as well.
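As a short illustration of why the one-liner works: strcspn(buf, "\n") returns the index of the first '\n', or the index of the terminating '\0' if there is none, so the assignment either overwrites the newline or harmlessly rewrites the terminator. A minimal sketch, where the helper name strip_newline is made up for this example:

```c
#include <string.h>

/* Hypothetical helper (not a standard function): remove the first '\n'
 * in buf, if any. strcspn() yields the length of the initial segment
 * that contains no '\n', which is the index of the newline or of the
 * terminating '\0'. */
void strip_newline(char *buf) {
    buf[strcspn(buf, "\n")] = '\0';
}
```

It works on an empty string as well, which is where the common buf[strlen(buf) - 1] idiom goes wrong.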

This is counter-productive. Many beginners still use gets() because their teachers are lame or their tutorials obsolete.

Microsoft proposed gets_s() and many related functions, but gets_s() does not silently truncate overlong lines, and its behavior on this constraint violation is not exactly simple.

Both BSD and the GNU libc have getline, standardized in POSIX, that allocates or reallocates a buffer via realloc...
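For reference, a minimal sketch of reading one line with POSIX getline(), assuming a POSIX.1-2008 system. The wrapper name read_line_alloc is invented for this example, and the caller is assumed to free() the result:

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX.1-2008 declarations */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical wrapper: return the next line from stream as a freshly
 * allocated string without its trailing newline, or NULL on EOF or
 * error. The caller owns the result and must free() it. */
char *read_line_alloc(FILE *stream) {
    char *line = NULL;   /* getline() allocates when given NULL and 0 */
    size_t cap = 0;
    ssize_t len = getline(&line, &cap, stream);
    if (len < 0) {
        free(line);      /* the buffer may be allocated even on failure */
        return NULL;
    }
    if (len > 0 && line[len - 1] == '\n') {
        line[len - 1] = '\0';
    }
    return line;
}
```

This avoids the fixed-size buffer entirely, at the cost of portability outside POSIX and of explaining dynamic allocation to beginners.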

What is the best way to teach beginners about this mess?


Answer 1:


The nature of the question is such that there is going to be speculation and opinion. But we can find some information in the C99 rationale and the C11 standard.

The C99 rationale, from when gets() was deprecated, gives the following reasoning:

Because gets does not check for buffer overrun, it is generally unsafe to use when its input is not under the programmer’s control. This has caused some to question whether it should appear in the Standard at all. The Committee decided that gets was useful and convenient in those special circumstances when the programmer does have adequate control over the input, and as longstanding existing practice, it needed a standard specification. In general, however, the preferred function is fgets (see §7.19.7.2).

I don't think gets_s() can be considered an alternative either, because gets_s() is an optional interface (it belongs to Annex K). C11 actually recommends fgets() over gets_s():

§K.3.5.4.1, C11 draft

The fgets function allows properly-written programs to safely process input lines too long to store in the result array. In general this requires that callers of fgets pay attention to the presence or absence of a new-line character in the result array. Consider using fgets (along with any needed processing based on new-line characters) instead of gets_s.

So that leaves us with fgets() as the only real replacement for gets() in ISO C. fgets() is equivalent to gets() except that it takes a buffer size and a stream, and it stores the newline if there is buffer space. So is it worth introducing a new interface that offers only a minor improvement over a longstanding and widely used one (fgets())? IMO, no.

Besides, a lot of real-world applications are not restricted to ISO C alone. So there's an opportunity to use extensions such as POSIX getline() as a replacement.

If it becomes necessary to write a solution within ISO C, then it's quite easy to write a wrapper around fgets() anyway, such as a my_fgets() that would remove the newline, if present.
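A minimal sketch of what such a wrapper might look like. The signature simply mirrors fgets(), which is an assumption on my part, and an overlong line is left for the next call, exactly as with plain fgets():

```c
#include <stdio.h>
#include <string.h>

/* Sketch of a my_fgets() wrapper: same arguments and return value as
 * fgets(), but the trailing newline, if one was stored, is removed.
 * If the line did not fit, no newline is stored and the rest of the
 * line remains in the stream. */
char *my_fgets(char *buf, int size, FILE *stream) {
    if (fgets(buf, size, stream) == NULL) {
        return NULL;                 /* EOF or read error */
    }
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';         /* drop the newline */
    }
    return buf;
}
```

A typical call would be my_fgets(line, sizeof line, stdin) where line is a char array.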

Of course, teaching fgets() to newcomers involves explaining the potential newline issue. But IMO, it's not that hard to understand, and someone intending to learn C should be able to grasp it quickly. The task (finding the last character and replacing it if it is the character "X") could even be considered a good exercise for a beginner.

So in light of the above stated reasons, I would say there's no overwhelming necessity for a new function in ISO C as a true replacement for gets().




Answer 2:


This question largely calls for speculation short of a citation from committee minutes or something, but as a general principle, the committee (WG14) generally avoids inventing new interfaces and prefers to document and make rigorous existing practice (things like snprintf, long long, the inttypes.h types, etc.) and sometimes adopt from other standards/interface definitions outside of C (e.g. complex math from IEEE floating point, atomic model from C++, etc.). gets has no such replacement to adopt, probably because fgets is generally considered superior (it's non-lossy when the file ends without a newline). If you really want a direct replacement, something like this works:

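char buf[100] = "";        // stays empty if the line is empty
scanf("%99[^\n]", buf);    // read up to 99 non-newline characters
scanf("%*1[\n]");          // separate call so the newline is consumed even after an empty line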

Of course it's clunky to use, especially when the buffer size is variable.
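For the variable-size case, the field width has to be baked into the format string at run time, since scanf has no '*' width argument. A rough sketch, where the helper name scan_line and its return convention (0 on success, EOF at end of input) are made up for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: read one line from stream into buf (capacity
 * size, at least 2) using the scanf approach. Returns 0 when a line,
 * possibly empty, was read and EOF at end of input or on error.
 * Characters beyond size - 1 stay in the stream. */
int scan_line(char *buf, size_t size, FILE *stream) {
    char fmt[32];
    if (size < 2) {
        return EOF;              /* a scanf field width must be nonzero */
    }
    buf[0] = '\0';               /* result for an empty line */
    /* Builds e.g. "%99[^\n]" for size == 100. */
    snprintf(fmt, sizeof fmt, "%%%zu[^\n]", size - 1);
    if (fscanf(stream, fmt, buf) == EOF) {
        return EOF;
    }
    fscanf(stream, "%*1[\n]");   /* consume one newline, if present */
    return 0;
}
```

The newline is consumed by a second call because a matching failure on an empty line would otherwise stop a combined format before it reached the newline.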




Answer 3:


IMO, any replacement would need to pass the size as well as the char * destination, necessitating code changes that depend significantly on the individual case. A one-size-fits-all solution was not deemed possible, as the size is often lost or not passed along by the time code gets to gets(). Given that we had a 12-year warning (C99 to C11), I suspect the committee felt the problem would be gone by 2011.

Ha!

The Standard C committee should have made a replacement that also passed in the size of the destination, like the following (the name likely has a collision issue):

char *gets_replacement(char *s, size_t size);

I attempted an fgets()-based replacement that takes advantage of a VLA (optional in C11):

#include <stdio.h>
#include <string.h>

char *my_gets(char *dest, size_t size) {
  // +2: one for \n and one to detect overrun
  char buf[size + 2];

  if (fgets(buf, sizeof buf, stdin) == NULL) {
    // EOF or read error: leave dest as an empty string
    if (size > 0) {
      *dest = '\0';
    }
    return NULL;
  }
  size_t len = strlen(buf);
  if (len > 0 && buf[len - 1] == '\n') {
    buf[--len] = '\0';
  }

  // If input would have overrun the original gets()
  if (len >= size) {
    // or call an error handler; the rest of an overlong line stays in stdin
    if (size > 0) {
      *dest = '\0';
    }
    return NULL;
  }
  return memcpy(dest, buf, len + 1);
}
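If silent truncation is what you want instead of an error, the VLA can be avoided by reading straight into the destination and then discarding whatever is left of the line. A sketch under that assumption, where the name read_line_truncate and the extra stream parameter are my own choices for illustration:

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant: read one line from stream into dest (capacity
 * size, at least 2), silently truncating an overlong line and throwing
 * away the characters that did not fit. Returns dest, or NULL on EOF,
 * read error, or an unusable size. */
char *read_line_truncate(char *dest, size_t size, FILE *stream) {
    if (size < 2 || size > INT_MAX) {
        return NULL;                 /* fgets() takes an int count */
    }
    if (fgets(dest, (int)size, stream) == NULL) {
        return NULL;
    }
    size_t len = strcspn(dest, "\n");
    if (dest[len] == '\n') {
        dest[len] = '\0';            /* the whole line fit */
    } else {
        int c;                       /* skip the rest of the line */
        while ((c = getc(stream)) != EOF && c != '\n') {
            /* discard */
        }
    }
    return dest;
}
```

The next call then starts at the beginning of the following line, so a caller never sees the tail of an overlong line as a separate input.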


Source: https://stackoverflow.com/questions/34031129/since-the-standard-c-committee-did-not-standardize-a-simple-replacement-for-gets
