ungetc: number of bytes of pushback

闹比i 2021-01-13 17:49

ungetc is only guaranteed to take one byte of pushback. On the other hand, I've tested it on Windows and Linux and it seems to work with two bytes.

Are there any pl

3 answers
  • 2021-01-13 18:09

    The C99 standard (and the C89 standard before that) said unequivocally:

    One character of pushback is guaranteed. If the ungetc function is called too many times on the same stream without an intervening read or file positioning operation on that stream, the operation may fail.

    So, to be portable, you do not assume more than one character of pushback.
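As an illustration of staying within that guarantee, here is a minimal sketch of a one-character lookahead. The helper name `peek_char` is hypothetical (it is not part of stdio); it calls `ungetc` exactly once after each successful read, so it never relies on more than the single guaranteed character of pushback.

```c
#include <stdio.h>

/* Hypothetical helper (not a standard function): return the next character
 * on fp without consuming it, or EOF at end of input or on a read error. */
int peek_char(FILE *fp)
{
    int c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);   /* single pushback right after a read */
    return c;
}
```

Calling `peek_char` repeatedly is fine, because each call performs a read before its one pushback. What is not portable is calling `ungetc` a second time without an intervening read.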

    Having said that, on both Mac OS X 10.7.2 (Lion) and RHEL 5 (Linux, x86/64), I tried:

    #include <stdio.h>
    int main(void)
    {
        int i;
        for (i = 0; i < 4096; i++)
        {
            int c = i % 16 + 64;
            if (ungetc(c, stdin) != c)
            {
                fprintf(stderr, "Error at count = %d\n", i);
                return(1);
            }
        }
        printf("No error up to count = %d\n", i-1);
        return(0);
    }
    

    I got no error on either platform. By contrast, on Solaris 10 (SPARC), I got an error at 'count = 4'. Worse, on HP-UX 11.00 (PA-RISC) and HP-UX 11.23 (Itanium), I got an error at 'count = 1' - belying the theory that 2 is safe. Similarly, AIX 6.0 gave an error at 'count = 1'.

    Summary

    • Linux: at least 4096 (no failure observed)
    • Mac OS X: at least 4096 (no failure observed)
    • Solaris: 4
    • HP-UX: 1
    • AIX: 1

    So, AIX and HP-UX only allow one character of pushback on an input file that has not had any data read on it. This is a nasty case; they might provide much more pushback capacity once some data has been read from the file (but a simple test on AIX adding a getchar() before the loop didn't change the pushback capacity).
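The variation with a read before the loop could be sketched roughly as follows. This is a reconstruction under assumptions, not the exact code that was run on AIX: the function name `pushback_capacity` and its `limit` and `read_first` parameters are invented for illustration, and the loop body is taken from the test program above.

```c
#include <stdio.h>

/* Hypothetical probe: count how many consecutive ungetc() calls succeed on
 * fp, trying at most `limit` times. If read_first is nonzero, one character
 * is read beforehand, mirroring the "getchar() before the loop" variation. */
int pushback_capacity(FILE *fp, int limit, int read_first)
{
    int i;
    if (read_first)
        (void)getc(fp);          /* consume one character before probing */
    for (i = 0; i < limit; i++)
    {
        int c = i % 16 + 64;     /* same test pattern as the program above */
        if (ungetc(c, fp) != c)
            break;               /* first failed pushback ends the probe */
    }
    return i;                    /* number of successful pushbacks */
}
```

A call such as `pushback_capacity(stdin, 4096, 1)` would report the limit for a stream that has already been read from. The pushed-back characters remain on the stream afterwards; a successful `rewind` or `fseek` discards them.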

  • 2021-01-13 18:19

    Implementations which support 2 characters of pushback probably do so in order that scanf can use ungetc for its pushback rather than requiring a second nearly-identical mechanism. What this means for you as the application programmer is that even if calling ungetc twice seems to work, it might not be reliable in all situations -- for example, if the last operation on the stream was fscanf and it had to use pushback, you can probably only ungetc one character.

    In any case, it's nonportable to rely on having more than one character of ungetc pushback, so I would highly advise against writing code that needs it...
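If a parser really does need more than one character of lookahead, one portable option is to keep the extra pushback in the application instead of in stdio. The following is a minimal sketch under assumptions: the names `pb_reader`, `pb_getc`, `pb_ungetc` and the depth `PB_MAX` are all hypothetical, chosen only for illustration.

```c
#include <stdio.h>

#define PB_MAX 8   /* assumed lookahead depth; pick what the parser needs */

/* Hypothetical wrapper: a stream plus an application-owned pushback stack. */
struct pb_reader {
    FILE *fp;
    int   buf[PB_MAX];
    int   n;            /* number of characters currently pushed back */
};

/* Return the most recently pushed-back character, else read from the stream. */
int pb_getc(struct pb_reader *r)
{
    return r->n > 0 ? r->buf[--r->n] : getc(r->fp);
}

/* Push c back; returns c on success, EOF if the stack is full or c is EOF. */
int pb_ungetc(struct pb_reader *r, int c)
{
    if (c == EOF || r->n >= PB_MAX)
        return EOF;
    r->buf[r->n++] = c;
    return c;
}
```

This never calls `ungetc` at all, so it does not depend on the implementation's limit. The trade-off is that direct stdio calls on the same `FILE *` (for example `fscanf`) do not see characters held in the stack, so all reads have to go through `pb_getc`.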

  • 2021-01-13 18:24

    There are some posts here suggesting that it makes sense to support 2 chars for the sake of scanf.

    I don't think this is right: scanf only needs one, and this is indeed the reason for the limit. The original implementation (back in the mid 70s) supported 100, and the manual had a note: in the future we may decide to support only 1, since that's all that scanf needs. See page 3 of the original manual (Maybe not original, but pretty old.)

    To see more vividly that scanf needs only 1 char, consider this code for the %u feature of scanf.

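    // Assumes <ctype.h> and <stdio.h>; reads from stdin.
    int c;
    while (isspace(c = getc(stdin))) {} // skip white space
    unsigned num = 0;
    while (isdigit(c)) {
        num = num*10 + (c - '0');
        c = getc(stdin);
    }
    ungetc(c, stdin); // push back the one character read past the number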
    

    Only a single call to ungetc() is needed here. There is no reason why scanf needs a char all to itself: it can share with the user.
