How to detect if shell failed to execute a command after popen call? Not to confuse with the command exit status

Question

Recently I started writing some tests for my Python scripts. For some awkward reason, the module that runs a Python script and checks its output is written in C, with some other languages mixed in. This is more convenient for me to use for now.

The single test runs with the below code:

 FILE *fd = NULL;

 fd = popen("cmd", "r");
 if(NULL == fd){
  fprintf(stderr, "popen: failed\n");
  return 1;
 }
 fprintf(stderr, "res = %d: %s\n", errno, strerror(errno));

 int res = pclose(fd);
 fprintf(stderr, "res = %d: %s\n", res, strerror(errno));

As you can see, the code just runs a script with the help of popen and checks its exit status. But one day I ran into a situation where popen was given a wrong argument. Something like this happened:

fd = popen("python@$#!", "r");

And the test module had returned:

res = 0: Success
sh: 1: python@0!: not found
res = 32512: Success

So popen ran happily despite the mistake, and only pclose returned a non-zero status, with errno being zero. In between, the shell also printed its own error message.

Here is my question: how can I detect that the shell failed to execute a command? The failure could be for any reason, but the main point is that the script never even started.

Answer 1:

General comments about when to use `errno`

No standard C or POSIX library function ever sets errno to zero. Printing an error message based on errno when fd is not NULL is not appropriate: popen() succeeded, so whatever value errno holds was not put there by a popen() failure. Printing res after pclose() is OK, but adding strerror(errno) runs into the same problem (the information in errno may be entirely irrelevant).

You can set errno to zero before calling a function. If the function returns a failure indication, it may be relevant to look at errno (check the specification of the function: is it defined to set errno on failure?). However, errno can be set non-zero by a function even if it succeeds. Solaris standard I/O used to set errno = ENOTTY if the output stream was not connected to a terminal, even though the operation succeeded; it probably still does. That behaviour is perfectly legitimate; it is only legitimate to look at errno if (1) the function reports failure and (2) the function is documented to set errno (by POSIX or by the system manual).

See C11 §7.5 Errors <errno.h> ¶3:

The value of errno in the initial thread is zero at program startup (the initial value of errno in other threads is an indeterminate value), but is never set to zero by any library function.²⁰²⁾ The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use of errno is not documented in the description of the function in this International Standard.

²⁰²⁾ Thus, a program that uses errno for error checking should set it to zero before a library function call, then inspect it before a subsequent library function call. Of course, a library function can save the value of errno on entry and then set it to zero, as long as the original value is restored if errno's value is still zero just before the return.

POSIX is similar (errno):

Many functions provide an error number in errno, which has type int and is defined in <errno.h>. The value of errno shall be defined only after a call to a function for which it is explicitly stated to be set and until it is changed by the next function call or if the application assigns it a value. The value of errno should only be examined when it is indicated to be valid by a function's return value. Applications shall obtain the definition of errno by the inclusion of <errno.h>. No function in this volume of POSIX.1-2017 shall set errno to 0. The setting of errno after a successful call to a function is unspecified unless the description of that function specifies that errno shall not be modified.
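As a minimal sketch of the pattern described in the C11 footnote (zero errno first, then consult it only when the return value says it is meaningful), here is a small parser built on strtol(). The choice of strtol() and the helper name parse_long are assumptions made for illustration; they do not appear in the question. strtol() suits the example because it is documented to set errno to ERANGE on overflow, and its return value alone cannot distinguish overflow from a genuine LONG_MAX.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal long.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int parse_long(const char *text, long *out)
{
    char *end;

    errno = 0;  /* no library function will reset errno for us */
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return -1;  /* no digits or trailing junk; errno is not consulted */

    /* The return value hints at overflow; only then is errno meaningful. */
    if ((value == LONG_MAX || value == LONG_MIN) && errno == ERANGE)
        return -1;

    *out = value;
    return 0;
}
```

Calling parse_long("42", &v) should store 42 in v and return 0, while an out-of-range digit string should return -1. Note that errno is never printed or tested on the success path.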

`popen()` and `pclose()`

The POSIX specification for popen() is not dreadfully helpful. There's only one circumstance under which popen() 'must fail'; everything else is 'may fail'.

However, the details for pclose() are much more helpful, including:

If the command language interpreter cannot be executed, the child termination status returned by pclose() shall be as if the command language interpreter terminated using exit(127) or _exit(127).

and

Upon successful return, pclose() shall return the termination status of the command language interpreter. Otherwise, pclose() shall return -1 and set errno to indicate the error.

That means that pclose() returns the value it received from waitpid(), which encodes the exit status of the command that was invoked. Note that it must use waitpid() (or an equivalently selective function; hunt for wait3() and wait4() on BSD systems); it is not authorized to wait for any child process other than the one created by popen() for this file stream. POSIX also requires that pclose() make sure the child has exited, even if some other function waited on the dead child in the interim and thereby caused the system to lose the status for the child created by popen().

If you convert decimal 32512 to hexadecimal, you get 0x7F00. And if you used the WIFEXITED and WEXITSTATUS macros from <sys/wait.h> on that, you'd find that the exit status is 127 (because 0x7F is 127 decimal, and on typical systems the exit status is encoded in the high-order byte of the 16-bit status returned by waitpid()).

int res = pclose(fd);

if (WIFEXITED(res))
    printf("Command exited with status %d (0x%.4X)\n", WEXITSTATUS(res), res);
else if (WIFSIGNALED(res))
    printf("Command exited from signal %d (0x%.4X)\n", WTERMSIG(res), res);
else
    printf("Command exited with unrecognized status 0x%.4X\n", res);

And remember that 0 is the exit status indicating success; anything else normally indicates an error of some sort. You can further analyze the exit status to look for 127 or relayed signals, etc. It's unlikely you'd get a 'signalled' status, or an unrecognized status.
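Putting these pieces together, a sketch of a helper that runs a command through popen() and reports what the shell returned might look like the following. The function name run_command_status and its return-value convention (-1 and -2 for the non-exit cases) are assumptions made for illustration, not part of the question's test module.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen() and pclose() */
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper: run cmd via popen() and discard its output.
 * Returns the shell's exit code (0 to 255), -1 if popen() or pclose()
 * itself failed, or -2 if the shell did not exit normally. */
int run_command_status(const char *cmd)
{
    FILE *fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;  /* the pipe or the child could not be created */

    char buffer[256];
    while (fgets(buffer, sizeof(buffer), fp) != NULL)
        ;  /* drain the output so the child never blocks on a full pipe */

    int status = pclose(fp);
    if (status == -1)
        return -1;  /* pclose() failed; only here would errno be relevant */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);  /* 127 means the shell could not run the command */
    return -2;  /* terminated by a signal, or an unrecognized status */
}
```

A caller could then treat a return value of 127 as "the script never started", for example with `if (run_command_status("python@$#!") == 127)`, subject to the ambiguity discussed next.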

`pclose()` told you that the child failed.

Of course, it is possible that the executed command actually exited itself with status 127; that's unavoidably confusing, and the only way around it is to avoid exit statuses in the range 126 to 128 + 'maximum signal number' (which might mean 126 .. 191 if there are 63 recognized signals). The value 126 is also used by POSIX to report when the interpreter specified in a shebang (#!/usr/bin/interpreter) is missing (as opposed to the program to be executed not being available). Whether that's returned by pclose() is a separate discussion. And the signal reporting is done by the shell because there's no (easy) way to report that a child died from a signal otherwise.
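The conventions above can be summarized in a small classifier. This is only a sketch of the usual shell conventions as described here; the enum, its member names, and the function classify_exit_code are hypothetical, and a command that deliberately exits with 126, 127, or a value above 128 would be misclassified, which is exactly the ambiguity just mentioned.

```c
/* Hypothetical classification of an exit code already extracted
 * with WEXITSTATUS(); based on shell conventions, not a guarantee. */
enum exit_kind {
    KIND_SUCCESS,          /* 0 */
    KIND_COMMAND_FAILED,   /* 1 to 125, or 128: the command's own failure code */
    KIND_NOT_EXECUTABLE,   /* 126: found but could not be executed */
    KIND_NOT_FOUND,        /* 127: command not found, or shell could not run */
    KIND_SIGNAL_RELAYED    /* above 128: shell reporting 128 + signal number */
};

enum exit_kind classify_exit_code(int code)
{
    if (code == 0)
        return KIND_SUCCESS;
    if (code == 126)
        return KIND_NOT_EXECUTABLE;
    if (code == 127)
        return KIND_NOT_FOUND;
    if (code > 128)
        return KIND_SIGNAL_RELAYED;  /* signal number is probably code - 128 */
    return KIND_COMMAND_FAILED;
}
```

For the status in the question, classify_exit_code(WEXITSTATUS(32512)) would yield KIND_NOT_FOUND on a system that uses the typical status encoding.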

Source: https://stackoverflow.com/questions/60013021/how-to-detect-if-shell-failed-to-execute-a-command-after-popen-call-not-to-conf

Tags

Linux

shell

popen
