Question
I recently started writing tests for my Python scripts. For some awkward reason, the module that runs a Python script and checks its output is written in C, with some other languages mixed in. This is the most convenient setup for me for now.
A single test runs with the code below:
FILE *fd = NULL;
fd = popen("cmd", "r");
if(NULL == fd){
    fprintf(stderr, "popen: failed\n");
    return 1;
}
fprintf(stderr, "res = %d: %s\n", errno, strerror(errno));
int res = pclose(fd);
fprintf(stderr, "res = %d: %s\n", res, strerror(errno));
As you can see, the code just runs a script with the help of popen and checks its exit status. But one day I ran into a situation where popen was given the wrong arguments. Something like this happened:
fd = popen("python@$#!", "r");
And the test module printed:
res = 0: Success
sh: 1: python@0!: not found
res = 32512: Success
So popen ran happily despite the mistake, and only pclose returned a non-zero exit status, with errno still being zero. In between, the shell also printed its own message.
Here is my question. How can I detect that the shell failed to execute a command? The failure could be for any reason, but the main point is that the script did not even start.
Answer 1:
General comments about when to use errno
No standard C or POSIX library function ever sets errno to zero. Printing an error message based on errno when fd is not NULL is not appropriate; popen() did not report a failure, so whatever value errno holds was not set by a popen() failure. Printing res after pclose() is OK; adding strerror(errno) runs into the same problem (the information in errno may be entirely irrelevant). You can set errno to zero before calling a function. If the function returns a failure indication, it may be relevant to look at errno (check the specification of the function: is it defined to set errno on failure?). However, errno can be set non-zero by a function even if it succeeds. Solaris standard I/O used to set errno = ENOTTY if the output stream was not connected to a terminal, even though the operation succeeded; it probably still does. Solaris setting errno even on success is perfectly legitimate; it is only legitimate to look at errno if (1) the function reports failure and (2) the function is documented to set errno (by POSIX or by the system manual).
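As an illustration of that rule, here is a minimal sketch using strtol(), which is documented to set errno to ERANGE on overflow. The helper name parse_long() is hypothetical and not part of the question; it clears errno first and only consults it once the return value signals a possible failure.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal long.
 * Returns 0 on success (value stored in *out) and -1 on failure. */
int parse_long(const char *text, long *out)
{
    char *end = NULL;

    errno = 0;                       /* no library function will clear it for us */
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return -1;                   /* no digits or trailing junk; errno is not relevant here */

    /* LONG_MAX/LONG_MIN is strtol()'s failure indication, and strtol() is
     * documented to set ERANGE, so only now is errno worth inspecting. */
    if ((value == LONG_MAX || value == LONG_MIN) && errno == ERANGE)
        return -1;

    *out = value;
    return 0;
}
```

Note that errno is never examined on the success path, where its value would be meaningless.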
See C11 §7.5 Errors <errno.h> ¶3:
The value of errno in the initial thread is zero at program startup (the initial value of errno in other threads is an indeterminate value), but is never set to zero by any library function. [202] The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use of errno is not documented in the description of the function in this International Standard.
[202] Thus, a program that uses errno for error checking should set it to zero before a library function call, then inspect it before a subsequent library function call. Of course, a library function can save the value of errno on entry and then set it to zero, as long as the original value is restored if errno's value is still zero just before the return.
POSIX is similar (errno):
Many functions provide an error number in errno, which has type int and is defined in <errno.h>. The value of errno shall be defined only after a call to a function for which it is explicitly stated to be set and until it is changed by the next function call or if the application assigns it a value. The value of errno should only be examined when it is indicated to be valid by a function's return value. Applications shall obtain the definition of errno by the inclusion of <errno.h>. No function in this volume of POSIX.1-2017 shall set errno to 0. The setting of errno after a successful call to a function is unspecified unless the description of that function specifies that errno shall not be modified.
popen() and pclose()
The POSIX specification for popen() is not dreadfully helpful. There's only one circumstance under which popen() 'must fail'; everything else is 'may fail'.
However, the details for pclose() are much more helpful, including:
If the command language interpreter cannot be executed, the child termination status returned by pclose() shall be as if the command language interpreter terminated using exit(127) or _exit(127).
and
Upon successful return, pclose() shall return the termination status of the command language interpreter. Otherwise, pclose() shall return -1 and set errno to indicate the error.
That means that pclose() returns the value it received from waitpid(): the termination status of the shell that ran the command. Note that it must use waitpid() (or an equivalently selective function; hunt for wait3() and wait4() on BSD systems); it is not authorized to wait for any child process other than the one created by popen() for this file stream. There are also prescriptions that pclose() must be sure that the child has exited, even if some other function waited on the dead child in the interim and thereby caused the system to lose the status for the child created by popen().
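To make the "as if terminated using _exit(127)" wording and the selective wait concrete, here is a rough sketch of the fork/exec/waitpid pattern. It is not how any particular libc implements popen() and pclose(); the helper name run_and_wait() is hypothetical, and it executes a program directly rather than going through a shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with no arguments and return the raw
 * wait status, or -1 if fork() or waitpid() fails. */
int run_and_wait(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execl(path, path, (char *)NULL);
        _exit(127);                  /* reached only if execl() failed */
    }

    int status = 0;
    /* Wait for this specific child only; retry if a signal interrupts the call. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```

Here errno is consulted only after waitpid() has returned -1, which is the failure indication it is documented to pair with errno.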
If you convert decimal 32512 to hexadecimal, you get 0x7F00. And if you used the WIFEXITED and WEXITSTATUS macros from <sys/wait.h> on that, you'd find that the exit status is 127 (because 0x7F is 127 decimal, and on typical systems the exit status is encoded in the high-order byte of the 16-bit status returned by waitpid()).
int res = pclose(fd);
if (WIFEXITED(res))
    printf("Command exited with status %d (0x%.4X)\n", WEXITSTATUS(res), res);
else if (WIFSIGNALED(res))
    printf("Command exited from signal %d (0x%.4X)\n", WTERMSIG(res), res);
else
    printf("Command exited with unrecognized status 0x%.4X\n", res);
And remember that 0 is the exit status indicating success; anything else normally indicates an error of some sort. You can further analyze the exit status to look for 127 or relayed signals, etc. It's unlikely you'd get a 'signalled' status, or an unrecognized status.
pclose() told you that the child failed.
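Putting the pieces together, a test runner could wrap popen() and pclose() in one function and compare the decoded exit status against 127. This is only a sketch: the name run_command() and its -1/-2 return conventions are assumptions made for illustration, and it discards the command's output instead of checking it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper: run `cmd` through popen() and discard its output.
 * Returns the exit status (0..255), -1 if popen() or pclose() failed,
 * or -2 if the shell did not exit normally (e.g. killed by a signal). */
int run_command(const char *cmd)
{
    FILE *fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;

    char buffer[4096];
    while (fread(buffer, 1, sizeof buffer, fp) > 0)
        ;                            /* drain output so the child never blocks on a full pipe */

    int status = pclose(fp);
    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -2;
}
```

With the command from the question, run_command("python@$#!") would be expected to return 127, while the shell still writes its "not found" message to stderr.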
Of course, it is possible that the executed command actually exited itself with status 127; that's unavoidably confusing, and the only way around it is to avoid exit statuses in the range 126 to 128 + 'maximum signal number' (which might mean 126 .. 191 if there are 63 recognized signals). The value 126 is used by POSIX shells to report that a command was found but could not be executed, which can include the case where the interpreter specified in a shebang (#!/usr/bin/interpreter) is missing (as opposed to the program to be executed not being available). Whether that's returned by pclose() is a separate discussion. And the signal reporting is done by the shell because there's no (easy) way to report that a child died from a signal otherwise.
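The conventions above could be captured in a small classifier. The enum and function names below are hypothetical, and the mapping is only a convention followed by common shells: a command that deliberately exits with 126, 127, or a value above 128 will be misclassified, which is exactly the ambiguity described above.

```c
/* Hypothetical classification of a decoded exit status (0..255). */
enum cmd_result {
    CMD_OK,              /* 0: success */
    CMD_FAILED,          /* ordinary non-zero exit */
    CMD_NOT_EXECUTABLE,  /* 126: found but could not be executed */
    CMD_NOT_FOUND,       /* 127: command not found, or shell could not run */
    CMD_SIGNALLED        /* 128 + N: shell relayed death by signal N */
};

enum cmd_result classify_exit_status(int code)
{
    if (code == 0)
        return CMD_OK;
    if (code == 126)
        return CMD_NOT_EXECUTABLE;
    if (code == 127)
        return CMD_NOT_FOUND;
    if (code > 128)
        return CMD_SIGNALLED;
    return CMD_FAILED;
}

/* Signal number relayed by the shell, or 0 if the code does not encode one. */
int signal_from_exit_status(int code)
{
    return code > 128 ? code - 128 : 0;
}
```

For example, an exit status of 137 would be classified as CMD_SIGNALLED with signal number 9.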
Source: https://stackoverflow.com/questions/60013021/how-to-detect-if-shell-failed-to-execute-a-command-after-popen-call-not-to-conf