The question
What is the difference between Cwd::cwd and Cwd::getcwd in Perl, generally, without regard to any specific platform? Why does Perl have both? What is the intended use, and which one should I use in which scenarios? (Example use cases will be appreciated.) Does it matter? (Assuming I don’t mix them.) Does the choice of either one affect portability in any way? Which one is more commonly used in modules?
Even if I interpret the manual as saying that, except for corner cases, cwd is `pwd` and getcwd just calls getcwd from unistd.h, what is the actual difference? This works only on POSIX systems, anyway.
I can always read the implementation but that tells me nothing about the meaning of those functions. Implementation details may change, not so defined meaning. (Otherwise a breaking change occurs, which is serious business.)
What does the manual say
Quoting Perl’s Cwd module manpage:
Each of these functions are called without arguments and return the absolute path of the current working directory.
getcwd
my $cwd = getcwd();
Returns the current working directory.
Exposes the POSIX function getcwd(3) or re-implements it if it's not available.
cwd
my $cwd = cwd();
The cwd() is the most natural form for the current architecture. For most systems it is identical to `pwd` (but without the trailing line terminator).
And in the Notes section:
- Actually, on Mac OS, the getcwd(), fastgetcwd() and fastcwd() functions are all aliases for the cwd() function, which, on Mac OS, calls `pwd`. Likewise, the abs_path() function is an alias for fast_abs_path()
OK, I know that on Mac OS [1] there is no difference between getcwd() and cwd(), as both actually boil down to `pwd`. But what about other platforms? (I’m especially interested in Debian Linux.)
[1] Classic Mac OS, not OS X. The $^O values are MacOS and darwin for Mac OS and OS X, respectively. Thanks, @tobyink and @ikegami.
And a little meta-question: How can I avoid asking similar questions about other modules with very similar functions? Is there a universal way of discovering the difference, other than digging through the implementation? (Currently, I think that if the documentation is not clear about intended use and differences, I have to ask someone more experienced or read the implementation myself.)
Generally speaking
I think the idea is that cwd() always resolves to the external, OS-specific way of getting the current working directory. That is, running pwd on Linux, command /c cd on DOS, /usr/bin/fullpath -t in QNX, and so on (all examples are taken from the actual Cwd.pm). getcwd() is supposed to use the POSIX system call if it is available, and to fall back to cwd() if not.
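From the caller's side the two functions are interchangeable: both take no arguments and return an absolute path. A minimal sketch (the variable names are only illustrative) that calls both so the results can be compared on a given system:

```perl
use strict;
use warnings;
use Cwd qw(cwd getcwd);

# Both functions take no arguments and return the current working directory.
my $from_cwd    = cwd();     # platform's "natural" method (often runs pwd)
my $from_getcwd = getcwd();  # POSIX getcwd(3), or a re-implementation of it

print "cwd():    $from_cwd\n";
print "getcwd(): $from_getcwd\n";
```

The interesting part is not the output but how each value was obtained, which is what the rest of this answer looks at.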
Why do we have both? In the current implementation, I believe exporting just getcwd() would be enough for most systems, but who knows whether the logic of “if the syscall is available, use it, else run cwd()” can fail on some system (e.g. on MorphOS in Perl 5.6.1).
On Linux
On Linux, cwd() will run `/bin/pwd` (it actually executes the binary and reads its output), while getcwd() will issue the getcwd(2) system call.
Actual effect inspected via strace
One can use strace(1) to see that in action:
Using cwd():

    $ strace -f perl -MCwd -e 'cwd(); ' 2>&1 | grep execve
    execve("/usr/bin/perl", ["perl", "-MCwd", "-e", "cwd(); "], [/* 27 vars */]) = 0
    [pid 31276] execve("/bin/pwd", ["/bin/pwd"], [/* 27 vars */] <unfinished ...>
    [pid 31276] <... execve resumed> ) = 0

Using getcwd():

    $ strace -f perl -MCwd -e 'getcwd(); ' 2>&1 | grep execve
    execve("/usr/bin/perl", ["perl", "-MCwd", "-e", "getcwd(); "], [/* 27 vars */]) = 0
Reading Cwd.pm source
You can take a look at the sources (Cwd.pm, e.g. on CPAN) and see that on Linux the cwd() call is mapped to _backtick_pwd which, as the name suggests, runs pwd in backticks.
Here is a snippet from Cwd.pm, with my comments:

    unless ($METHOD_MAP{$^O}{cwd} or defined &cwd) {
        ...
        # some logic to find the pwd binary here, $found_pwd_cmd is set to 1 on Linux
        ...
        if( $os eq 'MacOS' || $found_pwd_cmd ) {
            *cwd = \&_backtick_pwd;  # on Linux we actually go here
        } else {
            *cwd = \&getcwd;
        }
    }

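Because the mapping is done by glob assignment, it can also be inspected at run time without reading the source. The sketch below is one possible way, using the core B module to ask each code reference for the name it was originally compiled under. The %impl hash is just an illustrative variable, and the result depends on the platform and the Cwd version, so no particular output is claimed here.

```perl
use strict;
use warnings;
use B   ();
use Cwd ();

# Map each public function name to the sub it is actually bound to.
my %impl;
for my $func (qw(cwd getcwd)) {
    my $code = Cwd->can($func);              # code ref behind Cwd::$func
    my $gv   = B::svref_2object($code)->GV;  # glob the sub was defined under
    $impl{$func} = join '::', $gv->STASH->NAME, $gv->NAME;
    print "Cwd::$func is implemented by $impl{$func}\n";
}
```

If cwd was aliased as in the snippet above, the reported name for it would be the private helper rather than cwd itself.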
Performance benchmark
Finally, the practical difference between the two is that cwd(), which runs another binary, must be slower. We can run a rough performance test:

    $ time perl -MCwd -e 'for (1..10000) { cwd(); }'
    real 0m7.177s
    user 0m0.380s
    sys  0m1.440s

Now compare it with the system call:

    $ time perl -MCwd -e 'for (1..10000) { getcwd(); }'
    real 0m0.018s
    user 0m0.009s
    sys  0m0.008s

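The same comparison can be made inside a single Perl process, which leaves interpreter start-up out of the measurement. This is only a rough sketch: time_calls is a hypothetical helper, the iteration count is arbitrary, and the figures will vary from machine to machine.

```perl
use strict;
use warnings;
use Cwd qw(cwd getcwd);
use Time::HiRes qw(time);

# Hypothetical helper: wall-clock seconds taken by $n calls of $code.
sub time_calls {
    my ($code, $n) = @_;
    my $start = time;
    $code->() for 1 .. $n;
    return time - $start;
}

my $n = 1000;    # arbitrary; raise it for steadier numbers
my %elapsed = (
    cwd    => time_calls(\&cwd,    $n),
    getcwd => time_calls(\&getcwd, $n),
);

printf "%-6s %d calls: %.3f s\n", $_, $n, $elapsed{$_} for sort keys %elapsed;
```

Wall-clock time is measured on purpose: when cwd() spawns a child process, most of the cost is spent outside the Perl process's own CPU time.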
Discussion, choice
But as you don't usually query the current working directory very often, both options will work, unless you cannot spawn any more processes for some reason, such as a ulimit restriction, an out-of-memory situation, etc.
Finally, as for selecting which one to use: on Linux, I would always use getcwd(). I suppose you will need to run your own tests and select which function to use if you are going to write a portable piece of code that will run on some really strange platform (here, of course, Linux, OS X, and Windows are not in the list of strange platforms).