I have a script which I use to retrieve specific files via SFTP on a regular basis. On occasion, the script will error out with the following output:
Traceba
The error message "IOError: size mismatch in get! 950272 != 1018742" is raised by the get function of the Paramiko library when the size of the copied file in the local directory does not match the prefetched size of the remote file:
    with open(localpath, "wb") as fl:
        size = self.getfo(remotepath, fl, callback)
    s = os.stat(localpath)
    if s.st_size != size:
        raise IOError(
            "size mismatch in get! {} != {}".format(s.st_size, size)
        )
Why does this happen when there is no problem with the connection or the transfer process?
While reading the Paramiko code and debugging this issue, I noticed a strange behaviour of my local file system: for every file copied from the remote file system, the local file system took some time before it registered the correct file size.
This behaviour leads me to assume that, while the get function of the Paramiko library processes the file correctly, it does not wait for the local file system to catch up. It may therefore read the status (including the size) of the local file with s = os.stat(localpath) right after the getfo function has finished writing it.
This could lead to a discrepancy between the local file size and the correctly prefetched remote file size, which would in turn raise the IOError "size mismatch in get! {} != {}".format(s.st_size, size).
It would also explain why the error cannot be reproduced consistently: the timing between the Python interpreter and the local operating system differs from run to run, so the size check only sometimes runs too early.
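To make the distinction between the two size sources concrete, here is a minimal sketch. It does not reproduce the delay described above (that depends on the local file system); it only illustrates that fl.tell() asks the Python file object while os.stat() asks the operating system, and that the two can disagree at a given moment, for example while written data is still sitting in Python's write buffer. The helper name compare_size_sources is made up for this illustration.

```python
import os
import tempfile


def compare_size_sources(payload):
    """Write payload to a temp file and record the size from both sources.

    Hypothetical helper for illustration only; not part of Paramiko.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as fl:
            fl.write(payload)
            # Size as seen by the Python file object (includes buffered data).
            tell_size = fl.tell()
            # Size as seen by the operating system at this very moment.
            stat_while_open = os.stat(path).st_size
        # Size as seen by the operating system after the file was closed.
        stat_after_close = os.stat(path).st_size
    finally:
        os.remove(path)
    return {
        "tell": tell_size,
        "stat_while_open": stat_while_open,
        "stat_after_close": stat_after_close,
    }


if __name__ == "__main__":
    print(compare_size_sources(b"x" * 1000))
```

On a typical local disk, "tell" and "stat_after_close" agree; "stat_while_open" may be smaller because the data has not been flushed yet. Whether os.stat() can also lag behind after the file is closed, as assumed above, depends on the file system in use.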
How did I solve this issue for myself?
I modified the Paramiko code of the get function, which can be found on line 785 of "sftp_client.py", and added localsize = fl.tell() inside the file-handling block, updating the size check accordingly:
    with open(localpath, "wb") as fl:
        size = self.getfo(remotepath, fl, callback)
        localsize = fl.tell()
    if localsize != size:
        raise IOError(
            "size mismatch {} != {}".format(localsize, size)
        )
This should avoid the somewhat flawed local file-size check s = os.stat(localpath), replacing it with one that uses the file object, while the file is still open, to get the size of the local file.
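If you would rather not edit the installed library, the same check can be sketched in your own script by calling getfo directly instead of get. This is only a sketch under assumptions: the function name get_checked is made up, and sftp is assumed to be a paramiko.SFTPClient (for example obtained from SSHClient.open_sftp()) or any object with a compatible getfo method.

```python
def get_checked(sftp, remotepath, localpath, callback=None):
    """Download remotepath to localpath and verify the size via fl.tell().

    Hypothetical wrapper; mirrors the patched get function without
    touching sftp_client.py.
    """
    with open(localpath, "wb") as fl:
        # getfo writes the remote file into fl and returns the byte count.
        size = sftp.getfo(remotepath, fl, callback)
        # Ask the open file object for its position instead of os.stat().
        localsize = fl.tell()
    if localsize != size:
        raise IOError("size mismatch {} != {}".format(localsize, size))
    return size


# Usage (assumes an already connected paramiko.SFTPClient named sftp):
# get_checked(sftp, "/remote/file.bin", "file.bin")
```

The advantage over patching the library is that the change survives a Paramiko upgrade; the trade-off is that every call site has to use the wrapper instead of sftp.get().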