问题
Q: Why would re-saving a file be different vs a direct extraction from a zip file? Particularly on Windows?
Context
I have an angular application that prepares a text file for import into a commercial machine. For user convenience, we provide the file inside a zip file so that the required folder structure can be provided to the user. They write this file to a USB drive and use that to import into the machine.
Problem
If the downloaded zip file is extracted directly onto the USB (to get the file and the required folder structure), the machine cannot recognize the embedded text file.
Troubleshooting
If I open the file in any text editor, add a space, delete the space, and re-save the file on the USB, then the machine will recognize the file. Alternatively, if I extract the zip onto the local file system, then copy the folder structure from the local file system to the USB, then the machine also will recognize it.
If I switch to Linux, then a 'write out' from nano will fix the file. If I use the touch command on the file, the problem remains.
Suspecting a whitespace/line-ending issue, I've tried several diff tools which reveal no apparent differences:
$ diff original.txt resaved.txt
(Linux)$ vbindiff original.txt resaved.txt
(Linux)> fc /b original.txt resaved.txt
(Windows 7)
Other info:
- Angular version: 5.2.10
- Zip Utility in angular: JSZip 3.1.5
- Unzip Utils: 7-Zip and Native Windows Explorer extract
JSZip code:
const zip = new JSZip();
zip.folder('FolderA/FolderB/FolderC').file('FILE.TXT', new File([contentString], 'TEMP.TXT', { type: 'text/plain' }));
zip.generateAsync({ type: 'blob' })
.then(function (content) {
saveAs(content, 'ZipFile.ZIP');
});
At this point, I'm out of ideas. Hoping someone here may have some insight into this odd behavior.
回答1:
TL;DR: Check the file attributes (e.g. Archive, Read-Only, Hidden, System, etc).
Our system was specifically looking for the Archive bit and modifying the file in any way set this bit.
This was an ugly one to ferret out, but chatting with our embedded systems programmer for a bit led to the answer.
Our machine was specifically searching for the archive bit (Windows file attribute) when it was searching for files to import. This bit is a relic from Windows NTFS and is near obsolete. For all intents and purposes it is a dirty flag used to point out files that should be archived/backed up in the next backup run. There are much better ways to do this, so it has fallen out of style.
However, for whatever reason, our system is searching only for files with that bit set. That's why opening/copying/moving the file would fix the problem, because altering it in any way set this archive bit (dirty flag).
If you want to learn more about it, see here and here.
So, the moral of the story is to check these file attributes if you have a similar issue.
We are using the Harmony USB driver from Microchip, so this may be a nuance of that tool (or maybe just an artifact from one of the online examples).
You can see it this using the file properties in Windows Explorer or with the > attrib <file>
command in Windows command prompt.
To fix:
Windows: You can set the value from the command prompt using > attrib +a <file>
or remove it using > attrib -a <file>
.
If using node.js on a Windows host, you can use the winattr library from NPM to manipulate these attributes.
Linux: You can use $ getfattr
and $ setfattr
to set the bit (see here and here).
- Note: the answers I linked say to use
$ setfattr -h -v 0x00000020 -n system.ntfs_attrib_be <target-file>
but I got an operation not supported when I tried to do the same. I ended up using the java solution, but when I inspected the file afterward, it seemed the equivalent command would have been$ setfattr -n user.DOSATTRIB -v 0sMHgyMAA= <target-file>
. Your mileage may vary but I offer it in case it helps anyone.
Java: You can also use Java from any system.
来源:https://stackoverflow.com/questions/51314977/file-extracted-from-zip-not-recognized-until-re-save