Proper way to read a file using FileReader() to generate an md5 hash string from image files?

Submitted by 我与影子孤独终老i on 2020-06-22 04:28:17

Question


I'm currently doing this (see snippet below) to get an md5 hash string for the image files I'm uploading (I'm using the hash as fileNames):

NOTE: I'm using the md5 package to generate the hash (it's loaded into the snippet).

There are 4 available methods on FileReader() to read the files. They all seem to produce good results.

  • readAsText(file)
  • readAsBinaryString(file)
  • readAsArrayBuffer(file)
  • readAsDataURL(file)

Which one should I be using in this case, and why? Can you also explain the difference between them?

function onFileSelect(e) {
  const file = e.target.files[0];
  const reader1 = new FileReader();
  const reader2 = new FileReader();
  const reader3 = new FileReader();
  const reader4 = new FileReader();
  
  reader1.onload = (event) => {
    const fileContent = event.target.result;
    console.log('Hash from "readAsText()": ');
    console.log(md5(fileContent));
  }
  
  reader2.onload = (event) => {
    const fileContent = event.target.result;
    console.log('Hash from "readAsBinaryString()": ');
    console.log(md5(fileContent));
  }
  
  reader3.onload = (event) => {
    const fileContent = event.target.result;
    console.log('Hash from "readAsArrayBuffer()": ');
    console.log(md5(fileContent));
  }
  
  reader4.onload = (event) => {
    const fileContent = event.target.result;
    console.log('Hash from "readAsDataURL()": ');
    console.log(md5(fileContent));
  }
  
  reader1.readAsText(file);
  reader2.readAsBinaryString(file);
  reader3.readAsArrayBuffer(file);
  reader4.readAsDataURL(file);
  
}

.myDiv {
  margin-bottom: 10px;
}

<script src="https://cdn.jsdelivr.net/npm/js-md5@0.7.3/src/md5.min.js"></script>
<div class="myDiv">Pick an image file to see the 4 hash results on console.log()</div>
<input type='file' onChange="onFileSelect(event)" accept='.jpg,.jpeg,.png,.gif' />

Answer 1:


Use readAsArrayBuffer.

readAsBinaryString() and readAsDataURL() will make your computer do a lot more work than necessary:

  1. read the blob as a binary stream
  2. convert it to a UTF-16 / base64 string (remember that strings are immutable in JS, so any operation you do on one actually creates a copy in memory)
  3. [ pass to your lib ]
  4. convert to binary string
  5. process the data
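
To get a feel for the extra data these steps shuffle around, here is a minimal sketch. It does not use FileReader at all: it only approximates what the two string-based methods hand back, assuming a made-up 300-byte payload and the global btoa() (available in browsers and in recent Node versions).

```javascript
// 300 arbitrary bytes standing in for an image file's content (made-up data).
const bytes = Uint8Array.from({ length: 300 }, (_, i) => i % 256);

// Roughly what readAsBinaryString() yields: one string character per byte.
const binaryString = String.fromCharCode(...bytes);

// Roughly the payload of readAsDataURL(): base64 text, 4 characters per 3 bytes.
const base64 = btoa(binaryString);

console.log(bytes.byteLength, binaryString.length, base64.length); // 300 300 400
```

The base64 form is a third longer than the raw bytes, and both strings still have to be turned back into bytes before they can be hashed.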

Also, it seems your library doesn't handle data URLs and fails on UTF-16 strings.

readAsText() will by default try to interpret your binary data as a UTF-8 text sequence, which is pretty bad for binary data like a raster image:

// generate some binary data
document.createElement('canvas').toBlob(blob => {
  const utf8_reader = new FileReader();
  const bin_reader = new FileReader();
  let done = 0;
  utf8_reader.onload = bin_reader.onload = e => {
    if(++done===2) {
      console.log('same results: ', bin_reader.result === utf8_reader.result);
      console.log("utf8\n", utf8_reader.result);
      console.log("utf16\n", bin_reader.result);
    }
  }
  utf8_reader.readAsText(blob);
  bin_reader.readAsBinaryString(blob);
});

readAsArrayBuffer, on the other hand, just places the binary data in memory as is. Simple I/O, no processing.
To manipulate this data, we can use TypedArray views over it; being only views, they don't copy the data, so they add no real overhead either.
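
A small sketch of what "only views" means in practice. The 4-byte buffer here is a stand-in for what readAsArrayBuffer() would return, and the variable names are just for illustration:

```javascript
// A 4-byte buffer standing in for the result of readAsArrayBuffer().
const buffer = new ArrayBuffer(4);

// Two views over the same memory; neither one copies the bytes.
const bytes = new Uint8Array(buffer);
const tail = new Uint8Array(buffer, 2); // starts at byte offset 2

bytes[2] = 0xff; // write through the first view

// The write is visible through the second view, since both share `buffer`.
const sharesMemory = tail[0] === 0xff && tail.buffer === buffer;
console.log(sharesMemory); // true
```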

And if you look at the library you are using, it passes your input to such a Uint8Array anyway in order to process it further. Beware, however, that it apparently needs you to pass a Uint8Array view of this ArrayBuffer rather than the bare ArrayBuffer directly.
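
Putting this together, a minimal sketch of how the handler from the question could look. hashBuffer is a hypothetical helper name, and md5 is assumed to be the global function exposed by the js-md5 script loaded in the question's snippet:

```javascript
// Hypothetical helper: wrap the ArrayBuffer in a Uint8Array view (no copy)
// and hand that view to the hashing function.
function hashBuffer(buffer, hashFn) {
  return hashFn(new Uint8Array(buffer));
}

// Browser-only part: FileReader is a Web API, and `md5` is assumed to be
// the global provided by the js-md5 script from the question.
function onFileSelect(e) {
  const file = e.target.files[0];
  const reader = new FileReader();
  reader.onload = () => {
    console.log('Hash from "readAsArrayBuffer()": ');
    console.log(hashBuffer(reader.result, md5));
  };
  reader.onerror = () => console.error(reader.error);
  reader.readAsArrayBuffer(file);
}
```

Keeping the view creation in its own function means the hashing function can be swapped out without touching the reading code.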



Source: https://stackoverflow.com/questions/56498161/proper-way-to-read-a-file-using-filereader-to-generate-an-md5-hash-string-from
