Converting between strings and ArrayBuffers

慢半拍i 2020-11-22 04:50

Is there a commonly accepted technique for efficiently converting JavaScript strings to ArrayBuffers and vice versa? Specifically, I'd like to be able to write the contents

24 Answers
  • 2020-11-22 05:27

    The following is a working TypeScript implementation:

    // Decode the buffer as a sequence of UTF-16 code units.
    function bufferToString(buffer: ArrayBuffer): string {
        return String.fromCharCode.apply(null, Array.from(new Uint16Array(buffer)));
    }
    
    // Store each UTF-16 code unit of the string in 2 bytes.
    function stringToBuffer(value: string): ArrayBuffer {
        const buffer = new ArrayBuffer(value.length * 2); // 2 bytes per char
        const view = new Uint16Array(buffer);
        for (let i = 0, length = value.length; i < length; i++) {
            view[i] = value.charCodeAt(i);
        }
        return buffer;
    }
    

    I've used this for numerous operations while working with crypto.subtle.

  • 2020-11-22 05:28

    If you have binary data in a string (obtained from Node.js readFile(..., 'binary'), Cypress cy.fixture(..., 'binary'), etc.), you can't use TextEncoder. It supports only UTF-8, so each char with a value >= 128 is turned into 2 bytes.

    ES2015:

    a = Uint8Array.from(s, x => x.charCodeAt(0))
    

    Uint8Array(33) [2, 134, 140, 186, 82, 70, 108, 182, 233, 40, 143, 247, 29, 76, 245, 206, 29, 87, 48, 160, 78, 225, 242, 56, 236, 201, 80, 80, 152, 118, 92, 144, 48]

    s = String.fromCharCode.apply(null, a)
    

    "ºRFl¶é(÷LõÎW0 Náò8ìÉPPv\0"
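The size difference described above can be checked directly. The following is a minimal sketch, assuming an environment where TextEncoder is available as a global (modern browsers and recent Node.js); the variable names are illustrative only.

```javascript
// A "binary string": every char code is a byte value in the range 0..255.
const binary = String.fromCharCode(2, 134, 140, 186);

// TextEncoder always produces UTF-8, so each char >= 128 becomes 2 bytes.
const viaTextEncoder = new TextEncoder().encode(binary);

// Mapping char codes one-to-one keeps the original byte values.
const viaCharCodes = Uint8Array.from(binary, (c) => c.charCodeAt(0));

console.log(viaTextEncoder.length); // 7 (1 + 3 * 2)
console.log(viaCharCodes.length);   // 4
```

Only the second form gives back the bytes that were originally read from the file.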

  • 2020-11-22 05:30

    Blob is much slower than String.fromCharCode.apply(null, array),

    but the latter fails if the array buffer gets too big. The best solution I have found is to keep using String.fromCharCode.apply(null, array) and split the work into chunks that are small enough not to blow the stack, yet still faster than converting a single char at a time.

    The best solution for a large array buffer is:

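    function arrayBufferToString(buffer) {
        var bufView = new Uint16Array(buffer);
        var length = bufView.length;
        var result = '';
        // Largest chunk passed to apply() at once, small enough not to overflow the stack.
        var addition = Math.pow(2, 16) - 1;
    
        for (var i = 0; i < length; i += addition) {
            // Shrink the final chunk so it ends exactly at the end of the view.
            if (i + addition > length) {
                addition = length - i;
            }
            result += String.fromCharCode.apply(null, bufView.subarray(i, i + addition));
        }
    
        return result;
    }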
    

    I found this to be about 20 times faster than using Blob. It also works for large strings of over 100 MB.
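The same chunking idea also applies to byte arrays. The following is a minimal sketch, assuming a hypothetical helper name bytesToBinaryString and an arbitrarily chosen chunk size of 0x8000 elements (neither comes from the answer above).

```javascript
// Hypothetical helper: convert a Uint8Array to a binary string in chunks,
// so that no single apply() call receives too many arguments.
function bytesToBinaryString(bytes, chunkSize = 0x8000) {
  let result = '';
  for (let i = 0; i < bytes.length; i += chunkSize) {
    // subarray() clamps the end index, so the last chunk may be shorter.
    result += String.fromCharCode.apply(null, bytes.subarray(i, i + chunkSize));
  }
  return result;
}

// Usage: 200,000 bytes are converted without one huge argument list.
const big = new Uint8Array(200000).fill(65); // 65 is the char code of "A"
console.log(bytesToBinaryString(big).length); // 200000
```

Because subarray() returns a view rather than a copy, the chunks themselves do not duplicate the underlying buffer.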

  • 2020-11-22 05:32

    Based on the answer of gengkev, I created functions for both directions, because a Blob can be built from either a String or an ArrayBuffer. They use the Blob constructor, which replaced the now-removed BlobBuilder:

    function string2ArrayBuffer(string, callback) {
        var blob = new Blob([string]); // the string is stored as UTF-8
        var f = new FileReader();
        f.onload = function(e) {
            callback(e.target.result);
        };
        f.readAsArrayBuffer(blob);
    }
    

    and

    function arrayBuffer2String(buf, callback) {
        var blob = new Blob([buf]);
        var f = new FileReader();
        f.onload = function(e) {
            callback(e.target.result);
        };
        f.readAsText(blob); // decodes as UTF-8 by default
    }
    

    A simple test:

    string2ArrayBuffer("abc",
        function (buf) {
            var uInt8 = new Uint8Array(buf);
            console.log(uInt8); // Returns `Uint8Array { 0=97, 1=98, 2=99}`
    
            arrayBuffer2String(buf, 
                function (string) {
                    console.log(string); // returns "abc"
                }
            )
        }
    )
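Where Blob.prototype.arrayBuffer() and Blob.prototype.text() are available (current browsers and recent Node.js versions), the same round trip can be sketched without FileReader or callbacks. The function names below are hypothetical, and both directions assume UTF-8, which is what Blob uses for strings.

```javascript
// Hypothetical Promise-based variants of the two callback functions.
function stringToArrayBufferAsync(string) {
  // A string placed in a Blob is stored as UTF-8 bytes.
  return new Blob([string]).arrayBuffer();
}

function arrayBufferToStringAsync(buf) {
  // text() decodes the Blob contents as UTF-8.
  return new Blob([buf]).text();
}

// Usage: round trip "abc" through an ArrayBuffer.
stringToArrayBufferAsync('abc')
  .then((buf) => {
    console.log(new Uint8Array(buf)); // bytes 97, 98, 99
    return arrayBufferToStringAsync(buf);
  })
  .then((string) => console.log(string)); // "abc"
```

Like the FileReader version, this is asynchronous, so it is not a drop-in replacement for the synchronous approaches in the other answers.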
    
  • 2020-11-22 05:32

    From emscripten:

    function stringToUTF8Array(str, outU8Array, outIdx, maxBytesToWrite) {
      if (!(maxBytesToWrite > 0)) return 0;
      var startIdx = outIdx;
      var endIdx = outIdx + maxBytesToWrite - 1;
      for (var i = 0; i < str.length; ++i) {
        var u = str.charCodeAt(i);
        if (u >= 55296 && u <= 57343) {
          var u1 = str.charCodeAt(++i);
          u = 65536 + ((u & 1023) << 10) | u1 & 1023
        }
        if (u <= 127) {
          if (outIdx >= endIdx) break;
          outU8Array[outIdx++] = u
        } else if (u <= 2047) {
          if (outIdx + 1 >= endIdx) break;
          outU8Array[outIdx++] = 192 | u >> 6;
          outU8Array[outIdx++] = 128 | u & 63
        } else if (u <= 65535) {
          if (outIdx + 2 >= endIdx) break;
          outU8Array[outIdx++] = 224 | u >> 12;
          outU8Array[outIdx++] = 128 | u >> 6 & 63;
          outU8Array[outIdx++] = 128 | u & 63
        } else {
          if (outIdx + 3 >= endIdx) break;
          outU8Array[outIdx++] = 240 | u >> 18;
          outU8Array[outIdx++] = 128 | u >> 12 & 63;
          outU8Array[outIdx++] = 128 | u >> 6 & 63;
          outU8Array[outIdx++] = 128 | u & 63
        }
      }
      outU8Array[outIdx] = 0;
      return outIdx - startIdx
    }
    

    Use like:

    stringToUTF8Array('abs', new Uint8Array(4), 0, 4); // 3 bytes plus the null terminator; returns 3
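Because the caller has to supply the output array, it helps to know how many bytes a string needs. The following is a sketch of a byte counter that mirrors the branches of the function above; the name utf8ByteLength is hypothetical, and the code assumes the string contains no unpaired surrogates.

```javascript
// Hypothetical helper: count the UTF-8 bytes needed for a string,
// not including any trailing null terminator.
function utf8ByteLength(str) {
  let bytes = 0;
  for (let i = 0; i < str.length; ++i) {
    let u = str.charCodeAt(i);
    if (u >= 0xD800 && u <= 0xDFFF) {
      // Combine a surrogate pair into one code point above 0xFFFF.
      const low = str.charCodeAt(++i);
      u = (0x10000 + ((u & 0x3FF) << 10)) | (low & 0x3FF);
    }
    if (u <= 0x7F) bytes += 1;
    else if (u <= 0x7FF) bytes += 2;
    else if (u <= 0xFFFF) bytes += 3;
    else bytes += 4;
  }
  return bytes;
}

// Usage: size the output array, leaving one extra byte for the terminator.
const needed = utf8ByteLength('héllo');
const out = new Uint8Array(needed + 1);
console.log(needed); // 6
```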
    
  • 2020-11-22 05:33

    Unlike the other solutions here, I needed to convert to and from UTF-8 data. For this purpose, I coded the following two functions, using the (un)escape/(en)decodeURIComponent trick. They're pretty wasteful of memory, allocating 9 times the length of the encoded UTF-8 string, though that memory should be recovered by the GC. Just don't use them for 100 MB of text.

    function utf8AbFromStr(str) {
        var strUtf8 = unescape(encodeURIComponent(str));
        var ab = new Uint8Array(strUtf8.length);
        for (var i = 0; i < strUtf8.length; i++) {
            ab[i] = strUtf8.charCodeAt(i);
        }
        return ab;
    }
    
    function strFromUtf8Ab(ab) {
        return decodeURIComponent(escape(String.fromCharCode.apply(null, ab)));
    }
    

    Checking that it works:

    strFromUtf8Ab(utf8AbFromStr('latinкирилицаαβγδεζηあいうえお'))
    -> "latinкирилицаαβγδεζηあいうえお"
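In environments that provide TextEncoder and TextDecoder as globals (modern browsers and recent Node.js), the same UTF-8 round trip can be sketched without the escape/unescape trick. The function names below are hypothetical and are not part of the answer above.

```javascript
// Hypothetical helpers built on the standard encoding API.
function utf8BytesFromString(str) {
  // encode() returns a Uint8Array of UTF-8 bytes.
  return new TextEncoder().encode(str);
}

function stringFromUtf8Bytes(bytes) {
  // decode() accepts an ArrayBuffer or a typed array view.
  return new TextDecoder('utf-8').decode(bytes);
}

// Usage: the same check as above.
const text = 'latinкирилицаαβγδεζηあいうえお';
const bytes = utf8BytesFromString(text);
console.log(stringFromUtf8Bytes(bytes) === text); // true
```

This version does not build an intermediate escaped string, so it avoids the extra memory use mentioned above.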
    