Converting between strings and ArrayBuffers

慢半拍i 2020-11-22 04:50

Is there a commonly accepted technique for efficiently converting JavaScript strings to ArrayBuffers and vice-versa? Specifically, I'd like to be able to write the contents

24 answers
  • 2020-11-22 05:06

    See here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays/StringView (a C-like interface for strings based upon the JavaScript ArrayBuffer interface)

  • 2020-11-22 05:06

    For node.js, and also for browsers via https://github.com/feross/buffer:

    function ab2str(buf: Uint8Array) {
      return Buffer.from(buf).toString('base64');
    }
    function str2ab(str: string) {
      return new Uint8Array(Buffer.from(str, 'base64'))
    }
    

    Note: the other solutions here didn't work for me. I need to support node.js and browsers and just serialize a Uint8Array to a string. I could serialize it as a number[], but that takes up unnecessary space. With this solution I don't need to worry about encodings, since it's base64. Just in case other people struggle with the same problem... My two cents.
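
    As a quick sanity check of the helpers above, here is a minimal round-trip sketch in plain JavaScript. It assumes a global Buffer is available (built into node.js; in browsers it would have to come from the feross/buffer package). The names bytesToBase64 and base64ToBytes are illustrative only.

```javascript
// Assumes a global Buffer (node.js, or the feross/buffer polyfill in browsers).
function bytesToBase64(bytes) {
  // Buffer.from(Uint8Array) copies the bytes; toString('base64') encodes them.
  return Buffer.from(bytes).toString('base64');
}

function base64ToBytes(b64) {
  // Decode base64 back into raw bytes and expose them as a plain Uint8Array.
  return new Uint8Array(Buffer.from(b64, 'base64'));
}

// Bytes above 127 are the ones that text encodings tend to mangle.
const original = new Uint8Array([0, 127, 128, 255]);
const encoded = bytesToBase64(original);
const decoded = base64ToBytes(encoded);
console.log(encoded, decoded);
```

    Because base64 only uses ASCII characters, the resulting string survives JSON, localStorage, and similar text-only channels unchanged.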

  • 2020-11-22 05:13

    Yes:

    // The fallback matches TextEncoder's UTF-8 output only for ASCII input.
    const encstr = ('TextEncoder' in window) ? new TextEncoder().encode(str) : Uint8Array.from(str, c => c.codePointAt(0));
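
    The fallback in the one-liner above stores each code point modulo 256, so it only agrees with UTF-8 for ASCII text. A possible alternative fallback, sketched here under the assumption that encodeURIComponent and unescape are available, produces real UTF-8 bytes. The helper names encodeUtf8 and encodeUtf8Fallback are hypothetical. Note that encodeURIComponent throws on lone surrogates, whereas TextEncoder substitutes U+FFFD.

```javascript
// Hypothetical fallback: UTF-8 bytes without TextEncoder.
function encodeUtf8Fallback(str) {
  // encodeURIComponent emits UTF-8 as %XX escapes; unescape turns every
  // escape into one character whose char code is the byte value (0-255).
  const binary = unescape(encodeURIComponent(str));
  return Uint8Array.from(binary, c => c.charCodeAt(0));
}

// Hypothetical wrapper: prefer the built-in encoder when it exists.
function encodeUtf8(str) {
  if (typeof TextEncoder !== 'undefined') {
    return new TextEncoder().encode(str);
  }
  return encodeUtf8Fallback(str);
}

const sample = 'h\u00e9llo \u20ac';
console.log(encodeUtf8(sample), encodeUtf8Fallback(sample));
```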
    
  • 2020-11-22 05:14

    Everything below is about getting binary strings from ArrayBuffers.

    I'd recommend against using

    var binaryString = String.fromCharCode.apply(null, new Uint8Array(arrayBuffer));
    

    because it

    1. crashes on big buffers (somebody mentioned a "magic" size of 246300, but I got a "Maximum call stack size exceeded" error on a 120000-byte buffer in Chrome 29)
    2. has really poor performance (see below)

    If you really need a synchronous solution, use something like

    var
      binaryString = '',
      bytes = new Uint8Array(arrayBuffer),
      length = bytes.length;
    for (var i = 0; i < length; i++) {
      binaryString += String.fromCharCode(bytes[i]);
    }
    

    It is as slow as the previous one but works correctly. At the time of writing there seems to be no really fast synchronous solution to this problem (all the libraries mentioned in this thread use the same approach for their synchronous features).

    But what I really recommend is the Blob + FileReader approach

    function readBinaryStringFromArrayBuffer (arrayBuffer, onSuccess, onFail) {
      var reader = new FileReader();
      reader.onload = function (event) {
        onSuccess(event.target.result);
      };
      reader.onerror = function (event) {
        onFail(event.target.error);
      };
      reader.readAsBinaryString(new Blob([ arrayBuffer ],
        { type: 'application/octet-stream' }));
    }
    

    The only disadvantage (and not for everyone) is that it is asynchronous. And it is about 8-10 times faster than the previous solutions! (Some details: the synchronous solution took 950-1050 ms in my environment for a 2.4 MB buffer, while the FileReader solution took about 100-120 ms for the same amount of data. I also tested both synchronous solutions on a 100 KB buffer and they took almost the same time, so the loop is not much slower than using 'apply'.)

    BTW, in "How to convert ArrayBuffer to and from String" the author compares the same two approaches as I do and gets completely opposite results (his test code is here). Why such different results? Probably because his test string is only 1 KB long (he called it "veryLongStr"), whereas my buffer was a really big JPEG image of 2.4 MB.
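
    To reproduce the comparison of the two synchronous approaches on your own machine, a rough timing harness could look like the sketch below. The buffer size, the helper names (viaLoop, viaApply, time) and the use of Date.now() are my own choices for illustration; timings will vary by engine and input size, so treat the numbers as indicative only.

```javascript
// Build a test buffer of deterministic pseudo-random bytes.
// SIZE is an arbitrary choice; try larger values to see where apply fails.
const SIZE = 64 * 1024;
const bytes = new Uint8Array(SIZE);
for (let i = 0; i < SIZE; i++) bytes[i] = (i * 31 + 7) & 0xff;

// Approach 1: append one character per byte.
function viaLoop(u8) {
  let s = '';
  for (let i = 0; i < u8.length; i++) s += String.fromCharCode(u8[i]);
  return s;
}

// Approach 2: pass all bytes as arguments in a single call.
function viaApply(u8) {
  try {
    return String.fromCharCode.apply(null, u8);
  } catch (e) {
    // Large inputs can exceed the engine's argument/stack limit.
    return null;
  }
}

// Run fn on the test buffer and log the elapsed wall-clock time.
function time(label, fn) {
  const start = Date.now();
  const result = fn(bytes);
  console.log(label, (Date.now() - start) + ' ms');
  return result;
}

const loopResult = time('loop ', viaLoop);
const applyResult = time('apply', viaApply);
```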

  • 2020-11-22 05:14

    After playing with mangini's solution for converting from ArrayBuffer to String, ab2str (the most elegant and useful one I have found, thanks!), I had some issues when handling large arrays. More specifically, calling String.fromCharCode.apply(null, new Uint16Array(buf)); throws an error:

    arguments array passed to Function.prototype.apply is too large.

    To work around it, I decided to process the input ArrayBuffer in chunks. The modified solution is:

    function ab2str(buf) {
       var str = "";
       var ab = new Uint16Array(buf);
       var abLen = ab.length;
       var CHUNK_SIZE = Math.pow(2, 16);
       var offset, len, subab;
       for (offset = 0; offset < abLen; offset += CHUNK_SIZE) {
          len = Math.min(CHUNK_SIZE, abLen-offset);
          subab = ab.subarray(offset, offset+len);
          str += String.fromCharCode.apply(null, subab);
       }
       return str;
    }
    

    The chunk size is set to 2^16 because that was the size I found to work in my development environment. Setting a higher value caused the same error to recur. It can be changed by setting the CHUNK_SIZE variable to a different value. It is important to use an even number.

    Note on performance: I did not run any performance tests for this solution. However, since it is based on the previous solution and can handle large arrays, I see no reason not to use it.
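
    For a quick check that the chunked version round-trips a large input, here is a self-contained sketch. The str2ab counterpart (one 16-bit code unit per character) is my assumption of what the matching encoder looks like, and the smaller default chunk size of 0x8000 is a conservative choice of mine, not a value from the answer above.

```javascript
// Assumed counterpart to ab2str: store each UTF-16 code unit in 2 bytes.
function str2ab(str) {
  const buf = new ArrayBuffer(str.length * 2);
  const view = new Uint16Array(buf);
  for (let i = 0; i < str.length; i++) view[i] = str.charCodeAt(i);
  return buf;
}

// Chunked decoder: never passes more than chunkSize arguments to apply.
function ab2str(buf, chunkSize = 0x8000) {
  const units = new Uint16Array(buf);
  let str = '';
  for (let offset = 0; offset < units.length; offset += chunkSize) {
    // subarray creates a view on the same memory; nothing is copied here.
    const chunk = units.subarray(offset, offset + chunkSize);
    str += String.fromCharCode.apply(null, chunk);
  }
  return str;
}

// 200,000 characters, well past the sizes reported to fail in one call.
const bigString = 'abc\u20ac'.repeat(50000);
const restored = ab2str(str2ab(bigString));
console.log(restored === bigString);
```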

  • 2020-11-22 05:15

    Update 2016 - five years on there are now new methods in the specs (see support below) to convert between strings and typed arrays using proper encoding.

    TextEncoder

    The TextEncoder represents:

    The TextEncoder interface represents an encoder for a specific method, that is a specific character encoding, like utf-8, iso-8859-2, koi8, cp1261, gbk, ... An encoder takes a stream of code points as input and emits a stream of bytes.

    Change note since the above was written: (ibid.)

    Note: Firefox, Chrome and Opera used to have support for encoding types other than utf-8 (such as utf-16, iso-8859-2, koi8, cp1261, and gbk). As of Firefox 48 [...], Chrome 54 [...] and Opera 41, no other encoding types are available other than utf-8, in order to match the spec.*

    *) Updated specs (W3) and here (whatwg).

    Once you have created a TextEncoder instance, it takes a string and encodes it (as UTF-8 in current browsers, per the note above):

    if (!("TextEncoder" in window)) 
      alert("Sorry, this browser does not support TextEncoder...");
    
    var enc = new TextEncoder(); // always utf-8
    console.log(enc.encode("This is a string converted to a Uint8Array"));

    You can then use the .buffer property of the resulting Uint8Array to access the underlying ArrayBuffer through a different view if needed.
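
    As a small illustration of re-viewing the encoded bytes, the sketch below reads the same four bytes through a Uint16Array and a DataView. Slicing by byteOffset and byteLength is a defensive choice on my part, since a typed array can in general be a window onto a larger buffer.

```javascript
const bytes = new TextEncoder().encode('ABCD'); // 0x41 0x42 0x43 0x44

// Copy out exactly the encoded bytes as a standalone ArrayBuffer.
const exact = bytes.buffer.slice(
  bytes.byteOffset,
  bytes.byteOffset + bytes.byteLength
);

// Two 16-bit elements; their values depend on the platform's byte order.
const asWords = new Uint16Array(exact);

// DataView lets you pick the byte order explicitly (false = big-endian).
const view = new DataView(exact);
const bigEndian = view.getUint32(0, false);

console.log(asWords.length, bigEndian.toString(16)); // 2 "41424344"
```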

    Just keep in mind how the encoding maps characters to bytes: for example, characters outside the ASCII range in the example will be encoded as two or more bytes instead of one.
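
    A short sketch of how many bytes UTF-8 needs per character; the sample characters are my own picks (an ASCII letter, an accented Latin letter, the euro sign, and an emoji outside the Basic Multilingual Plane).

```javascript
const enc = new TextEncoder();

// 'a', e-acute, euro sign, grinning-face emoji (written as escapes).
const samples = ['a', '\u00e9', '\u20ac', '\u{1F600}'];

// Number of UTF-8 bytes produced for each sample.
const byteCounts = samples.map(s => enc.encode(s).length);

console.log(byteCounts); // [ 1, 2, 3, 4 ]
```

    So the length of the resulting Uint8Array generally differs from the string's length property, which counts UTF-16 code units.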

    For general use you would use UTF-16 encoding for things like localStorage.

    TextDecoder

    Likewise, the opposite process uses the TextDecoder:

    The TextDecoder interface represents a decoder for a specific method, that is a specific character encoding, like utf-8, iso-8859-2, koi8, cp1261, gbk, ... A decoder takes a stream of bytes as input and emits a stream of code points.

    All available decoding types can be found here.

    if (!("TextDecoder" in window))
      alert("Sorry, this browser does not support TextDecoder...");
    
    var enc = new TextDecoder("utf-8");
    var arr = new Uint8Array([84,104,105,115,32,105,115,32,97,32,85,105,110,116,
                              56,65,114,114,97,121,32,99,111,110,118,101,114,116,
                              101,100,32,116,111,32,97,32,115,116,114,105,110,103]);
    console.log(enc.decode(arr));
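
    Putting the two halves together, here is a sketch of a full round trip, plus decoding input that arrives in chunks. The sample text and the way the bytes are split are made up for illustration; the { stream: true } option tells the decoder to hold on to an incomplete multi-byte sequence until the next call.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder('utf-8');

// Round trip: string -> UTF-8 bytes -> string.
const text = 'price: 5 \u20ac';
const roundTrip = decoder.decode(encoder.encode(text));

// Chunked input: the euro sign is three bytes (E2 82 AC), split here
// across two chunks so that neither chunk is valid UTF-8 on its own.
const chunks = [
  new Uint8Array([0x35, 0x20, 0xe2, 0x82]),
  new Uint8Array([0xac])
];

const streaming = new TextDecoder('utf-8');
let streamed = '';
for (const chunk of chunks) {
  // stream: true keeps a trailing partial sequence buffered.
  streamed += streaming.decode(chunk, { stream: true });
}
streamed += streaming.decode(); // final call flushes the decoder

console.log(roundTrip === text, streamed);
```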

    The MDN StringView library

    An alternative to these is to use the StringView library (licensed as lgpl-3.0), whose goals are:

    • to create a C-like interface for strings (i.e., an array of character codes — an ArrayBufferView in JavaScript) based upon the JavaScript ArrayBuffer interface
    • to create a highly extensible library that anyone can extend by adding methods to the object StringView.prototype
    • to create a collection of methods for such string-like objects (since now: stringViews) which work strictly on arrays of numbers rather than on creating new immutable JavaScript strings
    • to work with Unicode encodings other than JavaScript's default UTF-16 DOMStrings

    giving much more flexibility. However, it would require us to link to or embed this library, while TextEncoder/TextDecoder are built into modern browsers.

    Support

    As of July/2018:

    TextEncoder (Experimental, On Standard Track)

     Chrome    | Edge      | Firefox   | IE        | Opera     | Safari
     ----------|-----------|-----------|-----------|-----------|-----------
         38    |     ?     |    19°    |     -     |     25    |     -
    
     Chrome/A  | Edge/mob  | Firefox/A | Opera/A   |Safari/iOS | Webview/A
     ----------|-----------|-----------|-----------|-----------|-----------
         38    |     ?     |    19°    |     ?     |     -     |     38
    
    °) 18: Firefox 18 implemented an earlier and slightly different version
    of the specification.
    
    WEB WORKER SUPPORT:
    
    Experimental, On Standard Track
    
     Chrome    | Edge      | Firefox   | IE        | Opera     | Safari
     ----------|-----------|-----------|-----------|-----------|-----------
         38    |     ?     |     20    |     -     |     25    |     -
    
     Chrome/A  | Edge/mob  | Firefox/A | Opera/A   |Safari/iOS | Webview/A
     ----------|-----------|-----------|-----------|-----------|-----------
         38    |     ?     |     20    |     ?     |     -     |     38
    
    Data from MDN - `npm i -g mdncomp` by epistemex
    