Convert a numpy.ndarray to string (or bytes) and convert it back to numpy.ndarray

北恋 2020-12-08 09:52

I'm having a little trouble here.

I'm trying to convert a numpy.ndarray to a string. I've already done that like this:

randomArray.tostring()


        
7 answers
  • 2020-12-08 10:11

    If you use tostring (deprecated since NumPy 1.19 in favor of the identical tobytes), you lose information on both shape and data type:

    >>> import numpy as np
    >>> a = np.arange(12).reshape(3, 4)
    >>> a
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
    >>> s = a.tostring()
    >>> aa = np.fromstring(s)  # default dtype is float, so the integer bytes are misread
    >>> aa
    array([  0.00000000e+000,   4.94065646e-324,   9.88131292e-324,
             1.48219694e-323,   1.97626258e-323,   2.47032823e-323,
             2.96439388e-323,   3.45845952e-323,   3.95252517e-323,
             4.44659081e-323,   4.94065646e-323,   5.43472210e-323])
    >>> aa = np.fromstring(s, dtype=int)  # right dtype, but the shape is still lost
    >>> aa
    array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
    >>> aa = np.fromstring(s, dtype=int).reshape(3, 4)
    >>> aa
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
    

    This means you have to send the metadata (dtype and shape) along with the data to the recipient. To exchange self-describing objects, try pickle (cPickle in Python 2), and only unpickle data from a source you trust:

    >>> import pickle
    >>> s = pickle.dumps(a)
    >>> pickle.loads(s)
    array([[ 0,  1,  2,  3],
           [ 4,  5,  6,  7],
           [ 8,  9, 10, 11]])
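
    As an alternative to pickle that this answer does not mention, NumPy's own .npy format also carries the dtype and shape with the data. The sketch below writes to an in-memory buffer; the helper names array_to_npy_bytes and npy_bytes_to_array are made up for illustration.

```python
import io

import numpy as np


def array_to_npy_bytes(arr):
    """Serialize arr, including dtype and shape, into .npy-format bytes."""
    buf = io.BytesIO()
    np.save(buf, arr, allow_pickle=False)
    return buf.getvalue()


def npy_bytes_to_array(data):
    """Rebuild the array from bytes produced by array_to_npy_bytes."""
    return np.load(io.BytesIO(data), allow_pickle=False)


a = np.arange(12).reshape(3, 4)
s = array_to_npy_bytes(a)   # plain bytes, safe to write to a file or socket
b = npy_bytes_to_array(s)   # same values, shape and dtype as a
```

    With allow_pickle=False, loading refuses object arrays, which is the safer choice when the bytes come from elsewhere.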
    
  • 2020-12-08 10:12

    Imagine you have a numpy array of integers (it works with other types but you need some slight modification). You can do this:

    import numpy as np
    a = np.array([0, 3, 5])
    a_str = ','.join(str(x) for x in a) # '0,3,5'
    a2 = np.array([int(x) for x in a_str.split(',')]) # np.array([0, 3, 5])


    If you have an array of floats, be sure to replace int with float in the last line.

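    You can also use the __repr__() method, which has the advantage of working for multi-dimensional arrays. Since the string is passed to eval, only do this with strings you trust:

    import sys
    import numpy
    from numpy import array  # eval() below needs the name `array` in scope
    # Print every element; otherwise large arrays are abbreviated with '...'
    numpy.set_printoptions(threshold=sys.maxsize)
    a = array([[0,3,5],[2,3,4]])
    a_str = a.__repr__() # 'array([[0, 3, 5],\n       [2, 3, 4]])'
    a2 = eval(a_str) # array([[0, 3, 5],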
                     #        [2, 3, 4]])
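
    If eval is a concern, one alternative (not part of the answer above) is to go through nested lists and JSON. This is a minimal sketch; it keeps the shape but not the exact dtype, so an int32 or float32 array comes back with NumPy's default integer or float type.

```python
import json

import numpy as np

a = np.array([[0, 3, 5], [2, 3, 4]])

# tolist() turns the array into nested Python lists, which json can encode
a_str = json.dumps(a.tolist())     # '[[0, 3, 5], [2, 3, 4]]'

# json.loads gives the nested lists back; np.array rebuilds the 2-D array
a2 = np.array(json.loads(a_str))
```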
    
  • 2020-12-08 10:13

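    You can use the fromstring() method for this. On NumPy 1.14 and later, use frombuffer() instead, for the reason given in the note further down:

    import numpy as np
    arr = np.array([1, 2, 3, 4, 5, 6])
    ts = arr.tobytes()  # tostring() is a deprecated alias of tobytes()
    print(np.frombuffer(ts, dtype=arr.dtype))
    # prints: [1 2 3 4 5 6]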
    

    Sorry for the short answer, not enough points for commenting. Remember to state the data types or you'll end up in a world of pain.

    Note on fromstring from numpy 1.14 onwards:

    sep : str, optional

    The string separating numbers in the data; extra whitespace between elements is also ignored.

    Deprecated since version 1.14: Passing sep='', the default, is deprecated since it will trigger the deprecated binary mode of this function. This mode interprets string as binary bytes, rather than ASCII text with decimal numbers, an operation which is better spelt frombuffer(string, dtype, count). If string contains unicode text, the binary mode of fromstring will first encode it into bytes using either utf-8 (python 3) or the default encoding (python 2), neither of which produce sane results.
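
    To make the two modes in that note concrete, here is a small sketch: fromstring with a non-empty sep parses decimal text, while frombuffer reads raw bytes. The variable names are illustrative only.

```python
import numpy as np

# Text mode: a non-empty sep tells fromstring to parse decimal numbers
from_text = np.fromstring('1, 2, 3, 4', dtype=np.int64, sep=',')

# Binary mode: use frombuffer on raw bytes and state the dtype explicitly
raw = np.array([1, 2, 3, 4], dtype=np.int64).tobytes()
from_binary = np.frombuffer(raw, dtype=np.int64)
```

    Both calls yield the same four integers; only the input representation differs.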

  • 2020-12-08 10:17

    This is a fast way to encode the array, the array shape and the array dtype:

    import numpy as np

    def numpy_to_bytes(arr: np.ndarray) -> bytearray:
        # Layout: <dtype>|<shape as comma-separated ints>|<raw data>
        arr_dtype = bytearray(str(arr.dtype), 'utf-8')
        arr_shape = bytearray(','.join([str(a) for a in arr.shape]), 'utf-8')
        sep = bytearray('|', 'utf-8')
        arr_bytes = arr.ravel().tobytes()
        to_return = arr_dtype + sep + arr_shape + sep + arr_bytes
        return to_return

    def bytes_to_numpy(serialized_arr: bytearray) -> np.ndarray:
        sep = '|'.encode('utf-8')
        # Only the first two separators delimit the header; the raw data may contain b'|' too
        i_0 = serialized_arr.find(sep)
        i_1 = serialized_arr.find(sep, i_0 + 1)
        arr_dtype = serialized_arr[:i_0].decode('utf-8')
        arr_shape = tuple([int(a) for a in serialized_arr[i_0 + 1:i_1].decode('utf-8').split(',')])
        arr_str = serialized_arr[i_1 + 1:]
        arr = np.frombuffer(arr_str, dtype = arr_dtype).reshape(arr_shape)
        return arr
    

    To use the functions:

    a = np.ones((23, 23), dtype = 'int')
    a_b = numpy_to_bytes(a)
    a1 = bytes_to_numpy(a_b)
    np.array_equal(a, a1) and a.shape == a1.shape and a.dtype == a1.dtype  # True
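
    One thing to keep in mind with any frombuffer-based round trip: the result is a view on the buffer it was given, so whether it is writable depends on that buffer. A small standalone sketch, with illustrative variable names:

```python
import numpy as np

a = np.arange(4, dtype=np.int64)

# Backed by immutable bytes: the resulting array is read-only
view_ro = np.frombuffer(a.tobytes(), dtype=np.int64)

# Backed by a mutable bytearray: the resulting array is writable
view_rw = np.frombuffer(bytearray(a.tobytes()), dtype=np.int64)

# copy() gives an independent, writable array in either case
owned = view_ro.copy()
owned[0] = 99  # changes only the copy, not view_ro
```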
    
  • 2020-12-08 10:17

    Imagine you have a numpy array of text, such as messages from a messenger app:

     >>> stex[40]
     array(['Know the famous thing ...
    

    To get statistics from the corpus (the text is in column 11), first pull the values out of the dataframe (df5), then join all records into one single corpus:

     >>> import nltk
     >>> # .ix was removed in pandas 1.0; .iloc assumes column 11 is meant by position
     >>> stex = df5.iloc[:, [11]].values
     >>> a_str = ','.join(str(x) for x in stex)
     >>> a_str = a_str.split()
     >>> fd2 = nltk.FreqDist(a_str)
     >>> fd2.most_common(50)
    
  • 2020-12-08 10:31

    I know I am late, but here is the correct way of doing it, using base64. This technique converts the array to a string.

    import binascii
    import numpy as np
    random_array = np.random.randn(32, 32)
    string_repr = binascii.b2a_base64(random_array).decode("ascii")
    # frombuffer returns a flat float64 array by default, so restore the shape explicitly
    array = np.frombuffer(binascii.a2b_base64(string_repr.encode("ascii"))).reshape(32, 32)
    

    For array to string: convert the binary data to a line of ASCII characters in base64 coding, then decode the result as ASCII to get the string representation.

    For string to array: first encode the string as ASCII, then convert the block of base64 data back to binary and pass the resulting bytes to np.frombuffer.
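
    The base64 string on its own still drops the dtype and shape. One way to carry them along, sketched here under the assumption of a simple (non-structured) dtype, is a small JSON envelope; array_to_text and text_to_array are hypothetical helper names.

```python
import base64
import json

import numpy as np


def array_to_text(arr):
    """Pack dtype, shape and base64-encoded data into one JSON string."""
    payload = {
        "dtype": str(arr.dtype),
        "shape": list(arr.shape),
        "data": base64.b64encode(arr.tobytes()).decode("ascii"),
    }
    return json.dumps(payload)


def text_to_array(text):
    """Inverse of array_to_text; the result is a read-only array."""
    payload = json.loads(text)
    raw = base64.b64decode(payload["data"])
    return np.frombuffer(raw, dtype=payload["dtype"]).reshape(payload["shape"])


random_array = np.random.randn(32, 32)
restored = text_to_array(array_to_text(random_array))
```

    The output is pure ASCII text, so it can travel through JSON APIs or text columns unchanged.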
