How to read a UTF-8 encoded binary string in TensorFlow?

广开言路 2021-01-13 17:10

I am trying to convert an encoded byte string back into the original array in the tensorflow graph (using tensorflow operations) in order to make a prediction in a tensorflo

1 answer
  • 2021-01-13 17:58

    The answer you referenced was written on the assumption that you are running the model on CloudML Engine's service. The service itself takes care of the JSON (including UTF-8) and base64 encoding.

    To get your code working locally or in another environment, you'll need the following changes:

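    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)
    
    def array_request_example(input_array):
        # Cast to float32 so the byte layout matches the dtype passed to tf.decode_raw
        input_array = input_array.astype(np.float32)
        # tobytes() replaces the deprecated tostring()
        return input_array.tobytes()
    
    # Scalar string placeholder holding the raw bytes of one array
    byte_string = tf.placeholder(dtype=tf.string)
    audio_samples = tf.decode_raw(byte_string, tf.float32)
    
    audio_array = np.array([1, 2, 3, 4])
    bstring = array_request_example(audio_array)
    fdict = {byte_string: bstring}
    with tf.Session() as sess:
        tf_samples = sess.run([audio_samples], feed_dict=fdict)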
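
    To make the byte layout concrete, here is a minimal sketch using only the standard library. It assumes `tf.decode_raw` is left at its default little-endian setting; the helper names `pack_float32` and `unpack_float32` are made up for illustration and are not part of NumPy or TensorFlow. Note that `np.array([1, 2, 3, 4])` has an integer dtype by default, which is why the cast to float32 before serializing matters.

```python
import struct

def pack_float32(samples):
    # Hypothetical helper: serialize floats as consecutive little-endian float32 values
    return struct.pack("<%df" % len(samples), *samples)

def unpack_float32(raw):
    # Hypothetical helper: reinterpret every 4 bytes as one little-endian float32
    count = len(raw) // 4
    return list(struct.unpack("<%df" % count, raw))

samples = [1.0, 2.0, 3.0, 4.0]
raw = pack_float32(samples)      # 4 values * 4 bytes each
decoded = unpack_float32(raw)    # same values back
print(len(raw), decoded)
```

    If the bytes were produced from a different dtype (for example the default integer dtype), reading them back as float32 would yield unrelated values rather than an error.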
    

    That said, based on your code, I suspect you are looking to send data as JSON; you can use gcloud local predict to simulate CloudML Engine's service. Or, if you prefer to write your own code, perhaps something like this:

    import base64
    import json
    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)
    
    def array_request_examples(input_arrays):
      """input_arrays is a list (batch) of NumPy arrays."""
      input_arrays = (a.astype(np.float32) for a in input_arrays)
      # Convert each array to a byte string
      bytes_strings = (a.tobytes() for a in input_arrays)
      # Base64 encode the data; decode to str so json.dumps can serialize it
      encoded = (base64.b64encode(b).decode('ascii') for b in bytes_strings)
      # Create a list of instances suitable to send to the service as JSON:
      instances = [{'audio_bytes': {'b64': e}} for e in encoded]
      # Create a JSON request
      return json.dumps({'instances': instances})
    
    def parse_request(request):
      # Plain Python (not TF) that simulates the CloudML service, which
      # does not expect this decoding step to be in the submitted graph.
      instances = json.loads(request)['instances']
      return [base64.b64decode(i['audio_bytes']['b64']) for i in instances]
    
    byte_strings = tf.placeholder(dtype=tf.string, shape=[None])
    decode = lambda raw_byte_str: tf.decode_raw(raw_byte_str, tf.float32)
    audio_samples = tf.map_fn(decode, byte_strings, dtype=tf.float32)
    
    audio_array = np.array([1, 2, 3, 4])
    request = array_request_examples([audio_array])
    fdict = {byte_strings: parse_request(request)}
    with tf.Session() as sess:
      tf_samples = sess.run([audio_samples], feed_dict=fdict)
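
    The request format can also be checked without TensorFlow. The following is a rough sketch of the JSON and base64 round trip, assuming the same `{'instances': [{'audio_bytes': {'b64': ...}}]}` layout used above; `build_request` is a hypothetical name chosen for this example. In Python 3, `base64.b64encode` returns `bytes`, so the result is decoded to `str` before it goes into `json.dumps`.

```python
import base64
import json
import struct

def build_request(byte_strings):
    # Hypothetical helper: wrap each raw byte string as a base64 JSON instance
    instances = [
        {"audio_bytes": {"b64": base64.b64encode(b).decode("ascii")}}
        for b in byte_strings
    ]
    return json.dumps({"instances": instances})

def parse_request(request):
    # Undo the JSON and base64 layers, returning the raw byte strings
    instances = json.loads(request)["instances"]
    return [base64.b64decode(i["audio_bytes"]["b64"]) for i in instances]

# Four little-endian float32 values standing in for one audio array
raw = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
request = build_request([raw])
recovered = parse_request(request)
print(recovered == [raw])
```

    The recovered byte strings are what would be fed to the string placeholder, so the graph itself only needs the raw-byte decoding step.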
    