Decoding the JSON output from the Microsoft Translator API with PHP

耶瑟儿~ 2021-01-13 23:39

This issue seems specific to microsofttranslator.com, so please ... any answers, if you can test against it ...

Using the following URL for translat

2 answers
  • 2021-01-14 00:05

    The API response starts with a byte order mark (BOM), which json_decode() does not accept.
    The string data itself is UTF-8, but it is prefixed with U+FEFF, the BOM character, which UTF-8 encodes as the three bytes EF BB BF. Strip those first three bytes, then call json_decode().

    ...
    $output = curl_exec($ch);
    // Insert some sanity checks here... then,
    $output = substr($output, 3);
    ...
    $decoded = json_decode($output, true);
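
The unconditional substr() above removes three bytes even if the service stops sending a BOM. A slightly more defensive sketch checks for the BOM first. The helper name strip_utf8_bom is hypothetical (it is not part of PHP or of the original answer), and the response body is simulated rather than fetched from the API.

```php
<?php
// Hypothetical helper (not from the original answer): remove a leading
// UTF-8 BOM (bytes EF BB BF) only when it is actually present.
function strip_utf8_bom(string $body): string
{
    // strncmp() is binary safe, so it can compare raw bytes.
    if (strncmp($body, "\xEF\xBB\xBF", 3) === 0) {
        return substr($body, 3);
    }
    return $body;
}

// Simulated response body: a BOM followed by a small JSON array.
$output = "\xEF\xBB\xBF" . '[{"From":"en","TranslatedText":"ok"}]';

$decoded = json_decode(strip_utf8_bom($output), true);
var_dump($decoded[0]['From']); // string(2) "en"
```

With this check in place, a response without a BOM passes through unchanged instead of losing its first three characters.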
    

    Here's the entirety of my test code.

    $texts = array("i am the best" => 0, "you are the best" => 0);
    $ch = curl_init(); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $data = array(
        'appId' => $bing_appId,
        'from' => 'en',
        'to' => 'zh-CHS',
        'texts' => json_encode(array_keys($texts))
        );
    curl_setopt($ch, CURLOPT_URL, $bingArrayUrl . '?' . http_build_query($data)); 
    $output = curl_exec($ch);
    $output = substr($output, 3);
    print_r(json_decode($output, true));
    

    Which gives me

    Array
    (
        [0] => Array
            (
                [From] => en
                [OriginalTextSentenceLengths] => Array
                    (
                        [0] => 13
                    )
    
                [TranslatedText] => 我是最好的
                [TranslatedTextSentenceLengths] => Array
                    (
                        [0] => 5
                    )
    
            )
    
        [1] => Array
            (
                [From] => en
                [OriginalTextSentenceLengths] => Array
                    (
                        [0] => 16
                    )
    
                [TranslatedText] => 你是最好的
                [TranslatedTextSentenceLengths] => Array
                    (
                        [0] => 5
                    )
    
            )
    
    )
    

    Wikipedia entry on BOM
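
To check whether a response really begins with a BOM, it can help to dump its first bytes in hex. This is a minimal sketch with a simulated body; $output stands in for whatever curl_exec() returned and is an assumption, not captured API output.

```php
<?php
// Simulated response body (assumption): UTF-8 BOM followed by an empty JSON array.
$output = "\xEF\xBB\xBF" . '[]';

// Hex-encode the first three bytes; a UTF-8 BOM shows up as "efbbbf".
$prefix = bin2hex(substr($output, 0, 3));
echo $prefix, "\n"; // efbbbf

$hasBom = ($prefix === 'efbbbf');
var_dump($hasBom); // bool(true)
```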

  • 2021-01-14 00:07

    There is nothing syntactically wrong with your JSON string. It is possible that the response contains bytes that are not valid UTF-8, but in that case json_decode() returns null and json_last_error() reports the reason. It does not throw an exception unless you pass the JSON_THROW_ON_ERROR flag (PHP 7.3+).

    Test Code:

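    <?php
    // The JSON from the question, pasted as a string literal, so no
    // transport-level bytes (such as a BOM) are involved in this test.
    $json = '
     [
          {
               "From":"en",
               "OriginalTextSentenceLengths":[13],
               "TranslatedText":"我是最好的",
               "TranslatedTextSentenceLengths":[5]
          },
          {
               "From":"en",
               "OriginalTextSentenceLengths":[16],
               "TranslatedText":"你是最好的",
               "TranslatedTextSentenceLengths":[5]
          }
     ]
    ';
    
    $out = json_decode($json, true);
    
    // json_decode() signals failure by returning null; json_last_error_msg()
    // gives the reason ($php_errormsg is not set by json_decode()).
    if ($out === null && json_last_error() !== JSON_ERROR_NONE) {
            throw new Exception(json_last_error_msg());
    } else {
            print_r($out);
    }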
    

    Output:

    $ php -f jsontest.php 
    Array
    (
        [0] => Array
            (
                [From] => en
                [OriginalTextSentenceLengths] => Array
                    (
                        [0] => 13
                    )
    
                [TranslatedText] => 我是最好的
                [TranslatedTextSentenceLengths] => Array
                    (
                        [0] => 5
                    )
    
            )
    
        [1] => Array
            (
                [From] => en
                [OriginalTextSentenceLengths] => Array
                    (
                        [0] => 16
                    )
    
                [TranslatedText] => 你是最好的
                [TranslatedTextSentenceLengths] => Array
                    (
                        [0] => 5
                    )
    
            )
    
    )
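
The test above only shows that the pasted JSON is valid; it does not exercise a failing decode. As a rough sketch of how a failure surfaces in PHP, the example below feeds json_decode() a body that it rejects (valid JSON preceded by a UTF-8 BOM, simulated here rather than taken from the API) and reports the error in two ways. The second style assumes PHP 7.3 or later.

```php
<?php
// Simulated body (assumption): valid JSON preceded by a UTF-8 BOM.
$json = "\xEF\xBB\xBF" . '[{"From":"en"}]';

// Style 1: inspect json_last_error() after the call.
$out = json_decode($json, true);
if ($out === null && json_last_error() !== JSON_ERROR_NONE) {
    echo 'decode failed: ', json_last_error_msg(), "\n";
}

// Style 2 (PHP 7.3+): ask json_decode() to throw a JsonException.
try {
    $out = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
} catch (JsonException $e) {
    echo 'decode threw: ', $e->getMessage(), "\n";
}
```

Comparing against null alone is not enough in general, because the JSON literal null also decodes to null; checking json_last_error() as well avoids that false alarm.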
    