Delphi & Indy & utf8

Submitted by 懵懂的女人 on 2019-12-06 01:08:04

In Delphi 2009+, which includes XE6, string is a UTF-16 encoded UnicodeString.

You are using the overloaded version of TIdHTTP.Get() that returns a string. It decodes the received text to UTF-16 using whatever charset the response reports. If the text is not decoding properly, the response is most likely not reporting a correct charset.

The URL in question is, in fact, sending a response Content-Type header that is set to application/json without specifying a charset at all. The default charset for application/json is UTF-8, but Indy does not know that, so it ends up using its own internal default instead, which is not UTF-8. That is why the text is not decoding properly when non-ASCII characters are present.
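The effect of such a charset mismatch can be reproduced without any network traffic. The following is a minimal sketch, not Indy's actual code path: it assumes ISO-8859-1 (code page 28591) as a stand-in for "some single-byte default that is not UTF-8", and DecodeWith is a hypothetical helper introduced only for this illustration.

```pascal
program Utf8Mismatch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: decode raw bytes using the given code page.
function DecodeWith(const Raw: TBytes; CodePage: Integer): string;
var
  Enc: TEncoding;
begin
  // GetEncoding() returns a new instance that the caller must free.
  Enc := TEncoding.GetEncoding(CodePage);
  try
    Result := Enc.GetString(Raw);
  finally
    Enc.Free;
  end;
end;

var
  Raw: TBytes;
begin
  // 'café': the final character U+00E9 becomes two bytes (C3 A9) in UTF-8.
  Raw := TEncoding.UTF8.GetBytes('caf'#$00E9);
  Writeln(Length(Raw));                          // 5 bytes on the wire
  // Assumed stand-in for a non-UTF-8 default: every byte becomes one character.
  Writeln(Length(DecodeWith(Raw, 28591)));       // 5 characters, the last two are garbage
  // Decoding with the charset the sender actually used restores the text.
  Writeln(Length(TEncoding.UTF8.GetString(Raw))); // 4 characters
end.
```

Pure ASCII text survives either decoder because its bytes are identical in both encodings, which matches the observation that only non-ASCII characters come out wrong.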

If you KNOW the charset will always be UTF-8, you have a few workarounds to choose from:

  • you can set Indy's default charset to UTF-8 by setting the global GIdDefaultTextEncoding variable in the IdGlobal unit:

    GIdDefaultTextEncoding := encUTF8;
    
  • you can use the TIdHTTP.OnHeadersAvailable event to change the TIdHTTP.Response.Charset property to 'utf-8' if it is blank or incorrect.

    Web.OnHeadersAvailable := CheckResponseCharset;
    
    ...
    
    procedure TMyClass.CheckResponseCharset(Sender: TObject; AHeaders: TIdHeaderList; var VContinue: Boolean);
    var
      Response: TIdHTTPResponse;
    begin
      Response := TIdHTTP(Sender).Response;
      if IsHeaderMediaType(Response.ContentType, 'application/json') and (Response.Charset = '') then
        Response.Charset := 'utf-8';
      VContinue := True;
    end;
    
  • you can use the other overloaded version of TIdHTTP.Get() that fills an output TStream instead of returning a string. Using a TMemoryStream or TStringStream, you can decode the raw bytes yourself using UTF-8:

    MStrm := TMemoryStream.Create;
    try
      Web.Get(Url, MStrm);
      MStrm.Position := 0;
      // ReadStringFromStream() is in the IdGlobal unit; a size of -1 reads to the end of the stream
      Sito := ReadStringFromStream(MStrm, -1, IndyTextEncoding_UTF8);
    finally
      MStrm.Free;
    end;
    

    // Alternatively, let a TStringStream do the UTF-8 decoding:
    SStrm := TStringStream.Create('', TEncoding.UTF8);
    try
      Web.Get(Url, SStrm);
      Sito := SStrm.DataString;
    finally
      SStrm.Free;
    end;
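If the stream-based workaround is needed in more than one place, it could be wrapped in a small function. This is only a sketch under stated assumptions: GetUtf8Text is a hypothetical name, the unit names assume a Delphi version with scoped unit names (XE2 or later), and the body is decoded with the RTL's TEncoding rather than Indy's own helpers.

```pascal
uses
  System.SysUtils, System.Classes, IdHTTP;

// Hypothetical helper: download AUrl and decode the body as UTF-8,
// ignoring whatever charset the response reports (or fails to report).
function GetUtf8Text(AHttp: TIdHTTP; const AUrl: string): string;
var
  Strm: TBytesStream;
begin
  Strm := TBytesStream.Create;
  try
    // The TStream overload of Get() stores the raw bytes without decoding them.
    AHttp.Get(AUrl, Strm);
    // The Bytes array can be larger than the data written, so pass Size explicitly.
    Result := TEncoding.UTF8.GetString(Strm.Bytes, 0, Integer(Strm.Size));
  finally
    Strm.Free;
  end;
end;
```

A call such as Sito := GetUtf8Text(Web, Url); would then replace the string-returning Get(). This is only appropriate when the server is known to always send UTF-8, as stated above.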
    