Handling of Unicode Characters using Delphi 6

夕颜 2021-01-05 18:34

I have a polling application developed in Delphi 6. It reads a file, parses the file according to a specification, performs validation and uploads into database (SQL S

2 answers
  • 2021-01-05 18:41

    In non-Unicode Delphi versions, the basic rule is that you need to work with WideStrings (Unicode) instead of Strings (Ansi).

    Forget about TADOQuery.SQL (TStrings), and work with TADODataSet.CommandText or TADOCommand.CommandText (WideString), or typecast TADOQuery as TADODataSet, e.g.:

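    var
      stlTemp: TWideStringList; // <- Unicode strings - TNT or other Unicode lib
      qry: TADOQuery;
      stQuery: WideString;      // <- Unicode string
      RowsAffected: Integer;
    begin
      // CommandText is a WideString, unlike the Ansi TADOQuery.SQL (TStrings)
      TADODataSet(qry).CommandText := stQuery;
      RowsAffected := qry.ExecSQL;
    end;
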

    You can also use TADOConnection.Execute(stQuery) to execute queries directly.


    Be extra careful with parameterized queries: ADODB.TParameters.ParseSQL is Ansi. If ParamCheck is True (the default), TADOCommand.SetCommandText -> AssignCommandText will cause problems if your query is Unicode (InitParameters is Ansi).

    (Note that you can use the ADO Command.Parameters directly, using ? characters as parameter placeholders instead of Delphi's :param_name convention.)
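
    A minimal sketch of the ? placeholder approach, assuming a TADOCommand whose Connection is already assigned and open. The unit name, the procedure InsertWideValue, the table Messages, the column Body and the width 255 are all hypothetical. ParamCheck is switched off first so that the Ansi parameter parsing is never run over the command text.

    ```pascal
    unit WideParamSketch;

    interface

    uses
      DB, ADODB;

    // Hypothetical helper: inserts one Unicode value through a "?" placeholder.
    // Assumes Cmd.Connection is already assigned and open.
    procedure InsertWideValue(Cmd: TADOCommand; const Value: WideString);

    implementation

    procedure InsertWideValue(Cmd: TADOCommand; const Value: WideString);
    begin
      // Turn off Delphi's own parameter parsing before assigning the text.
      Cmd.ParamCheck := False;
      Cmd.Parameters.Clear;
      // Hypothetical table and column names.
      Cmd.CommandText := 'INSERT INTO Messages (Body) VALUES (?)';
      // ftWideString keeps the value as Unicode; 255 is an assumed column width.
      Cmd.Parameters.CreateParameter('Body', ftWideString, pdInput, 255, Value);
      Cmd.Execute;
    end;

    end.
    ```

    With ? placeholders the parameters are matched by position, so they have to be created in the order in which the placeholders appear in the command text.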


    QuotedStr returns an Ansi string. You need a Wide version of this function (TNT).
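
    If no Unicode library is at hand, a Wide counterpart is short enough to write yourself. The sketch below is an illustration, not the TNT implementation; the unit and function names are hypothetical.

    ```pascal
    unit WideQuoteSketch;

    interface

    // Hypothetical Wide counterpart of SysUtils.QuotedStr: wraps S in single
    // quotes and doubles every embedded single quote, with no Ansi round trip.
    function WideQuotedStrSketch(const S: WideString): WideString;

    implementation

    function WideQuotedStrSketch(const S: WideString): WideString;
    const
      Quote: WideChar = #39; // the apostrophe
    var
      I: Integer;
    begin
      Result := Quote;
      for I := 1 to Length(S) do
      begin
        if S[I] = Quote then
          Result := Result + Quote; // escape an embedded quote by doubling it
        Result := Result + S[I];
      end;
      Result := Result + Quote;
    end;

    end.
    ```

    If the target is Microsoft SQL Server, a Unicode literal also needs the N prefix (N'...'), otherwise the server treats it as a non-Unicode string. Passing the value as a parameter avoids quoting altogether.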


    Also, as @Arioch 'The mentioned, the TNT Unicode Controls suite is your best friend for making a Delphi Unicode application. It has all the controls and classes you need to successfully manage Unicode tasks in your application.

    In short, you need to think Wide :)

  • 2021-01-05 18:41
    1. You did not specify the database server, so that investigation is left to you. You should check how your database server supports Unicode: how to specify a Unicode charset for the database and for the tables, columns, indices, collations, etc. inside it. You have to ensure that the whole DB is Unicode-enabled in every detail to avoid data loss.

    2. You should also check that your database connection (through your database access library of choice) is Unicode-enabled. Microsoft ADO, like OLE in general, should be Unicode-enabled. Still, check your database server manual for how to specify a Unicode codepage or charset in the connection string. A non-Unicode connection may also result in data loss.

    3. Saying that you read "some Unicode file" is ambiguous. What is a Unicode file? Is it UTF-8? One of the four flavours of UTF-16? UTF-7? Some other Unicode Transformation Format? The usual Windows WideChar roughly corresponds to legacy UCS-2 and is expected to be the BOM-stripped, Intel-endian (little-endian) flavour of UTF-16. http://msdn.microsoft.com/en-us/library/windows/desktop/ms221069.aspx

    4. If the file is definitely that flavour of UTF-16, you can load it using Delphi's TWideStringList or the Jedi Code Library's TJclWideStringList. Review your code to make sure you never handle your data in string variables; use WideString everywhere to avoid data loss.
      Since D6 was one of the buggiest releases, I'd make sure EVERY update to Delphi is installed and then install and use JCL. JCL also provides codepage conversion functions, which might be more flexible than the plain AnsiStringVar := WideStringVar approach.
      A UTF-8 file can be loaded by the TWideStringList class of JCL (but not by TJclWideStringList).

    5. When debugging, load lines of the list into a WideString variable and check that their content is preserved.

    6. Don't write queries like that. See http://bobby-tables.com/ Even if you do not expect a malicious cracker, you can make errors yourself or meet unexpected data. Use parameterized queries everywhere, every time, always!
      See an example: http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/delphivclwin32/ADODB_TADOQuery_Parameters.html
      Check that every SQL VARCHAR parameter is ftWideString, not ftString, so that it can hold Unicode. Check the same for fields (columns).

    7. Consider whether legacy technologies can be cast aside, since supporting them will only get harder over time.

      7.1. Since Microsoft ADO is deprecated (for example, newer versions of Microsoft SQL Server would not support it), consider switching to 'live' data access libraries such as AnyDAC, UniDAC, ZeosDB or some other library. Torry.net may suggest some.

      7.2. Since the Delphi 6 RTL and VCL are not Unicode-ready, consider migrating your application to TNT Unicode Components, if you manage to find their free version or purchase them, or migrating to a newer Delphi release.

      7.3. Since Delphi 6 is very old and long unsupported, and since it was one of the buggiest Delphi releases, consider migrating to a newer Delphi version or to free tools like CodeTyphoon or Lazarus. As a bonus, Lazarus started moving to Unicode in its recent beta builds, and it is possible that by the end of the migration your application would be Unicode-ready.

      7.4. Migration might be an excuse and a stimulus for refactoring your application and getting rid of legacy spaghetti.
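
    For point 3, one way to find out which flavour a file claims to be is to look at its first bytes. The sketch below only checks for a byte order mark; the unit name, the TBomKind type and the DetectBom function are hypothetical, and a file without a BOM still has to be identified from its specification.

    ```pascal
    unit BomSketch;

    interface

    type
      TBomKind = (bkNone, bkUtf8, bkUtf16LE, bkUtf16BE);

    // Hypothetical helper: reports which byte order mark, if any, starts the file.
    function DetectBom(const FileName: string): TBomKind;

    implementation

    uses
      Classes, SysUtils;

    function DetectBom(const FileName: string): TBomKind;
    var
      Stream: TFileStream;
      Buf: array[0..2] of Byte;
      Count: Integer;
    begin
      Result := bkNone;
      FillChar(Buf, SizeOf(Buf), 0);
      Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
      try
        Count := Stream.Read(Buf, SizeOf(Buf)); // may return fewer than 3 bytes
      finally
        Stream.Free;
      end;
      if (Count >= 3) and (Buf[0] = $EF) and (Buf[1] = $BB) and (Buf[2] = $BF) then
        Result := bkUtf8
      else if (Count >= 2) and (Buf[0] = $FF) and (Buf[1] = $FE) then
        Result := bkUtf16LE  // the flavour that maps directly onto WideChar
      else if (Count >= 2) and (Buf[0] = $FE) and (Buf[1] = $FF) then
        Result := bkUtf16BE; // needs byte swapping before use as WideChar
    end;

    end.
    ```

    A UTF-32 little-endian file also starts with FF FE, so this check would report it as bkUtf16LE; treat the result as a hint rather than proof.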
