Delphi dbExpress and Interbase: UTF8 migration steps and risks?

﹥>﹥吖頭↗ 提交于 2019-11-27 18:51:50

问题


Currently, our database uses Win1252 as the only character encoding. We will have to support Unicode in the database tables soon, which means we have to perform this migration for four databases and around 80 Delphi applications which run in-house in a 24/7 environment. Are there recommendations for database migrations to UTF-8 (or UNICODE_FSS) for Delphi applications? Some questions listed below. Many thanks in advance for your answers!

  • are there tools which help with the migration of the existing databases (sizes between 250 MB and 2 GB, no Blob fields), by dumping the data, recreating the database with UNICODE_FSS or UTF-8, and loading the data back?
  • are there known problems with Delphi 2009, dbExpress and Interbase 7.5 related to Unicode character sets?
  • would you recommend to upgrade the databases to Interbase 2009 first? (This upgrade is planned but does not have a high priority)
  • can we simply migrate the database and Delphi will handle the Unicode character sets automatically, or will we have to change all character field types in every Datamodule (dfm and source code) too?
  • which strategy would you recommend to work on the migration in parallel with the normal development and maintenance of the existing application? The application runs in-house so development and database administration is done internally.

Update: From InterBase discussion forum thread: Unicode Databases in InterBase - Really? (it is not a thread by me but it shows that some problems still exist in InterBase XE).

Here are some reports I've submitted: QC#92867 - String fields are blank coming from Views only if the View includes a Union, and when using a ClientDataSet. This was found as missing data on a few of my reports, which no longer work.

QC#91494 - IB Character column data Character fields (eg: Char(1)) are padded with blanks when retrieved through a stored procedure. Tests fail - eg: If Active = "Y". I make heavy use of stored procedures with forms and these do not work.

QC#91355 - IBSqlMonitor fails. The output of IBSqlMonitor is somewhat garbled making this tool useless. (So, even my shovel is broken!)

Unreported - Persistent fields in TClientDataSet fail for TWideString.

Other related QC entries:

QC#94455 SQL Unicode Char Type Failure (InterBase XE)


回答1:


Both Database Workbench and IBExpert can do the data migration for you.

I'll get back to you on the other questions when I'm at the Entwickler Tage.

--jeroen




回答2:


Problem: UPDATE on a empty string field no longer finds a record. If a UTF8 character field is empty, the DataSetProvider generates a wrong SELECT for the update action.

Symptom: Message 'record not found or edited by another user'

Solution: upgrade to Delphi 2010 Update 4 or use the workaround described in QC




回答3:


Problem: CHAR fields no longer work and have to be replaced with VARCHAR.

Symptom: SELECT queries on a column which now uses UTF8 and is imported from WIN1252 with ASCII values no longer returns any value. Maybe this is a bug which I should report in QC.

Solution: replace all occurences of CHAR( in the database metadata DDL script with VARCHAR(




回答4:


Problem: persistent string fields require a Size property which is the logical size of the field multiplied by four (see also: Is it possible to tweak TStringField to work like TWideStringField in Delphi?)

Symptom: Access violations

Solution: delete the persistent field and add it again to update the Size property. (side effect: the DisplayWidth will also increase size, leading to problems with UI)




回答5:


Problem: UDF (user defined functions) with string parameters can break because of sizes limits.

Symptom:

Dynamic SQL Error.
SQL error code = -204.
Data type unknown.
Implementation limit exceeded.
COLUMN DSQL internal.

for this UDF:

DECLARE EXTERNAL FUNCTION STRLEN
    CSTRING(32767)
    RETURNS INTEGER BY VALUE
    ENTRY_POINT 'IB_UDF_strlen' MODULE_NAME 'ib_udf';

Solution: fix UDF parameters in the declaration.




回答6:


Problem: dbExpress uses WideString as data type internally, so all existing .AsString calls for reading / setting field and parameter will no longer work

Symptom: special characters will not be stored / read correctly

Solution: replace all occurences of .AsString with .AsWideString but be careful to not change where the AsString method is not called on a field or parameter.




回答7:


Problem: dbExpress needs TStringField objects for WIN1252 fields. For UTF8 database fields, dbExpress needs TWideStringField objects.

Symptom: error message 'expected: WideString found: string'

Solution: replace all occurences of TStringField with TWideStringField. This requires that all form files (dfm) are text, not binary. The modified forms and datamodules will not be backwards compatible.




回答8:


Problem: exporting metadata and table data for a WIN1252 database will create a CP1252 encoded file, but for the import, a UTF8 file is required (tested with IBExpert)

Symptom: errors in the script import to InterBase

Solution: use iconv to convert the script file to UTF8



来源:https://stackoverflow.com/questions/2302670/delphi-dbexpress-and-interbase-utf8-migration-steps-and-risks

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!