Delphi 7 - Why does Windows 7 change encoding of characters in runtime?

Solution 1

The answers to this question solved my problem:

GetThreadLocale returns different value than GetUserDefaultLCID?

One solution:

The strange thing we found is that switching to a different regional settings via Control Panel and then switching back to NZ resolves the issue. I'd be curious to know if the same workaround resolves it for you just to verify that we're seeing the same phenomenon.

and the second:

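initialization
  // Reset the thread locale to the user's default (Windows unit),
  // then reload the date/number format settings (SysUtils unit).
  SetThreadLocale(LOCALE_USER_DEFAULT);
  GetFormatSettings;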

Both solutions work well, and the problem with the application disappears.
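
To check whether the mismatch described in the linked question is present on a given machine, one can compare the two locale identifiers before applying the fix. This is a minimal sketch, assuming a small console test program (the program name is hypothetical). GetThreadLocale and GetUserDefaultLCID are Windows API functions declared in the Windows unit.

```pascal
program CheckThreadLocale; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  Windows;

begin
  // Both functions return an LCID (a 32-bit locale identifier).
  Writeln('Thread locale:       ', GetThreadLocale);
  Writeln('User default locale: ', GetUserDefaultLCID);

  if GetThreadLocale <> GetUserDefaultLCID then
    Writeln('The thread locale differs from the user default.')
  else
    Writeln('The thread locale matches the user default.');
end.
```

If the two values differ, that would be consistent with the situation in which the SetThreadLocale call above helps.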

Solution 2

I reproduced the behavior in Delphi 2010 on Windows XP.

procedure TForm1.Button1Click(Sender: TObject);
begin
  // The AnsiString cast forces a Unicode-to-ANSI conversion of the caption.
  ShowMessage(AnsiString(Label1.Caption));
end;

In this situation, the conversion of Label1.Caption to AnsiString is done through WideCharToMultiByte (Windows API).

The API documentation includes the following note:

The ANSI code pages can be different on different computers, or can be changed for a single computer, leading to data corruption. For the most consistent results, applications should use Unicode, such as UTF-8 or UTF-16, instead of a specific code page, unless legacy standards or data formats prevent the use of Unicode. If using Unicode is not possible, applications should tag the data stream with the appropriate encoding name when protocols allow it. HTML and XML files allow tagging, but text files do not.

So, my best guess is that the difference in behavior comes from the fact that your Windows 7 installation has a different active code page than your Vista/XP stations.
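
The effect of the target code page can be illustrated with a small console sketch that calls WideCharToMultiByte directly. The program and helper names (CodePageDemo, WideToCodePage, PrintCodes) are hypothetical, and the choice of code pages 1250 (Central European) and 1252 (Western European) is an assumption made for illustration. In code page 1250 the five Polish letters below encode as 165 185 202 234 140. With code page 1252, which has no such letters, Windows may substitute best-fit plain letters, which would match the 65 97 69 101 83 reported in the question.

```pascal
program CodePageDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  Windows;

// Converts a wide string to bytes in the given code page.
function WideToCodePage(const W: WideString; CodePage: UINT): AnsiString;
var
  Len: Integer;
begin
  Result := '';
  if Length(W) = 0 then
    Exit;
  // First call: ask for the required buffer size in bytes.
  Len := WideCharToMultiByte(CodePage, 0, PWideChar(W), Length(W),
    nil, 0, nil, nil);
  if Len <= 0 then
    Exit;
  SetLength(Result, Len);
  // Second call: perform the actual conversion.
  WideCharToMultiByte(CodePage, 0, PWideChar(W), Length(W),
    PAnsiChar(Result), Len, nil, nil);
end;

// Prints the byte value of every character in S.
procedure PrintCodes(const Title: string; const S: AnsiString);
var
  I: Integer;
begin
  Write(Title);
  for I := 1 to Length(S) do
    Write(' ', Ord(S[I]));
  Writeln;
end;

var
  W: WideString;
begin
  // Five Polish letters, built from their Unicode code points so that
  // the source file encoding does not matter.
  SetLength(W, 5);
  W[1] := WideChar($0104); // A with ogonek
  W[2] := WideChar($0105); // a with ogonek
  W[3] := WideChar($0118); // E with ogonek
  W[4] := WideChar($0119); // e with ogonek
  W[5] := WideChar($015A); // S with acute

  PrintCodes('Code page 1250:', WideToCodePage(W, 1250));
  PrintCodes('Code page 1252:', WideToCodePage(W, 1252));
end.
```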

I still have to find out how to get the active code page on a system. My best guess is that it is defined in the regional settings in Control Panel, but I still need to verify this.
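
One way to check this, offered as a sketch rather than a verified answer for every Windows version: the Windows API exposes the active ANSI code page through GetACP, declared in the Windows unit. The program name below is hypothetical.

```pascal
program ShowActiveCodePage; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  Windows;

begin
  // GetACP returns the identifier of the system's active ANSI code page.
  Writeln('Active ANSI code page: ', GetACP);
end.
```

On a machine whose language for non-Unicode programs is set to Polish, one would expect this to print 1250. A different value would support the guess above.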

Solution 3

You ran into what I consider a "bug" in the TWriter.WriteString and TReader.ReadString methods. Delphi uses those two methods internally to move your TLabel.Caption from the live object at design time into the DFM file, and then back into the live object at run time.

If you look at the code of those two routines, you'll notice (in shock, I assume) that the text that goes into the stream is converted to Unicode using the operating system's default code page! That's fine and dandy as long as the code page used on the development machine exactly matches the code page used on the test machine. They probably don't match, and that's most likely why you get the error. Please note that the EASTEUROPE_CHARSET you're setting for the Caption on the form has no effect here, because the TWriter.WriteString method has no idea about it!

I've got a bug report on this issue in QC; it's been there for many years. They probably think it's "by design", but I don't think it's a very good design.

The solution I'd recommend is a fast switch to Delphi 2010. I'm a Delphi developer in Romania, and I've had lots and lots of problems with this kind of stuff, but now it's all in the past because Delphi 2010 is Unicode, so I no longer need to worry about code page conversions.

If you can't switch to Delphi 2010 you might want to "hack" the Classes.pas file and change the TReader.ReadString routine to always do the conversion using YOUR code page, not the system default.
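
The idea of converting with a fixed code page instead of the system default can be sketched outside Classes.pas. This is not the actual TReader.ReadString code, and the direction of the conversion needed inside the RTL may differ. It only illustrates passing an explicit code page to the Windows API. The helper AnsiToWideCP, the constant MY_CODE_PAGE and the value 1250 are assumptions made for illustration.

```pascal
program FixedCodePageDecode; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

const
  MY_CODE_PAGE = 1250; // assumption: Central European (Polish, Romanian)

// Decodes ANSI bytes using an explicit code page rather than CP_ACP.
function AnsiToWideCP(const S: AnsiString; CodePage: UINT): WideString;
var
  Len: Integer;
begin
  Result := '';
  if Length(S) = 0 then
    Exit;
  // First call: ask for the required length in wide characters.
  Len := MultiByteToWideChar(CodePage, 0, PAnsiChar(S), Length(S), nil, 0);
  if Len <= 0 then
    Exit;
  SetLength(Result, Len);
  // Second call: perform the actual conversion.
  MultiByteToWideChar(CodePage, 0, PAnsiChar(S), Length(S),
    PWideChar(Result), Len);
end;

var
  Raw: AnsiString;
  W: WideString;
  I: Integer;
begin
  // The five byte values reported for the correctly displayed label.
  SetLength(Raw, 5);
  Raw[1] := AnsiChar(165);
  Raw[2] := AnsiChar(185);
  Raw[3] := AnsiChar(202);
  Raw[4] := AnsiChar(234);
  Raw[5] := AnsiChar(140);

  W := AnsiToWideCP(Raw, MY_CODE_PAGE);

  // Print the resulting Unicode code points in hexadecimal.
  for I := 1 to Length(W) do
    Write(IntToHex(Ord(W[I]), 4), ' ');
  Writeln;
end.
```

Because the code page is passed explicitly, the result should not depend on the regional settings of the machine that runs the program, provided code page 1250 is installed.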

Author: LukLed

My name is Łukasz Ledóchowski. I work almost daily with: ASP.NET MVC / Entity Framework / KendoUI ASP.NET WebApi / Angular.JS / CQRS PHP / Kohana / MySQL

Updated on June 05, 2022

Comments

  • LukLed

    I have a delphi 7 form:

    Form

    and my code:

    Code

    when I run this form in Windows 7, I see:

    Windows7Form

    At design time, the form has Polish letters in the first label, but they are missing at runtime. It looks OK on Vista and Windows XP. When I set the caption of the second label in code, everything works fine and the characters are properly encoded.

    First 5 codes of top label on Windows 7: 65 97 69 101 83

    First 5 codes of top label on Windows Vista/XP: 165 185 202 234 140

    First 5 codes of bottom label on every system: 165 185 202 234 140

    Why does Windows 7 change the encoding? My system settings seem to be OK: I have the proper language set for non-Unicode applications in Control Panel.

    EDIT

    This problem is not limited to labels on forms. It also occurs with FastReport (where switching to EASTEUROPE_CHARSET resolves the problem) and when accessing Microsoft Excel through the COM interface.