How to store multi byte characters in SQL Server database using CodeIgniter

16,357

Solution 1

Looks like this answer is getting a lot of attention, and I feel bad for not posting the actual solution to my problem... I'd guess it's bad etiquette to de-select an answer I selected many years ago so I won't for now. Here goes...

No changes to the settings are needed. The problem is query-related, and unfortunately CodeIgniter doesn't support the proper query format out of the box.

When you want to insert multibyte characters into your table, you have to prefix the string literal with the character N.

So in my example above, the query should look like this in order to work:

INSERT INTO test_table (title) VALUES (N'Iñtërnâtiônàlizætiøn')
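As a rough illustration of the N prefix (this is not the hack referred to in this answer, and the helper name mssql_unicode_literal is hypothetical, not part of CodeIgniter), a raw query could be assembled along these lines. The sketch doubles single quotes, which is how T-SQL escapes them inside a string literal; bound parameters are generally the safer choice where the driver supports them.

```php
<?php
// Hypothetical helper (not part of CodeIgniter): build a T-SQL Unicode
// string literal by doubling single quotes and adding the N prefix.
function mssql_unicode_literal(string $value): string
{
    return "N'" . str_replace("'", "''", $value) . "'";
}

$title = 'Iñtërnâtiônàlizætiøn';
$sql = 'INSERT INTO test_table (title) VALUES ('
     . mssql_unicode_literal($title) . ')';

echo $sql, PHP_EOL;
// In a CodeIgniter model, the string could then be passed to
// $this->db->query($sql).
```

The design choice here is to bypass Active Record for the affected statements only, since the query builder does not emit the N prefix itself.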

No, CI doesn't currently give you a built-in way to do this. It is planned to be added in CI4, but until then here is a hack for you

Solution 2

Try converting your input with iconv() before inserting it into the database:

$input = iconv('','UTF-8',$str);
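A slightly more defensive variant might name the source encoding explicitly and convert only when the input is not already valid UTF-8. This is a sketch under assumptions: the helper name to_utf8 is hypothetical, Windows-1252 is only a guess at what a Windows client might send, and mb_check_encoding() requires the mbstring extension.

```php
<?php
// Hypothetical helper: return $str as UTF-8, converting only when needed.
// The default source encoding (Windows-1252) is an assumption.
function to_utf8(string $str, string $assumedSource = 'Windows-1252'): string
{
    // Already valid UTF-8 (this includes plain ASCII): leave it untouched.
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str;
    }

    // iconv() returns false when the input cannot be converted.
    $converted = iconv($assumedSource, 'UTF-8', $str);
    if ($converted === false) {
        throw new InvalidArgumentException('Input could not be converted to UTF-8');
    }

    return $converted;
}

// "\xF1" is the single byte for "ñ" in Windows-1252.
$input = to_utf8("\xF1");
```

Checking first avoids double-encoding text that was UTF-8 all along, which is the usual risk of converting unconditionally.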

Solution 3

Handling encoding in Microsoft's SQL Server from PHP can be quite painful. The CharacterSet option was introduced in version 1.1 of the Microsoft SQL Server Driver for PHP (SQLSRV), so there is an off chance that you are using an outdated version that does not support setting CharacterSet, although that is unlikely. Changing char_set to UTF-16 is not an option, as SQLSRV only supports UTF-8.
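For reference, when talking to SQLSRV directly (outside CodeIgniter), the CharacterSet option is passed in the connection options array. The following is only a sketch: the function names, server name, database name, and credentials are placeholders and not values from the question.

```php
<?php
// Build the options array for sqlsrv_connect(). Database name and
// credentials are placeholders, not values from the original post.
function utf8_connection_options(string $database, string $user, string $password): array
{
    return [
        'Database'     => $database,
        'UID'          => $user,
        'PWD'          => $password,
        'CharacterSet' => 'UTF-8', // exchange strings with the driver as UTF-8
    ];
}

// Hypothetical wrapper: connect and print driver errors on failure.
// Requires the sqlsrv extension to be loaded.
function connect_utf8(string $server, array $options)
{
    $conn = sqlsrv_connect($server, $options);
    if ($conn === false) {
        print_r(sqlsrv_errors());
    }
    return $conn;
}

$options = utf8_connection_options('my_database', 'my_user', 'my_password');
// Usage, assuming a reachable server:
// $conn = connect_utf8('localhost', $options);
```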

More likely one of the following applies:

  • in your php.ini the option default_charset is not set to UTF-8
  • as you are probably working on a Windows machine, your .php file is not encoded in UTF-8.
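Both conditions can be checked with a few lines of PHP. This is a minimal sketch, assuming the mbstring extension is available; the function name encoding_report is hypothetical. Note that a file containing only ASCII characters also passes the UTF-8 check, so the second result is only informative for files that contain non-ASCII text.

```php
<?php
// Hypothetical helper: report whether default_charset is UTF-8 and
// whether the given file contains only valid UTF-8 byte sequences.
function encoding_report(string $phpFile): array
{
    $contents = (string) file_get_contents($phpFile);

    return [
        'default_charset_is_utf8' =>
            strcasecmp((string) ini_get('default_charset'), 'UTF-8') === 0,
        'file_is_valid_utf8' => mb_check_encoding($contents, 'UTF-8'),
    ];
}

// Check the script itself; pass any other path to check a different file.
print_r(encoding_report(__FILE__));
```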

If this does not solve the problem, then your input probably contains one or more characters that are not valid UTF-8. In this case, try converting your (user) input with iconv().

edit: Regarding @Markus's comment: CodeIgniter's system/database/drivers/sqlsrv/sqlsrv_driver.php looks like a simple wrapper around the sqlsrv functions, so it seems unlikely that the problem is caused by CodeIgniter code.

Loupax
Author by

Updated on June 05, 2022

Comments

  • Loupax
    Loupax almost 2 years

    I'm using MS SQL Server and CodeIgniter 2 with Active Record for a project I'm working on, and I just stumbled upon this issue:

    When I submit a form that contains Chinese or Hindi characters, I store it in a table, and when I view it all I get are question marks. If I try English or Greek characters, everything seems to work fine.

    I believe this has something to do with the PHP I'm writing, because if I copy-paste the Chinese text directly into SQL Server Management Studio, all values are stored and displayed perfectly, both in Management Studio and in the web application.

    These are the db settings I'm using:

    $db['local']['dbdriver'] = 'sqlsrv';
    $db['local']['dbprefix'] = '';
    $db['local']['pconnect'] = FALSE;
    $db['local']['db_debug'] = TRUE;
    $db['local']['cache_on'] = FALSE;
    $db['local']['cachedir'] = '';
    $db['local']['char_set'] = 'utf8';
    $db['local']['dbcollat'] = 'utf8_general_ci';
    $db['local']['swap_pre'] = '';
    $db['local']['autoinit'] = TRUE;
    $db['local']['stricton'] = FALSE;
    

    This is the structure of the table I'm testing on right now:

    CREATE TABLE [dbo].[languages](
        [id] [int] IDENTITY(1,1) NOT NULL,
        [language] [nvarchar](1024) NULL,
        [language_local] [nvarchar](1024) NULL,
        [lang_code] [nvarchar](100) NULL,
        [core] [bit] NULL,
     CONSTRAINT [PK_languages] PRIMARY KEY CLUSTERED 
    (
        [id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    
    GO
    

    And this is my charset encoding in config.php

    $config['charset'] = 'utf-8';
    

    New troubleshooting data

    I tried to save the following string through my form: Iñtërnâtiônàlizætiøn

    CodeIgniter replied with this error:

    An error occurred translating the query string to UTF-16: No mapping for the Unicode character exists in the target multi-byte code page. .
    

    This error doesn't appear when I try to store Chinese characters. Thank you in advance :)

  • Loupax
    Loupax about 12 years
    These settings make me unable to connect to my database. The problem is that there is no way of setting manually the default collation of MSSQL... Not in the same way I can in MySQL at least...
  • Loupax
    Loupax about 12 years
    That indeed sounds like pain... In other words, I'll have to check whether my inputs contain UTF-16 characters and, if they do, convert them to UTF-8 prior to submitting them to the database? Ouch... I'll try updating my sqlsrv driver and accept this answer once confirmed!
  • Diego Vieira
    Diego Vieira almost 10 years
    Only converting with iconv, as proposed by @Amin_Adha, worked. Setting default_charset makes no difference (I'm not using CodeIgniter, btw).
  • low_rents
    low_rents over 8 years
    This is bad practice. If you have set up everything correctly (database interface, encoding of your files), then you NEVER EVER need iconv() or any other encoding-conversion function in PHP.
  • Loupax
    Loupax almost 7 years
    You really should not change the user input when storing it into the database.