How do I specify a collation when creating a table from a SELECT in MySQL?

Solution 1

CREATE TABLE my_table (
    ... none, some, or all columns ...
) ... none, some, or all specifications ...  -- Here
    SELECT ...;

If that fails, then

CREATE TABLE my_table ( all columns ) all specs;
INSERT INTO my_table
    SELECT ...;

Or...

CREATE ... SELECT ...;
ALTER TABLE my_table ...;  -- any fixes needed
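As a rough sketch of that last variant, assuming the goal is utf8 with utf8_unicode_ci (the collation named in the question), the fix-up step could convert every character column at once. The SELECT stays elided, as above:

```sql
-- Create the table from the query first; the columns get whatever
-- character set and collation the SELECT expressions produce.
CREATE TABLE my_table
    SELECT ...;

-- Then convert all character columns and the table default in one step.
-- Assumption: utf8 / utf8_unicode_ci is the target.
ALTER TABLE my_table
    CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
```

CONVERT TO rewrites the table, so on a large table this step may take a while.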

Yet another approach is to make sure that the DATABASE has the desired DEFAULT CHARACTER SET and/or DEFAULT COLLATE before running the CREATE TABLE ... statement.
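A minimal sketch of setting and then checking the database defaults, assuming a database named my_db (the name used in the question). Note that these defaults only apply to columns that do not get a character set from elsewhere; columns built from SELECT expressions may still carry the collation of the expression:

```sql
-- Set the defaults for tables created in my_db from now on.
ALTER DATABASE my_db
    DEFAULT CHARACTER SET utf8
    DEFAULT COLLATE utf8_unicode_ci;

-- Confirm what the database currently reports as its defaults.
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'my_db';
```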

Whatever you do, follow it with

SHOW CREATE TABLE my_table;

to verify that each column has the desired CHARACTER SET and COLLATION. If a column shows neither, it inherits the table's DEFAULTs, so check those.
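For a table with many columns, the same check can be sketched as a query against information_schema, assuming the table lives in the currently selected database:

```sql
-- List the character set and collation of every column in my_table.
-- Non-character columns (INT, FLOAT, ...) show NULL in both fields.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'my_table'
ORDER BY ORDINAL_POSITION;
```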

Example

CREATE TABLE `us` (
  `id` int(10) unsigned NOT NULL DEFAULT '0',
  `country` char(2) CHARACTER SET ascii NOT NULL COMMENT 'ISO 3166 Country Code',
  `ascii_city` varchar(100) CHARACTER SET ascii NOT NULL COMMENT 'Name of city or town in ASCII encoding',
  `city` varchar(100) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `state` char(2) CHARACTER SET ascii NOT NULL COMMENT 'For US, ISO-3166-2 code for the state/province name.',
  `population` int(10) unsigned NOT NULL DEFAULT '0',
  `lat` float NOT NULL,
  `lng` float NOT NULL,
  PRIMARY KEY (`id`),
  KEY `state` (`state`,`city`),
  KEY `state_2` (`state`,`population`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Notice the explicit CHARACTER SET and COLLATE on some columns, thereby overriding the DEFAULT CHARSET of the table.

When no collation is specified, CHARSET utf8 defaults to COLLATE utf8_general_ci.

Example 2

Here I create a new table us2 from that table, but change the collation of city:

CREATE TABLE us2 (
  `city` varchar(100) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL
)
    SELECT * FROM us;

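The columns and data of us2 will match those of us except for the collation of that one column. Note that CREATE TABLE ... SELECT does not create indexes automatically, so the keys of us must be declared in the column list or added afterward.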

Solution 2

The following did work for me on MariaDB version 10:

CREATE TABLE my_table ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
SELECT *
FROM other_table
;
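When the columns come from expressions rather than plain columns, another option is to attach the collation to each expression in the SELECT. This is only a sketch: the columns id and name on other_table are hypothetical, and the CONVERT step assumes the expression may not already be utf8mb4:

```sql
CREATE TABLE my_table ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
SELECT
    id,  -- hypothetical column
    -- Convert the computed string to utf8mb4, then pin its collation;
    -- the new column takes the collation of the expression.
    CONVERT(UPPER(name) USING utf8mb4) COLLATE utf8mb4_unicode_ci AS name_upper
FROM other_table;
```

Follow it with SHOW CREATE TABLE my_table to confirm the result.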
Author by

Streamline

Updated on July 13, 2022

Comments

  • Streamline, almost 2 years

    When creating a table with dynamically created and defined columns from a select result, how can I specify collation as you do when creating a table with column definitions?

    i.e.

    CREATE TABLE IF NOT EXISTS my_table (
      SELECT * FROM (
        SELECT ....
      ) )
    

    The above creates the table with the dynamic columns just fine but it is not using the collation I want. It is using utf8_general_ci and I want it to be utf8_unicode_ci.

    The following doesn't work:

    CREATE TABLE IF NOT EXISTS my_table (
      SELECT * FROM (
        SELECT ....
      ) ) DEFAULT CHARSET=utf8 DEFAULT COLLATE utf8_unicode_ci;
    

    It gives an error message indicating that the commands are not valid at this position.

    I realize the above suggestion is not valid, so what is a valid method for achieving this? I've tried setting the database collation, thinking this would set a default for all tables created in the database, but it doesn't appear to behave like that. Or maybe I am not actually setting the default for the database when I ALTER the database collation. Is there a way to set a default for tables so that the first code block above creates the table using the desired collation?

    The SELECT ... in the above code blocks stands for a variety of conditional/logical value setting using CASE/WHEN statements, string parsing, and function results. It is not just selecting columns one-to-one from another table that has defined columns, so it is not as simple as defining the other table's columns.

    I have already tried setting the database collation ahead of time using ALTER DATABASE my_db DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_unicode_ci, but this doesn't change the results. The tables, and the columns in them, are still created using utf8_general_ci when creating a table from a select (dynamically defined columns from select results).