Switching from MySQL to PostgreSQL - tips, tricks and gotchas?

mysql postgresql migration load-data-infile

19,345

Solution 1

Just went through this myself, well I still am...

Case sensitive text
Lack of INSERT IGNORE and REPLACE
Explicit casting needed almost everywhere
No backticks
LOAD DATA INFILE (COPY is close, but not close enough)
Change autoincrement to SERIAL
Although bad form in MySQL, in Postgres, an INNER JOIN without an ON clause can't happen, use CROSS JOIN or the like
COUNT(*) can be crazy slow
Databases are encoded with character sets, not tables
You can have multiple databases, with multiple schemas (MySQL really has just one database and multiple schema)
Partitioning is different
MySQL interval vs. Postgres interval (for time intervals)
Implicit column renaming, Postgres requires AS
Cannot update multiple tables at the same time in Postgres
Postgres functions are powerful. So there is no CALL proc();; rewrite proc() as a function and SELECT proc();.

Solution 2

It is going to be a massive task as you'll have to test your entire code-base - every single query, anywhere, for

Syntax
Correct behaviour (i.e. returns the same results)
Performance - e.g. are there any performance regressions / improvements, and can you handle them?
Error handling - they do not behave the same under error conditions, maybe your code was relying on specific error codes

Operationally you will need to look at:

Backup/restore
Disc space utilisation
Memory utilisation
One-off data migration - could be a big / time consuming task
Rollback plan for if it fails
Monitoring - how are you monitoring your MySQL, and can those methods be adapted
(If relevant) - replication

You will definitely have to do major amounts of performance testing before considering such a move.

These costs make moving to a different database too expensive for most nontrivial apps. Consider the benefits VERY carefully against the vast, vast costs of doing all of the above.

I would be surprised if it takes you less than three months, in a nontrivial application, during which time you won't be able to continue regular development.

Solution 3

You could try PostgreSQL gotchas that contains the most common issues. Generally, the PostgreSQL documentation is pretty good too, so keep that under your pillow as well.

Also, Converting from MySQL to PostgreSQL on the pgsql wiki.

Solution 4

I found this script that will connect to your MySQL database and your PostgreSQL database and just transfer the contents. It worked like a charm for me.

https://github.com/philipsoutham/py-mysql2pgsql

Installed by

$ pip install py-mysql2pgsql

Run

$ py-mysql2pgsql

in any folder, and it will create a template settings file for you (mysql2pgsql.yml) that you can edit and enter your databases' details in.

I had to install argparse for it to work.

$ pip install argparse

When your database details are filled in, just run it again

$ py-mysql2pgsql

in the same folder as the settings file, and wham, you are done. It didn't print anything to the screen, but my database was fully copied afterwards.

Solution 5

Before converting, set your MySQL to ANSI-strictness by starting the server with: --transaction-isolation=SERIALIZABLE --sql-mode=ANSI

Make sure you're not using MyIsam tables.

MySQL allows a lot of conversions it shouldn't; pg will require a cast.

Your stored procs, functions, and triggers will have to be re-written. pg gives you a choice of languages for these, but you have to install the languages; it's not as user friendly as MySQL.

pg will only allow in a select list columns that are in a group by or are aggregates; MySQL will cheat by selecting the first value in the group if you do this.

MySQL adds a bunch of extensions: the not-equal operator can be != as in C, it allows '&&' as a synonym for 'and', '||' for 'or' etc. In particular, pg uses '||' to mean string catenation.

Basically, pg is pretty strictly ANSI, MySQL isn't. I'd strongly suggest getting your MySQL to as strict an ANSI compliance as possible before converting to pg, then checking for any warnings when you run your applications.

View more solutions

19,345

Author by

SarahK

Hands-on technical leader and developer with 20 years of experience in technology. Currently into Rust and nim. polyglot programmer. cloud architecture. scaling, distributed systems, & databases. typescript | node.js | javascript | ruby | go | etc. many buzzwords. enthusiastic drummer. aspirational powerlifter. occasionally hilarious (citation needed).

Updated on June 04, 2022

Comments

SarahK almost 2 years

I am contemplating a switch from MySQL to PostgreSQL.

What are your tips, tricks and gotchas for working with PostgreSQL?

What should a MySQLer look out for?

See also: How different is PostgreSQL to MySQL?
See also: Migrate from MySQL to PostgreSQL

Note - I don't think this is a duplicate. In particular the type of answers are quite diffferent and the responses here have much more implementation detail, which is what I was looking for
SarahK about 15 years

This is a new project, so there is no cost in migration at this stage. But this is all excellent advice for consideration.
rfusca almost 14 years

@OMG Ponies: in modern PG versions, you don't have to use sequences explicitly for auto ids. The serial and bigserial types hide the semantics of it all but underneath its still a sequences and all their power. They make an auto named sequence and set the default value of the column to nextval.
OMG Ponies almost 14 years

I come from Oracle, so I have a preference for sequences. When was the serial/bigserial functionality introduced - 8.x?
rfusca almost 14 years

@OMG Ponies: The manual has it as far back as 7.1. The serial/bigserial is really just a "magic" trick. After the initial creation, it shows up as a integer/bigint and a sequence.
Mr_Chimp over 13 years

Are you saying that Count(*) is slow but that, say, Count(id) would work reasonably fast? Or that PostgreSQL's Count() is just generally slow?
Admin about 12 years

I just discovered that i could not add any posts or comment in my forum (the software that uses the database), but it was resolved by exporting the postgresql database to a file, drop and create the database, and importing the file. Who knows, i might just have had the wrong user permissions for the user in my test-db where i imported it the first time..
user2508301 almost 11 years

@Mr_Chimp count() is slow in Postgres because it has to go out and look at all rows on disk due to MVCC. In 9.2 you can speed it up due to the ability to use index-only-scans in some cases, but fact is that usually count() will be slower than in MySQL. The trade-off, however, is that you gain that readers and writers do not wait on each other which can be a tremendous advantage on some workloads.
deFreitas about 6 years

Information about performance increase of count(*) here and here