What is an appropriate data type to store a timezone?

Solution 1

Unfortunately PostgreSQL doesn't offer a time zone data type, so you should probably use text.

interval seems like a logical option at first glance, and it is appropriate for some uses. However, it does not account for daylight saving time, and it ignores the fact that different regions at the same UTC offset have different DST rules.

There is not a 1:1 mapping from UTC offset back to time zone.

For example, the time zone for Australia/Sydney (New South Wales) is UTC+10 (EST), or UTC+11 (EDT) during daylight saving time. Yes, that's the same acronym EST that the USA uses; time zone acronyms are non-unique in the tzdata database, which is why Pg has the timezone_abbreviations setting. Worse, Brisbane (Queensland) is at almost the same longitude and is in UTC+10 EST ... but doesn't have daylight saving, so it is sometimes one hour behind New South Wales, namely during NSW's DST.

(Update: More recently Australia adopted an A prefix, so it uses AEST as its eastern states TZ acronym, but EST and WST remain in common use).

Confusing much?

If all you need to store is a UTC offset then an interval is appropriate. If you want to store a time zone, store it as text. It's a pain to validate and to convert to a time zone offset at the moment, but at least it copes with DST.
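
To see the Sydney/Brisbane difference for yourself outside the database, here is a minimal Python sketch. It assumes Python 3.9+ (for the standard zoneinfo module) and an installed tz database; the helper name utc_offset_hours and the two sample dates are made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Two sample instants, expressed in UTC (arbitrary dates chosen for illustration).
JANUARY = datetime(2024, 1, 15, tzinfo=timezone.utc)  # southern-hemisphere summer
JULY = datetime(2024, 7, 15, tzinfo=timezone.utc)     # southern-hemisphere winter


def utc_offset_hours(zone_name: str, when_utc: datetime) -> float:
    """Return the UTC offset, in hours, that zone_name observes at when_utc."""
    local = when_utc.astimezone(ZoneInfo(zone_name))
    return local.utcoffset().total_seconds() / 3600


def compare_sydney_brisbane(when_utc: datetime) -> dict:
    """Map each of the two zone names to its offset at the given instant."""
    zones = ("Australia/Sydney", "Australia/Brisbane")
    return {zone: utc_offset_hours(zone, when_utc) for zone in zones}
```

With a current tz database, compare_sydney_brisbane(JANUARY) should report 11.0 for Sydney and 10.0 for Brisbane, while compare_sydney_brisbane(JULY) reports 10.0 for both. A stored offset alone cannot tell the two zones apart in July, which is why the name has to be stored.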

Solution 2

In an ideal world you could have a foreign key to a set of known timezones. You can do something close to this with views and domains.

This wiki tip by David E. Wheeler creates a domain that is tested for its validity as a timezone:

CREATE OR REPLACE FUNCTION is_timezone( tz TEXT ) RETURNS BOOLEAN as $$
BEGIN
 PERFORM now() AT TIME ZONE tz;
 RETURN TRUE;
EXCEPTION WHEN invalid_parameter_value THEN
 RETURN FALSE;
END;
$$ language plpgsql STABLE;

-- CITEXT comes from the citext extension (CREATE EXTENSION citext).
CREATE DOMAIN timezone AS CITEXT
CHECK ( is_timezone( value ) );

It's useful to have a list of known timezones, in which case you could dispense with the domain and just enforce the constraint in the one table containing the known timezone names (obtained from the view pg_timezone_names), avoiding the need to expose the domain elsewhere:

CREATE TABLE tzone
(
  tzone_name text PRIMARY KEY CHECK (is_timezone(tzone_name))
);

INSERT INTO tzone (tzone_name)
SELECT name FROM pg_timezone_names;

Then you can enforce correctness through foreign keys:

CREATE TABLE myTable (
...
tzone TEXT REFERENCES tzone(tzone_name)
);

Solution 3

"+hh:mm" and "-hh:mm" are not time zones, they are UTC offsets. A good format for storing them is a signed integer holding the offset in minutes. You can also use something like interval, but that only helps if you want to do date calculations directly in PostgreSQL, for example in a query. Usually, though, you do these calculations in another language, and then it depends on whether that language supports the interval type well and has a good date/time library. Converting an integer into some sort of interval-like type, like Python's timedelta, should be trivial, so I would personally just store it as an integer.

Time zones have names, and although there are no standardized names for the time zones, there is one de facto standard in the "tz" or "zoneinfo" database, which uses names like "Europe/Paris", "America/New_York" or "US/Pacific". Those should be stored as strings.

Windows uses completely different names, such as "Romance time" (don't ask). You can store those as strings too, but I would avoid it: these names aren't used outside Windows, and the names make no sense. Besides, translated versions of Windows tend to use translated names for these timezones, making it even worse.

Abbreviations like "PDT" and "EST" are not usable as time zone names, because they are not unique. There are four (I think, or was it five?) different time zones all called "CST", so that's not usable.

In short: For time zones, store the name as a string. For UTC offsets, store the offset in minutes as a signed integer.
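
To make the integer-minutes suggestion concrete, here is a small Python sketch that converts between the "+hh:mm" text form, a signed integer of minutes, and the standard library's timedelta and fixed-offset timezone types. The function names are hypothetical, and the accepted hour range (00 to 23) is an assumption chosen so the value always fits datetime.timezone.

```python
import re
from datetime import timedelta, timezone

# Strict "+hh:mm" / "-hh:mm" pattern; hours 00-23, minutes 00-59.
_OFFSET_RE = re.compile(r"([+-])([01][0-9]|2[0-3]):([0-5][0-9])")


def parse_offset(text: str) -> int:
    """Convert '+hh:mm' or '-hh:mm' to a signed offset in minutes."""
    match = _OFFSET_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a UTC offset: {text!r}")
    sign, hours, minutes = match.groups()
    total = int(hours) * 60 + int(minutes)
    return -total if sign == "-" else total


def format_offset(minutes: int) -> str:
    """Convert a signed offset in minutes back to '+hh:mm' or '-hh:mm'."""
    sign = "-" if minutes < 0 else "+"
    hours, mins = divmod(abs(minutes), 60)
    return f"{sign}{hours:02d}:{mins:02d}"


def as_timedelta(minutes: int) -> timedelta:
    """The stored integer as a timedelta, for date arithmetic."""
    return timedelta(minutes=minutes)


def as_tzinfo(minutes: int) -> timezone:
    """The stored integer as a fixed-offset tzinfo (no DST rules attached)."""
    return timezone(timedelta(minutes=minutes))
```

For example, parse_offset("-03:30") gives -210 and format_offset(570) gives "+09:30". Note that as_tzinfo returns a fixed offset only; it knows nothing about DST, which is exactly the limitation of storing offsets rather than zone names.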

Solution 4

In PostgreSQL, you can already convert any TIMESTAMP or TIMESTAMPTZ to or from a named timezone with AT TIME ZONE, so you don't need to look up values from a table. You can use this expression directly in a check constraint, so you don't need to create a function for this either:

CREATE TABLE locations (
    location_id SERIAL PRIMARY KEY,
    name TEXT,
    timezone TEXT NOT NULL CHECK (now() AT TIME ZONE timezone IS NOT NULL)
);

If you try to insert a value that does not contain a valid timezone, you'll get an error that is actually rather user friendly:

INSERT INTO locations (name, timezone) VALUES ('foo', 'Adelaide/Australia');
ERROR:  time zone "Adelaide/Australia" not recognized

Depending upon your requirements, you might need the error to be in the format that a normal constraint violation would give you. In many cases, however, this will suffice.

If you are using a web framework that provides you with a list of timezones that you can have in a dropdown box, then this validation should be sufficient, and then your check constraint is just a backup.
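
If your framework does not provide such a list, the application-side half of this can be sketched in Python roughly as follows. This assumes Python 3.9+ with the standard zoneinfo module and an installed tz database; is_valid_timezone and timezone_choices are hypothetical helper names, and the set of names Python knows may differ slightly from what your PostgreSQL server reports in pg_timezone_names.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError, available_timezones


def is_valid_timezone(name: str) -> bool:
    """Return True if name can be loaded as a tz database zone."""
    try:
        ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError, OSError):
        # Unknown key, malformed key, or a key that maps to a directory.
        return False
    return True


def timezone_choices() -> list:
    """Sorted zone names, suitable for filling a dropdown box."""
    return sorted(available_timezones())
```

Here is_valid_timezone("Australia/Adelaide") should be True and is_valid_timezone("Adelaide/Australia") False, mirroring the INSERT example above. The check constraint in the database remains the final safeguard.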

Author: jl6

Updated on June 02, 2022

Comments

  • jl6
    jl6 about 2 years

    I'm thinking of simply using a string in the format "+hh:mm" (or "-hh:mm"). Is this both necessary and sufficient?

    Note: I don't need to store the date or the time, just the timezone.