Can I use VARCHAR as the PRIMARY KEY?

126,486

Solution 1

Of course you can, in the sense that your RDBMS will let you do it. Whether you should do it is a different question, though: in most situations, values that have a meaning outside your database system should not be chosen as the primary key.

If you know that the value is unique in the system that you are modeling, it is appropriate to add a unique index or a unique constraint to your table. However, your primary key should generally be some "meaningless" value, such as an auto-incremented number or a GUID.

The rationale for this is simple: data entry errors and infrequent changes to things that appear non-changeable do happen. They become much harder to fix when the values are used as primary keys, because every row that references the key has to change along with it.
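To make the trade-off concrete, here is a minimal sketch, assuming SQLite (through Python's standard `sqlite3` module) as a stand-in for whatever RDBMS you use. The `coupon` and `redemption` tables and the helper names are hypothetical and exist only for illustration. The coupon code stays unique through a constraint, while other tables reference the meaningless `id`, so correcting a mistyped code touches a single row.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a surrogate key plus a unique natural key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE coupon (
            id   INTEGER PRIMARY KEY,          -- meaningless surrogate key
            code VARCHAR(32) NOT NULL UNIQUE   -- meaningful value, still kept unique
        );
        CREATE TABLE redemption (
            id        INTEGER PRIMARY KEY,
            coupon_id INTEGER NOT NULL REFERENCES coupon(id)
        );
        -- 'SUMMER2O' ends in the letter O: a deliberate data entry error.
        INSERT INTO coupon (id, code) VALUES (1, 'SUMMER2O');
        INSERT INTO redemption (coupon_id) VALUES (1);
        """
    )
    return conn


def fix_code(conn, coupon_id, correct_code):
    """Correct the natural key; redemption rows are untouched because they reference id."""
    with conn:
        conn.execute(
            "UPDATE coupon SET code = ? WHERE id = ?", (correct_code, coupon_id)
        )


def redeemed_codes(conn):
    """Return the coupon code attached to each redemption, oldest first."""
    rows = conn.execute(
        "SELECT c.code FROM redemption r "
        "JOIN coupon c ON c.id = r.coupon_id ORDER BY r.id"
    ).fetchall()
    return [code for (code,) in rows]
```

Calling `fix_code(conn, 1, 'SUMMER20')` changes one row in `coupon`, and `redeemed_codes(conn)` then reports the corrected code without any update to `redemption`. Had `code` been the primary key, the same correction would also have had to reach every referencing row.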

Solution 2

A blanket "no you shouldn't" is terrible advice. This is perfectly reasonable in many situations, depending on your use case, workload, data entropy, hardware, etc. What you shouldn't do is make assumptions.

It should be noted that you can specify an index prefix, which limits how many leading characters of the column MySQL indexes, thereby giving you some help in narrowing down the results before scanning the rest. This may, however, become less useful over time as your prefix "fills up" and becomes less unique.

It's very simple to do, e.g.:

CREATE TABLE IF NOT EXISTS `foo` (
  `id` varchar(128),
  -- index only the first 4 characters of `id`
  PRIMARY KEY (`id`(4))
);

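Also note that the prefix length (4) appears after the column's closing backtick. The 4 means that the index uses only the first 4 of the 128 possible characters that can exist in the id. Be aware that a primary key is a unique index, and MySQL requires the values in a unique index to be unique within the prefix length, so two ids that share their first 4 characters cannot both be stored in this table.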

Lastly, you should read how index prefixes work and their limitations before using them: https://dev.mysql.com/doc/refman/8.0/en/create-index.html
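One way to see a prefix "filling up" is to measure how many distinct prefixes a set of ids produces. The sketch below is an illustration only: it uses SQLite through Python's `sqlite3` module, which has no index prefixes, so it computes the ratio with `substr` instead (in MySQL, `LEFT(id, n)` would play the same role). The function name `prefix_selectivity` and the sample ids are hypothetical.

```python
import sqlite3


def prefix_selectivity(ids, prefix_len):
    """Return (distinct prefixes) / (rows); 1.0 means every prefix is unique.

    `ids` must contain distinct strings, because the column is a primary key.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE foo (id VARCHAR(128) NOT NULL PRIMARY KEY)")
    conn.executemany("INSERT INTO foo (id) VALUES (?)", [(i,) for i in ids])
    distinct, total = conn.execute(
        "SELECT COUNT(DISTINCT substr(id, 1, ?)), COUNT(*) FROM foo",
        (prefix_len,),
    ).fetchone()
    conn.close()
    # An empty table has no meaningful ratio; report 0.0 rather than divide by zero.
    return distinct / total if total else 0.0


sample_ids = ["user-0001", "user-0002", "item-0001", "item-0002"]
short_prefix = prefix_selectivity(sample_ids, 4)  # prefixes: "user", "item"
full_length = prefix_selectivity(sample_ids, 9)   # whole id used as prefix
```

With these four sample ids, a 4-character prefix yields two distinct values for four rows, while the full 9 characters distinguish every row. A ratio that drifts toward zero as the table grows suggests the chosen prefix length no longer narrows the search much.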

Solution 3

It depends on the specific use case.

If your table is static and holds only a short list of values (with just a small chance that this will change during the lifetime of the database), I would recommend this construction:

CREATE TABLE Foo
(
    FooCode VARCHAR(16) NOT NULL PRIMARY KEY, -- short code or shortcut, but with some meaning.
    Name NVARCHAR(128), -- full name of entity, can be used as fallback in case when your localization for some language doesn't exist
    LocalizationCode AS ('Foo.' + FooCode) -- This could be a code for your localization table...
);

Of course, when your table is not static at all, using an INT as the primary key is the best solution.
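As a rough illustration of the static-table case, the sketch below keys a small lookup table by its short code and lets another table reference that code directly. It assumes SQLite via Python's `sqlite3` module rather than the SQL Server dialect used above, so the computed column is replaced by a `'Foo.' || foo_code` expression in a query. The `item` table, the sample codes, and the helper names are hypothetical.

```python
import sqlite3


def build_lookup_demo():
    """Create a static lookup table keyed by a short VARCHAR code."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    conn.executescript(
        """
        CREATE TABLE foo (
            foo_code VARCHAR(16) NOT NULL PRIMARY KEY,  -- short, meaningful code
            name     VARCHAR(128) NOT NULL              -- full name, usable as a fallback
        );
        CREATE TABLE item (
            id       INTEGER PRIMARY KEY,
            foo_code VARCHAR(16) NOT NULL REFERENCES foo(foo_code)
        );
        INSERT INTO foo (foo_code, name) VALUES ('NEW', 'Newly created');
        INSERT INTO foo (foo_code, name) VALUES ('DONE', 'Completed');
        INSERT INTO item (foo_code) VALUES ('NEW');
        """
    )
    return conn


def localization_codes(conn):
    """Derive the localization key for each code, sorted by code."""
    rows = conn.execute(
        "SELECT 'Foo.' || foo_code FROM foo ORDER BY foo_code"
    ).fetchall()
    return [key for (key,) in rows]


def item_codes(conn):
    """Read each item's code straight from the item table, with no join."""
    return [code for (code,) in conn.execute("SELECT foo_code FROM item ORDER BY id")]
```

Because the key itself is readable, `item_codes(conn)` can show what each row refers to without joining back to `foo`, and the foreign key still rejects codes that are not in the lookup table. That convenience is only worth it while the list of codes really does stay short and stable.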

Author by

Admin

Updated on September 12, 2020

Comments

  • Admin
    Admin almost 4 years

    I have a table for storing coupons/discounts, and I want to use the coupon_code column as the primary key, which is a VARCHAR.

    My rationale is that each coupon will have a unique code, and the only commands I will be running are SELECT ... FROM ... WHERE coupon_code='..'

    I won't be doing any joins or indexing, and I don't see there ever being more than a few hundred entries in this table.

    It seems to me that this will be OK, but I don't know if there is anything I'm missing/not thinking about.

  • terary
    terary about 8 years
    I don't know. I think you should not change the way you uniquely identify your data. Suppose you uniquely identify employees by SSN (big no no), will you change an employee's ssn?
  • Sergey Kalinichenko
    Sergey Kalinichenko about 8 years
    @terary There is a difference between how you uniquely identify data as a user vs. how the database uniquely identifies the data. It is perfectly fine to have SSN in a unique field that is not a primary key. This would let you change SSN after it has been entered into the system, for example, because you discover a data entry error.
  • terary
    terary about 8 years
    @dasblinkenlight I see your point but I am not sure I agree. Suppose there is a data entry error: the DE clerk transposes the SSN. The error goes undetected for several months, paperwork is submitted to the government and others, and the error is then detected. Should the value be changed? In this case using the primary 'constraints' is safe. The original record should remain intact, notes made, and a new user created, thus creating an accurate logical paper trail.
  • Sergey Kalinichenko
    Sergey Kalinichenko about 8 years
    @terary There are multiple ways of dealing with this issue when tracking changes is important. For example, one can design audit trail capabilities with a temporal database pattern. Of course managing it through the process is a perfectly valid approach as well.
  • jchook
    jchook about 5 years
    Note that NDB cluster does not support index prefixes dev.mysql.com/doc/refman/8.0/en/…
  • KathyA.
    KathyA. over 4 years
    This answer makes WAY too many assumptions. There's no way this should be such a broad recommendation.
  • Jiulin Teng
    Jiulin Teng over 3 years
    The opinion here is misguided. I'd argue that in many situations it's advantageous to have a meaningful primary key or composite primary key.
  • Zero
    Zero over 3 years
    @terary So an error goes undetected for months and then is detected, and you would keep the original record intact, so that the person whose actual SSN it is now either cannot register their own SSN or has another person's information, data, or history linked to their SSN? Maybe you are thinking of keeping deleted records, but the approach you describe here is not thought out at all.
  • html_programmer
    html_programmer almost 3 years
    Hm yeah I'm also not sure I agree. Use of natural key vs surrogate key would appear to be very circumstantial, not a one size fits all.