Are varchar(20) and varchar(50) the same?

Solution 1

MySQL offers a choice of storage engines. The physical storage of data depends on the storage engine.

MyISAM Storage of VARCHAR

In MyISAM, VARCHARs typically occupy just the actual length of the string plus a byte or two of length. This is made practical by the design limitation of MyISAM to table locking as opposed to a row locking capability. Performance consequences include a more compact cache profile, but also more complicated (slower) computation of record offsets.

(In fact, MyISAM gives you a degree of choice between fixed and variable physical row size table formats, depending on the column types occurring in the whole table. The occurrence of a VARCHAR changes only the default method, but the presence of a TEXT blob forces VARCHARs in the same table to use the variable-length method as well.)

The physical storage method is particularly important with indexes, which is a different story than tables. MyISAM uses space compression for both CHAR and VARCHAR columns, meaning that shorter data take up less space in the index in both cases.

InnoDB Storage of VARCHAR

InnoDB, like most other current relational databases, uses a more sophisticated mechanism. VARCHAR columns whose maximum width is less than 768 bytes will be stored inline, with room reserved matching that maximum width. The manual describes the record format more precisely:

For each non-NULL variable-length field, the record header contains the length of the column in one or two bytes. Two bytes will only be needed if part of the column is stored externally in overflow pages or the maximum length exceeds 255 bytes and the actual length exceeds 127 bytes. For an externally stored column, the two-byte length indicates the length of the internally stored part plus the 20-byte pointer to the externally stored part. The internal part is 768 bytes, so the length is 768+20. The 20-byte pointer stores the true length of the column.
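The rule quoted above can be restated as a small decision function. This is only a sketch in Python, not InnoDB code; the function name `length_header_bytes` and its parameters are hypothetical and chosen for illustration.

```python
def length_header_bytes(max_bytes: int, actual_bytes: int,
                        stored_externally: bool = False) -> int:
    """Bytes used in the record header for the length of one non-NULL
    variable-length field, following the rule quoted from the manual."""
    if stored_externally:
        # Part of the column lives in overflow pages.
        return 2
    if max_bytes > 255 and actual_bytes > 127:
        return 2
    return 1


# A 15-byte value needs a one-byte length whether the column allows 20 or 50 bytes.
print(length_header_bytes(20, 15))    # 1
print(length_header_bytes(50, 15))    # 1
# With a maximum above 255 bytes, the actual length decides.
print(length_header_bytes(300, 100))  # 1
print(length_header_bytes(300, 200))  # 2
```

Under this rule, the length header is the same size for the same value in a 20-byte and a 50-byte column.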

InnoDB currently does not do space compression in its indexes, the opposite of MyISAM as described above.

Back to the question

All of the above is, however, just an implementation detail that may even change between versions. The true difference between CHAR and VARCHAR is semantic, and so is the one between VARCHAR(20) and VARCHAR(50). By ensuring that there is no way to store a 30-character string in a VARCHAR(20), the database makes life easier and better defined for the various processors and applications that it supposedly integrates into a predictably behaving solution. This is the big deal.
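The semantic point can be illustrated with a toy model of the length check. This is a sketch in Python, not how MySQL implements it: `check_varchar` is a hypothetical helper, and it assumes strict SQL mode, where an over-long value is rejected rather than truncated. It also ignores special cases such as excess trailing spaces.

```python
def check_varchar(value: str, max_chars: int) -> str:
    """Return value unchanged if it fits in VARCHAR(max_chars), else raise."""
    if len(value) > max_chars:
        raise ValueError(
            f"value has {len(value)} characters, limit is {max_chars}"
        )
    return value


name = "x" * 30  # a 30-character string

# Fits in VARCHAR(50), so it comes back unchanged.
print(check_varchar(name, 50) == name)

# Does not fit in VARCHAR(20), so the model rejects it.
try:
    check_varchar(name, 20)
except ValueError as err:
    print("rejected:", err)
```

The declared length acts as a contract that every consumer of the column can rely on, independently of how the bytes are laid out on disk.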

Regarding personal names specifically, this question may give you some practical guidance. People with full names over 70 UTF-8 characters are in trouble anyway.

Solution 2

Yes, that is indeed the whole point of VARCHAR. It only takes up as much space as the text is long.

If you had CHAR(50), it would take up 50 bytes (or characters) no matter how short the data really is (it would be padded, usually by spaces).

Can Anybody tell me the reason?

Because people thought it was wasteful to store a lot of useless padding, they invented VARCHAR.
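A rough way to picture the difference is to model both types in a few lines of Python. This is a simplified sketch, not the engine's storage code: `char_stored` and `varchar_stored` are hypothetical names, values are assumed to fit the declared length, and the VARCHAR length prefix is left out.

```python
def char_stored(value: str, n: int) -> str:
    """Model of CHAR(n): right-pad with spaces to exactly n characters."""
    return value.ljust(n)


def varchar_stored(value: str, n: int) -> str:
    """Model of VARCHAR(n): keep the value as-is (n is only an upper bound)."""
    return value


value = "Alice"
print(len(char_stored(value, 50)))     # 50
print(len(varchar_stored(value, 20)))  # 5
print(len(varchar_stored(value, 50)))  # 5

# Padding also makes trailing spaces ambiguous in the CHAR model.
print(char_stored("foo", 4) == char_stored("foo ", 4))  # True
```

In this model the declared length of a VARCHAR changes nothing about the stored data, while the declared length of a CHAR determines it completely.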

Solution 3

The manual states:

The CHAR and VARCHAR types are declared with a length that indicates the maximum number of characters you want to store. (...)

In contrast to CHAR, VARCHAR values are stored as a one-byte or two-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.

Notice that VARCHAR(255) is not the same as VARCHAR(256).
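The manual's rule can be turned into a short calculation. The sketch below is in Python and assumes a single-byte character set (latin1); `varchar_bytes` and its parameters are hypothetical names used only for illustration. With a multi-byte character set the 255-byte threshold is reached at a smaller declared length.

```python
def varchar_bytes(value: str, max_chars: int,
                  max_bytes_per_char: int = 1,
                  encoding: str = "latin-1") -> int:
    """Length prefix plus data bytes for one VARCHAR(max_chars) value,
    following the rule quoted from the manual."""
    data = value.encode(encoding)
    # One prefix byte if no value in the column can exceed 255 bytes, else two.
    prefix = 1 if max_chars * max_bytes_per_char <= 255 else 2
    return prefix + len(data)


value = "a" * 12
print(varchar_bytes(value, 20))   # 13
print(varchar_bytes(value, 50))   # 13
print(varchar_bytes(value, 255))  # 13
print(varchar_bytes(value, 256))  # 14
```

The same 12-character value costs the same in VARCHAR(20) and VARCHAR(50), and one byte more once the declared maximum crosses 255 bytes.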

This is theory. As habeebperwad suggests, the actual footprint of one row depends on (engine) page size and (hard disk) block size.

Author: Mohammed H

Updated on July 13, 2022

Comments

  • Mohammed H
    Mohammed H almost 2 years

    I saw comment "If you have 50 million values between 10 and 15 characters in a varchar(20) column, and the same 50 million values in a varchar(50) column, they will take up exactly the same space. That's the whole point of varchar, as opposed to char.". Can Anybody tell me the reason? See What is a reasonable length limit on person "Name" fields?

  • Admin
    Admin almost 12 years
    It's actually a little more complicated than "useless padding": how to tell 'foo' from 'foo ' in a CHAR(4)?
  • hansvb
    hansvb almost 12 years
    True. Sort of. That may be important for some people. I always get a lot of downvotes when I bring this up (usually in the context of Oracle's decision to treat empty strings as NULL), but I question the application design that needs to differentiate between 'foo' and 'foo ' (as you can see from this comment thread, quotes can be a possible solution here, too, or you could pad with something else that is not otherwise used).
  • hansvb
    hansvb almost 12 years
    To bring up a positive about CHAR: it allows for fixed-length records. May be important for some special-purpose applications.
  • rabudde
    rabudde almost 12 years
    IMO the poster wants to know what the difference is between varchar(20) and varchar(50), not "why was varchar invented?"
  • Mohammed H
    Mohammed H almost 12 years
    @rabudde I thought there may be some data block in MySQL for varchar. That means MySQL may allocate 32 bytes for every varchar shorter than 32 bytes, 64 bytes for every varchar shorter than 64 bytes, etc. If so, varchar(33) and varchar(63) would be the same.