Why is 1 Byte equal to 8 Bits?


It's been a minute since I took computer organization, but the relevant Wikipedia article on 'Byte' gives some context.

The byte was originally the smallest number of bits that could hold a single character (I assume standard ASCII). We still use the ASCII standard, so 8 bits per character is still relevant. This sentence, for instance, is 41 bytes. That's easily countable and practical for our purposes.
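
As a quick sanity check, here is a minimal Python sketch using the sentence above (it assumes the text is plain ASCII):

```python
sentence = "This sentence, for instance, is 41 bytes."
data = sentence.encode("ascii")   # every character here is plain ASCII
print(len(sentence))              # 41 characters
print(len(data))                  # 41 bytes -- one byte per ASCII character
```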

If we had only 4 bits, there would be only 16 (2^4) possible characters, unless we used 2 bytes to represent a single character, which is computationally less efficient. If a byte were 16 bits, we would allow 65,536 (2^16) possible characters, but there would be a whole lot more 'dead space' in our instruction set, and computers would run less efficiently when performing byte-level instructions, especially since our character set is much smaller.
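
To make the counting concrete, here is a small sketch of the value ranges involved (my own illustration, not part of the original answer):

```python
# How many distinct values fit in a given number of bits.
for width in (4, 7, 8, 16):
    print(f"{width:2d} bits -> {2 ** width:5d} distinct values")
#  4 bits ->    16   (a nibble)
#  7 bits ->   128   (classic ASCII)
#  8 bits ->   256   (one byte)
# 16 bits -> 65536
```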

Additionally, a byte can be split into 2 nibbles. Each nibble is 4 bits, which is the smallest number of bits that can encode any decimal digit from 0 to 9 (10 different digits).
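
For example, splitting a byte into its two nibbles (the classic packed-BCD idea of storing two decimal digits in one byte) might look like this sketch:

```python
value = 0b1001_0111            # one byte: high nibble 9, low nibble 7
high = (value >> 4) & 0xF      # shift the high nibble down
low = value & 0xF              # mask off the low nibble
print(high, low)               # 9 7 -- two decimal digits in a single byte
```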


Comments

  • aerin
    aerin almost 2 years

    Why not 4 bits, or 16 bits?

    I assume there are some hardware-related reasons, and I'd like to know how 8 bits = 1 byte became the standard.

  • Bango
    Bango over 7 years
    Correction, ASCII uses 7 bits.
  • Tom Blodget
    Tom Blodget over 7 years
    Except "this sentence" isn't encoded in ASCII. It's encoded in UTF-8. ASCII has very limited and specialized usages. UTF-8 is an encoding for the Unicode character set. All text in HTML, XML, … is Unicode. See the HTTP response header for this page to see that the web server encoded it in UTF-8. (Hit F12, then F5, then select the request name 42842817.) If you consult the HTTP specification, you'll find that the HTTP headers are in fact ASCII. So we do use ASCII every day but we hardly ever use in new progams.
  • Bango
    Bango over 7 years
    Is that why they call it UTF-8? Because it's Using The Full 8-bit byte? haha
  • Tom Blodget
    Tom Blodget over 7 years
    No. It's called UTF-8 because the code unit is 8 bits. Each code unit provides some of the bits needed for the 21-bit Unicode codepoint. A codepoint requires 1 to 4 UTF-8 code units. Similarly for UTF-16 and UTF-32. However, by design, a codepoint never needs more than one UTF-32 code unit. (See the first sketch after these comments for the 1-to-4 counts.)
  • Pradeep Gollakota
    Pradeep Gollakota about 4 years
    @Tom Blodget You are technically right that the encoding is UTF-8. But that's meaningless in this context because UTF-8 is a superset of ASCII.
  • Farshid Ahmadi
    Farshid Ahmadi over 3 years
    And something else: why not 7 or 9 bits? Because 8 is a power of 2.
  • Amrit Prasad
    Amrit Prasad over 3 years
    ASCII uses 7 bits, so why is 8 bits used instead of 7?
  • Jerry An
    Jerry An almost 3 years
    ASCII is a 7-bit code, representing 128 different characters. When an ASCII character is stored in a byte, the most significant bit is always zero. Sometimes the extra bit is used to indicate that the byte is not an ASCII character but a graphics symbol; however, this is not defined by ASCII.
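
A couple of quick Python sketches to illustrate the points above (the example characters are my own, not from the comments). First, the 1-to-4 UTF-8 code-unit counts Tom Blodget describes:

```python
# Each UTF-8 code unit is 8 bits; a codepoint needs 1 to 4 of them.
for ch in ("A", "é", "€", "😀"):
    units = ch.encode("utf-8")
    print(ch, len(units), [hex(b) for b in units])
# A 1 ['0x41']
# é 2 ['0xc3', '0xa9']
# € 3 ['0xe2', '0x82', '0xac']
# 😀 4 ['0xf0', '0x9f', '0x98', '0x80']
```

And the always-zero most-significant bit that Jerry An mentions for ASCII stored in a byte:

```python
# ASCII values are 0-127, so the top bit of the byte is always 0.
for ch in "Az9~":
    print(ch, format(ord(ch), "08b"))
# A 01000001
# z 01111010
# 9 00111001
# ~ 01111110
```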