Why does a byte only have 0 to 255?


Solution 1

Strictly speaking, the term "byte" can actually refer to a unit with other than 256 values. It's just that 256 is the almost universal size. From Wikipedia:

Historically, a byte was the number of bits used to encode a single character of text in a computer and it is for this reason the basic addressable element in many computer architectures.

The size of the byte has historically been hardware dependent and no definitive standards exist that mandate the size. The de facto standard of eight bits is a convenient power of two permitting the values 0 through 255 for one byte. Many types of applications use variables representable in eight or fewer bits, and processor designers optimize for this common usage. The popularity of major commercial computing architectures have aided in the ubiquitous acceptance of the 8-bit size. The term octet was defined to explicitly denote a sequence of 8 bits because of the ambiguity associated with the term byte.

Ironically, these days the size of "a single character" is no longer considered to be a single byte in most cases... most commonly, the idea of a "character" is associated with Unicode, where characters can be represented in a number of different formats, but are typically either 16 or 32 bits.

It would be amusing for a system which used UCS-4/UTF-32 (the direct 32-bit representation of Unicode) to designate 32 bits as a byte. The confusion caused would be spectacular.
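
To illustrate the character/byte mismatch concretely, here is a throwaway Java snippet (runnable in jshell or inside any main method; the particular code point is just an arbitrary example from outside the Basic Multilingual Plane):

    String s = "\uD834\uDD1E";                            // U+1D11E, MUSICAL SYMBOL G CLEF
    System.out.println(Character.SIZE);                   // 16 -> a Java char is 16 bits
    System.out.println(s.length());                       // 2  -> two 16-bit chars (a UTF-16 surrogate pair)
    System.out.println(s.codePointCount(0, s.length()));  // 1  -> but only one character (code point)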

However, assuming we take "byte" as synonymous with "octet", there are eight independent bits, each of which can be either on or off, true or false, 1 or 0, however you wish to think of it. That leads to 256 possible values, which are typically numbered 0 to 255. (That's not always the case though. For example, the designers of Java unfortunately decided to treat bytes as signed integers in the range -128 to 127.)
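
If it helps to make that concrete, here is a minimal Java sketch (the class name is purely illustrative) showing the 256-value count and how Java's signed byte maps the same bit patterns onto -128 to 127:

    public class ByteRange {
        public static void main(String[] args) {
            // 8 independent bits give 1 << 8 = 256 possible values.
            System.out.println(1 << 8);                 // 256

            // Java's byte is signed: it runs from -128 to 127.
            System.out.println(Byte.MIN_VALUE);         // -128
            System.out.println(Byte.MAX_VALUE);         // 127

            // The same 8-bit pattern can be read back as an unsigned 0..255 value.
            byte b = (byte) 200;                        // wraps to -56 when stored
            System.out.println(b);                      // -56
            System.out.println(b & 0xFF);               // 200
            System.out.println(Byte.toUnsignedInt(b));  // 200 (Java 8 and later)
        }
    }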

Solution 2

Because a byte, by its standard definition, is 8 bits which can represent 256 values (0 through 255).
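
A quick sanity check of that claim, as a throwaway Java snippet (runnable in jshell, or wrapped in any main method):

    System.out.println(Integer.toBinaryString(255)); // "11111111"  -> 255 is the largest value that fits in 8 bits
    System.out.println(Integer.toBinaryString(256)); // "100000000" -> 256 would need a ninth bit
    System.out.println((int) Math.pow(2, 8));        // 256 distinct values in total: 0 through 255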

Solution 3

Byte ≠ Octet

Why does a byte only range from 0 to 255?

It doesn’t.

An octet has 8 bits, thus allowing for 2⁸ possibilities. A byte is ill‐defined. One should not equate the two terms, as they are not completely interchangeable. Also, wicked programming languages that support only signed characters (ʏᴏᴜ ᴋɴᴏw ᴡʜᴏ ʏᴏᴜ ᴀʀᴇ﹗) can only represent the values −128 to 127, not 0 to 255.

Big Iron takes a long time to rust.

Most, but not all, modern machines have 8‑bit bytes, but that is a relatively recent phenomenon. It certainly has not always been that way. Many very early computers had 4‑bit bytes, and 6‑bit bytes were once common even comparatively recently. Both of those types of bytes hold rather fewer than 256 values.

Those 6‑bit bytes could be quite convenient, since with a word size of 36 bits, six such bytes fit cleanly into one of those 36‑bit words without any jiggering. That made it very useful for holding Fieldata, used by the very popular Sperry ᴜɴɪᴠᴀᴄ computers. You can fit only 4 unpacked ᴀsᴄɪɪ characters into a 36‑bit word, versus 6 Fieldata characters. We had an 1100 series at the computing center when I was an undergraduate, but this remains true even with the modern 2200 series.

Enter ASCII

ᴀsᴄɪɪ — which was and is only a 7‑ not an 8‑bit code — paved the way for breaking out of that world. The importance of the ɪʙᴍ 360, which had 8‑bit bytes whether they held ᴀsᴄɪɪ or not, cannot be overstated.

Nevertheless, many machines long supported ᴅᴇᴄ’s Radix‑50. This was a 40‑character repertoire wherein three of its characters could be efficiently packed into a single 16‑bit word under two distinct encoding schemes. I used plenty of ᴅᴇᴄ ᴘᴅᴘ‑11s and Vaxen during my university days, and Rad‑50 was simply a fact of life, a reality that had to be accommodated.
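
A small back-of-the-envelope check of the packing arithmetic in the last few paragraphs, written here as a Java sketch for consistency with the rest of the page (the figures come from the text above, not from any particular machine's manual):

    public class PackingArithmetic {
        public static void main(String[] args) {
            // Six 6-bit Fieldata characters fill a 36-bit word exactly.
            System.out.println(6 * 6);          // 36

            // Unpacked 8-bit characters: only four fit in 36 bits (a fifth would need 40 bits),
            // though five 7-bit ASCII characters do fit, with one bit to spare.
            System.out.println(4 * 8);          // 32
            System.out.println(5 * 7);          // 35

            // Radix-50: a 40-character repertoire packs three characters per 16-bit word,
            // because 40^3 = 64000, which is less than 2^16 = 65536.
            System.out.println(40 * 40 * 40);   // 64000
            System.out.println(1 << 16);        // 65536
        }
    }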

Solution 4

A byte has 8 bits (eight 1s or 0s): 01000111 = 71.

Each bit represents a value (1, 2, 4, 8, 16, 32, 64, 128), read from right to left.

Example:

128  64  32  16   8   4   2   1
  0   1   0   0   0   1   1   1  = 71
  1   1   1   1   1   1   1   1  = 255 (max)
  0   0   0   0   0   0   0   0  = 0   (min)

Using binary 1s and 0s and only 8 bits (1 byte), we can have at most one of each value (1 × 128, 1 × 64, 1 × 32, and so on), which gives a maximum total of 255 and a minimum of 0.
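
The same sum, spelled out as a small Java example (the class and variable names are just for illustration):

    public class PlaceValues {
        public static void main(String[] args) {
            // Place values of the 8 bits, left (128) to right (1).
            int[] placeValues = {128, 64, 32, 16, 8, 4, 2, 1};
            int[] bits        = {  0,  1,  0,  0, 0, 1, 1, 1};  // 01000111

            int total = 0;
            for (int i = 0; i < 8; i++) {
                total += bits[i] * placeValues[i];
            }
            System.out.println(total);       // 64 + 4 + 2 + 1 = 71

            // Binary literals (Java 7 and later) say the same thing directly:
            System.out.println(0b01000111);  // 71
            System.out.println(0b11111111);  // 255, the maximum for 8 bits
            System.out.println(0b00000000);  // 0, the minimum
        }
    }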

Solution 5

You are wrong! A byte ranges from 0 to 63 or from 0 to 99!

Do you believe in God? God said in the Holy Bible.

The basic unit of information is a byte. Each byte contains an unspecified amount of information, but it must be capable of holding at least 64 distinct values. That is, we know that any number between 0 and 63, inclusive, can be contained in one byte. Furthermore, each byte contains at most 100 distinct values. On a binary computer a byte must therefore be composed of six bits; on a decimal computer we have two digits per byte.* - The Art of Computer Programming, Volume 1, written by Donald Knuth.

And...

* Since 1975 or so, the word "byte" has come to mean a sequence of precisely eight binary digits, capable of representing the numbers 0 to 255. Real-world bytes are therefore larger than the bytes of the hypothetical MIX machine; indeed, MIX's old-style bytes are just barely bigger than nybbles. When we speak of bytes in connection with MIX we shall confine ourselves to the former sense of the word, harking back to the days when bytes were not yet standardized. - The Art of Computer Programming, Volume 1, written by Donald Knuth.
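
For what it's worth, the arithmetic behind Knuth's figures, as a throwaway Java check (runnable in jshell, or wrapped in any main method):

    System.out.println(1 << 6);   // 64  -> a 6-bit binary MIX byte holds the values 0 through 63
    System.out.println(10 * 10);  // 100 -> two decimal digits hold the values 0 through 99
    System.out.println(1 << 8);   // 256 -> the post-1975, 8-bit "real-world" byte from the footnote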

:-)


Comments

  • Strawberry
    Strawberry over 3 years

    Why does a byte only range from 0 to 255?

  • Pablo Santa Cruz
    Pablo Santa Cruz over 13 years
    Oh! Wait. Jon Skeet is here. Maybe not. :-)
  • dan04
    dan04 over 13 years
    Too bad C chose to use char for the byte type, which now means that a char is not a character.
  • tchrist
    tchrist over 13 years
    @Jon: I should hardly say that Unicode (a 21‐bit character set) is typically represented by 16 or 32 bits! That’s an extremely Java/Microsoft‐centric point of view! First of all, nothing but stupid old UCS‐2 is only 16 bits. And while it is true that UTF‐16 serializes to either 16 or 32 bits, far and beyond the most common encoding scheme bar none for Unicode text is certainly UTF‐8. Anyone thinking about the ‘size of a character’ has stopped thinking about abstract size‐independent characters, and that is a perilous path at best.
  • tchrist
    tchrist over 13 years
    Also, it’s not “a unit with more than 256 values”, but rather one with other than 256 values. That’s because there were (and sometimes still are) a lot of machines whose bytes held fewer than 8 bits, not more.
  • tchrist
    tchrist over 13 years
    @dan04: That’s no worse than Java, whose char and even Character data type cannot hold a character. That’s because they screwed up the notion of an abstract character, confusing high-level characters with low-level serialization schemes. Then to add insult to injury, Java also cursed people with the ugliest of all possible serialized representations to be forever conscious of or be plagued by error. It’s a real mess!
  • dan04
    dan04 over 13 years
    You can fit 5 ASCII characters into a 36-bit word, if you can figure out what to do with the 1 bit left over.
  • tchrist
    tchrist over 13 years
    @Dan04: Nice. :) I meant without packing, but of course you are correct. It really sucks on machines where you take a hit on non-word-addressable stuff.
  • dan04
    dan04 over 13 years
    And while your answer is technically correct, nowadays 6-bit and 9-bit bytes are used more for "Byte ≠ Octet" pedantry than they are for actual programming.
  • tchrist
    tchrist over 13 years
    @dan04: Are you saying that technical correctness should count for nothing in a technical forum? What would you prefer in its stead?
  • dan04
    dan04 over 13 years
    "The $LANGUAGE standard doesn't precisely define $TERM, but nearly all implementations use $DE_FACTO_STANDARD, which you can safely assume unless you're writing for $OBSCURE_PLATFORM."
  • Andrew Shepherd
    Andrew Shepherd over 13 years
    I've previously gotten downvoted for making that assumption: stackoverflow.com/questions/2069488/multiply-without-operator/…
  • Jon Skeet
    Jon Skeet over 13 years
    @tchrist: Changed "more" to "other". As for the UTF-16/UTF-32/UTF-8 issue, my point is that even UTF-8 is still representing up to a 32-bit (31-bit?) number (albeit only 21 bits being used at the moment). I was thinking of Unicode as a coded character set (initially 16-bit, now 21 or 31 bit) rather than in terms of a character encoding form.
  • subrat71
    subrat71 over 13 years
    The Common Lisp language has functions called ldb and dpb, with the Hyperspec documentation attributing the names for both to the PDP-10 assembly language: lispworks.com/documentation/HyperSpec/Body/f_dpb.htm.
  • Jörg W Mittag
    Jörg W Mittag over 13 years
    @dan04: I wouldn't be surprised if you had a processor with non-8-bit bytes in your pocket right now. The DSPs that are used in the radio portions of modern mobile phones are often pretty weird.
  • Jörg W Mittag
    Jörg W Mittag over 13 years
    @tchrist: Indeed. As of Java 5, int is the new char. There was recently an extensive discussion about this on one of the Scala mailing lists, where someone complained about the fact that Scala's String is identical to Java's String, and thus still maintains all those mistakes in a language that was specifically designed as a "better Java". The whole thread is a great read, even if you don't care about Scala and/or Java.
  • Jörg W Mittag
    Jörg W Mittag over 13 years
    DSPs still often have variable-length bytes, I think. Or even no bytes at all. (If you interpret "byte" as "the smallest efficiently individually addressable chunk of memory". There are DSPs which can address anything from a single bit to an entire word with the same performance, no mis-alignment penalties. Arguably, there is no such thing as a "byte" on such CPUs.)
  • John Saunders
    John Saunders over 13 years
    @Jorg: by that definition, there was no such thing as a byte on the PDP-10 CPUs. Only 36-bit words were addressable. The byte pointer held a bit width, bit offset from start of word, and the address of the word.
  • Admin
    Admin over 13 years
    Personally, the way I see it is that a language isn't defined by an authority - it's defined by "the masses". If the majority of people say that a byte has eight bits, a byte has 8 bits. If the majority of people say that a hacker is someone who bypasses computer security systems, that's what a hacker is. Gay doesn't mean happy, etc etc etc. Language moves on, whatever the "thou shalt speak according to my rulebook" types say.
  • tchrist
    tchrist over 13 years
    @Steve314: Ever served on a standards committee? Understand the strict requirements of normative language in designing technical specifications?
  • Admin
    Admin over 13 years
    @tchrist - the whole of programming is not in your standards committee. And a standards committee can simply say "for the purposes of this standard, a byte is <whatever> bits". Many definitions in many standards differ from those in wider use, by being more specialised, or often due to the simple fact that the committee ran out of more appropriate words to use.
  • tchrist
    tchrist over 13 years
    @Steve314: I see that the answer to my questions is therefore “no”. If you have some credentials to present that you believe somehow give you the right to talk down to me about the English language, I’d certainly like to see those. I have served on standards committees, and have several books published under my own name. My current work revolves around natural language processing and computational linguistics, which are of course of the descriptive variety. So don’t teach your grandma to suck eggs.
  • tchrist
    tchrist over 13 years
    @Jörg: If you could please post a link to that discussion thread, or simply mail it to me, I’d indeed be interested in reading through it. Thanks.
  • Jörg W Mittag
    Jörg W Mittag over 13 years
    @tchrist: Support for Ropes in Scala. Watch out for posts by a guy named Jim Balter, he's the one with the strongest opinions but also the most knowledge.
  • Kent Fredric
    Kent Fredric over 13 years
    Imo, this answer is better than Skeet's; it's more succinct and to the point.
  • tchrist
    tchrist over 13 years
    Jörg: Thanks very much; it was a good read! Jim Balter’s saying stuff I’ve been ranting on for a good while. Java’s terrible Unicode support, especially but not only in its seriously deficient regexes, has made me abandon Java for text processing. Too many weeks lost debugging buggy internals that aren’t my fault. I agree that Python has messed this up, too. I’ve gone back to Perl, which has a clean abstract character model with excellent UCD integration: names, graphemes, properties, normalization, collation, etc. Your mailing list seems universally unaware of how easy this stuff is in Perl.
  • NotMe
    NotMe over 13 years
    @Andrew: Such is the fickleness of SO.
  • Jim Balter
    Jim Balter over 13 years
    @Jörg "where someone complained about the fact that Scala's String is identical to Java's String" -- that was not the complaint, although I agree that it's a great read. :-) Scala is actually considerably worse in re Unicode than Java, because Java now has a Unicode API that allows correct processing, as horrible as it is to use ... whereas Scala's functional/iterable approach to Strings guarantees that all String handling that deals with individual characters is broken. "strongest opinions but also the most knowledge" -- if you know 1+1=2, your "opinion" that 1+1=2 will be pretty strong. :-)
  • Jim Balter
    Jim Balter over 13 years
    @tchrist Thank you for your support. :-)
  • Jim Balter
    Jim Balter over 13 years
    Could you name a computer that had a 7 bit byte? I don't believe there were any, and I think this is a confused claim.
  • Jim Balter
    Jim Balter over 13 years
    Knuth's first statement applies only to MIX machine bytes -- a MIX machine can be implemented on either a binary computer, in which case a byte holds 0 to 63, or a decimal computer, in which case a byte holds 0 to 99. His footnote makes clear that the term "byte" is not generally limited to that, so it is your statement that is wrong.
  • tchrist
    tchrist over 13 years
    @Jim: No problem. There aren’t enough of us who recognize, understand, and appreciate the deep troubles that derive from fixating on physical bytewise encodings instead of logical integer code points, and UTF‑16 just makes all these worse. I’ve been working with the JDK7 folks on squaring up the j.u.regex stuff here in this thread. Send me mail if you’re interested in discussing this. Oh drat! Looks like Groovy has the same bug as Scala WRT “characters ≠ characters”.
  • Jim Balter
    Jim Balter over 13 years
    @RedPain I obviously did read your second quotation, since that's "His footnote". The point is that your first quote is ripped out of context -- it refers to MIX bytes, not bytes generally. Knuth isn't so silly as to make the claim that "A byte ranges from 0 to 63 or from 0 to 99!" as you did. The fact is that your first quote appears under "Description of MIX", which isn't what the OP asked about, so your answer is wrong, like I said.
  • RedPain
    RedPain over 13 years
    @Jim Balter: Oh, I did read my second quotation, too. The point is that you have a dull sense of humor.
  • Jason Williams
    Jason Williams over 13 years
    It depends whether you define "byte" in the sense of "a character" (which is how it originally came about: en.wikipedia.org/wiki/Byte), or in terms of "the native size of a CPU register", which is perhaps what you're thinking of. ASCII is a very common standard and it is based on a 7-bit "byte" (in the former sense). Many computers supported ASCII, hence many computers used/supported 7-bit "bytes". Before ASCII, punched card characters were usually represented as 6-bit bytes.
  • Jalal
    Jalal about 13 years
    @RedPain: I don't believe in God! :P
  • Octopus
    Octopus over 7 years
    not sure why you're on a tangent about character sets. completely irrelevant.
  • Octopus
    Octopus over 7 years
    It's commonly accepted that a byte is 8 bits, but it most certainly is not the "standard definition". Architectures have existed where bytes are otherwise.