How do I shorten a base64 string?


A 100-character base64 string contains 600 bits of information: each base64 character carries 6 bits, so 100 characters are needed to represent your 75 bytes of data. Base64 is encoded in US-ASCII (by definition) and is described in RFC 4648. To represent the same data in 20 characters you would need 30 bits per character (600/20).
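
To make that arithmetic concrete, here is a small Python sketch (the 75-byte payload is an arbitrary illustrative value, not data from the question):

    import base64

    data = bytes(range(75))                      # hypothetical 75-byte payload
    encoded = base64.b64encode(data).decode("ascii")

    print(len(encoded))           # 100 base64 characters
    print(len(data) * 8)          # 600 bits of actual information
    print((len(data) * 8) // 20)  # 30 bits per character to fit it into 20 characters
    print(2 ** 30)                # ...i.e. an alphabet of over a billion distinct symbols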

In a contrived fashion, using a very large Unicode mapping, it would be possible to render the data as unified CJK glyphs, but it would still require a minimum of about 40 glyphs (~75 bytes) to represent the data. Such an encoding would also be really difficult to debug and really prone to misinterpretation. Further, the purpose of base64 encoding is to present a representation that is not destroyed by broken intermediate systems; that would very likely not work with anything as obscure as a base2Billion encoding.
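
Purely for illustration, a hedged sketch of such a contrived dense mapping (the two-byte chunking and the 0x4E00 offset are arbitrary choices, not any real encoding):

    import base64

    data = bytes(range(75))                    # the same hypothetical 600-bit payload as above
    assert len(base64.b64encode(data)) == 100  # base64 needs 100 characters for it

    # Pack 16 bits of payload into each code point, offset into the CJK range
    # so the result renders as (mostly) CJK-style glyphs.
    glyphs = []
    for i in range(0, len(data), 2):
        chunk = int.from_bytes(data[i:i + 2].ljust(2, b"\0"), "big")
        glyphs.append(chr(0x4E00 + chunk))
    text = "".join(glyphs)

    print(len(text))                   # ~38 visible characters instead of 100
    print(len(text.encode("utf-8")))   # ...but more bytes on the wire than the original 75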

Comments

  • Admin (almost 2 years ago)

    What is the easiest way to shorten a base64 string? e.g.

    PHJkZjpEZXNjcmlwdGlvbiByZGY6YWJvdXQ9IiIKICAgICAgICAgICAgeG1sbnM6eG1wPSJodHRwOi8v
    

    I just learned how to convert binary to base64. If I'm correct, groups of 24 bits are made and groups of 6 bits are used to create the 64 characters A-Z a-z 0-9 +/

    I was wondering: is it possible to further shrink a base64 string and make it smaller? I was hoping to reduce a 100-character base64 string to 20 or fewer characters.

  • Syon (over 10 years ago)
    "present a representation that is not destroyed by broken intermediate systems": this is incorrect. It has nothing to do with intermediate systems being broken, and everything to do with said systems not being designed to accept binary data in the first place. Further, the "30 bits per character" and "large Unicode mapping" stuff is just confusing the issue. A system can't accept data in the encoding you have (binary), so you have to give it some encoding it can accept.
  • Pekka (over 10 years ago)
    Much software exhibits faults, and increasing the complexity of a transfer encoding simply exposes more of those faults. I was simply observing that while introducing a denser mapping requires fewer 'characters', the actual amount of data is no better, and the added complexity of a fringe encoding will likely swamp any other benefits.