How to encode and decode data in base64 and base64URL by using unix commands?


Solution 1

This is the same suggestion as @jps's, but shorter. Remember that echo appends a newline by default, so add -n when you pipe the text into base64; otherwise the newline is encoded as well.

echo -n "Some_data_to_be_converted" | base64 | tr '/+' '_-' | tr -d '='
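
To see why the trailing newline matters, and to wrap the pipeline for reuse, here is a minimal sketch. The function name b64url_encode is hypothetical. It uses printf '%s' in place of echo -n because printf behaves the same in every POSIX shell, and it also deletes '\n' on the assumption that your base64 may wrap long output (GNU base64 wraps at 76 characters by default).

```shell
# A trailing newline changes the encoded result:
printf 'test\n' | base64   # dGVzdAo=
printf 'test'   | base64   # dGVzdA==

# Hypothetical helper: base64url-encode the first argument, without padding.
b64url_encode() {
  printf '%s' "$1" | base64 | tr '/+' '_-' | tr -d '=\n'
}

b64url_encode 'test'; echo     # dGVzdA
b64url_encode 'xs?>>>'; echo   # eHM_Pj4-
```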

Decoding it back with built-in bash tools is more complicated, as I didn't find an easy way to pad the string back with '=' so that its length is divisible by 4. It can probably be done with awk, but I didn't dig deep enough. If you have Ruby installed locally, it becomes trivial:

2.6.2 > require 'base64'
2.6.2 > Base64.urlsafe_encode64('test', padding: false)
 => "dGVzdA"
2.6.2 > Base64.urlsafe_decode64('dGVzdA')
 => "test"
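
For a shell-only route, the padding can also be restored with plain shell arithmetic on the string length. This is only a sketch: the function name b64url_decode is hypothetical, and it assumes a base64 that decodes with -d (GNU coreutils; some BSD/macOS versions use -D instead).

```shell
# Hypothetical helper: decode unpadded base64url given as the first argument.
b64url_decode() {
  # Map the URL-safe characters back to the standard alphabet.
  s=$(printf '%s' "$1" | tr '_-' '/+')
  # Restore the padding so the length is a multiple of 4.
  # A remainder of 1 never occurs in valid base64, so it is left alone.
  case $(( ${#s} % 4 )) in
    2) s="$s==" ;;
    3) s="$s=" ;;
  esac
  printf '%s' "$s" | base64 -d
}

b64url_decode 'dGVzdA'; echo     # test
b64url_decode 'eHM_Pj4-'; echo   # xs?>>>
```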

Solution 2

If you already have a base64-encoded string, you just need to replace "+" with "-" and "/" with "_" to get a base64url-encoded string. To achieve this, you can use the following command:

echo -n Some_data_to_be_converted | base64 | sed 's/+/-/g; s,/,_,g'

(You can try it on Execute Bash Shell Online.)

Base64 encoding maps the input bytes (8 bit) to a 6 bit representation. 4 base64 characters can encode 4*6=24 bits, which equals 3 bytes. Whenever the number of bytes in your input can't be divided by 3, padding is required according to the standard.

The padding character is =
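
A quick way to see the padding rule in action is to encode inputs of 1, 2 and 3 bytes (printf is used here so that no newline is added to the input):

```shell
printf 'a'   | base64   # YQ==  (1 byte:  two padding characters)
printf 'ab'  | base64   # YWI=  (2 bytes: one padding character)
printf 'abc' | base64   # YWJj  (3 bytes: no padding)
```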

As the = character is used for key-value pairs in URLs, you can't use it directly for padding if you intend to use the encoded value in a URL. You can either just omit the padding, because most implementations will still work and just ignore the 2 or 4 unused bits at the end, or, if the receiver really needs padding, replace the = with its URL-safe representation, %3d.
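
Both options can be sketched with standard tools. Which one you need depends on the receiver, so treat this as an illustration, not a recommendation:

```shell
# Option 1: drop the padding.
printf 'a' | base64 | tr '/+' '_-' | tr -d '='         # YQ

# Option 2: keep the padding, percent-encoded.
printf 'a' | base64 | tr '/+' '_-' | sed 's/=/%3d/g'   # YQ%3d%3d
```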

Solution 3

tl;dr

Use basenc(1) from coreutils:

$ printf "xs?>>>" | basenc --base64
eHM/Pj4+
$ printf "xs?>>>" | basenc --base64url
eHM_Pj4-
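
Decoding works the same way with -d. A quick sketch, assuming your coreutils is recent enough to ship basenc (8.31 or later, as far as I know). To my understanding basenc keeps the = padding in its base64url output, so the last line strips it with tr in case you want the unpadded form.

```shell
# Decode base64url back to the original bytes.
printf 'eHM_Pj4-' | basenc --base64url -d; echo   # xs?>>>

# Round trip: encode, then decode.
printf 'xs?>>>' | basenc --base64url | basenc --base64url -d; echo   # xs?>>>

# Unpadded base64url.
printf 'a' | basenc --base64url | tr -d '='   # YQ
```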

A bit of explanation

Recent versions of coreutils include basenc(1) which supports several different encodings. From its help screen:

--base64          same as 'base64' program (RFC4648 section 4)
--base64url       file- and url-safe base64 (RFC4648 section 5)
--base32          same as 'base32' program (RFC4648 section 6)
--base32hex       extended hex alphabet base32 (RFC4648 section 7)
--base16          hex encoding (RFC4648 section 8)
--base2msbf       bit string with most significant bit (msb) first
--base2lsbf       bit string with least significant bit (lsb) first
--z85             ascii85-like encoding (ZeroMQ spec:32/Z85);
                  when encoding, input length must be a multiple of 4;
                  when decoding, input length must be a multiple of 5

Here is a string that illustrates the difference:

s="xs?>>>"

As binary:

$ printf "%s" "$s" | xxd -b -c1 | cut -d' ' -f2 | nl
     1  01111000
     2  01110011
     3  00111111
     4  00111110
     5  00111110
     6  00111110

And as 6 bit blocks (as base64 reads the data):

$ printf "%s" "$s" | xxd -b -c1 | cut -d' ' -f2 | tr -d '\n' | fold -w6 | nl
     1  011110
     2  000111
     3  001100
     4  111111
     5  001111
     6  100011
     7  111000
     8  111110

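Note that block 4 (111111 = 63) and block 8 (111110 = 62) map to / and + respectively (see the Base64 table on Wikipedia). In base64url these two values map to _ and - instead, which is the only difference between the two outputs in the tl;dr.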
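
To check the mapping by hand, each 6-bit block can be looked up in the base64 alphabet. The sketch below is only an illustration; the helper names bin2dec and blocks_to_base64 are made up for this example.

```shell
alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

# Convert a string of 0s and 1s to a decimal number.
bin2dec() {
  n=0
  rest=$1
  while [ -n "$rest" ]; do
    first=${rest%"${rest#?}"}   # first character of $rest
    rest=${rest#?}              # everything after it
    n=$(( n * 2 + first ))
  done
  echo "$n"
}

# Print the base64 character for each 6-bit block given as an argument.
blocks_to_base64() {
  for bits in "$@"; do
    idx=$(bin2dec "$bits")
    # cut -c is 1-based, the alphabet index is 0-based.
    printf '%s' "$alphabet" | cut -c "$(( idx + 1 ))" | tr -d '\n'
  done
  echo
}

blocks_to_base64 011110 000111 001100 111111 001111 100011 111000 111110
# eHM/Pj4+
```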

Solution 4

Adding on to the answer by Kaplan Ilya, here is a command using standard linux/unix commands that can decode base64url, including handling missing padding.

Note: some versions of base64 can handle missing padding, such as Mac/BSD base64 -D. However, GNU base64 -d requires correct padding.

Also, I used the test string ~~~??? instead of the one in the original question Some_data_to_be_converted, so that it will generate +, /, = characters.

text='~~~???'

# encode base64
echo "$text" | base64
# fn5+Pz8/Cg==

# encode base64url
base64url=$( echo "$text" | base64 | tr '/+' '_-' | tr -d '=' )
echo "$base64url"
# fn5-Pz8_Cg

# decode base64url
echo "$base64url"==== | fold -w 4 | sed '$ d' | tr -d '\n' | tr '_-' '/+' | base64 -d
# ~~~???

Explanation of the decode base64url commands:

  • echo "$base64url"==== appends 4 equal signs.
  • fold -w 4 splits the string into lines of 4 characters each.
  • sed '$ d' deletes the last line (the extraneous padding).
  • tr -d '\n' joins all lines. Now the padding is correct.
  • tr '_-' '/+' converts _ to / and - to +.
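
To watch the padding steps on their own, they can be wrapped in a small function and run on strings of different lengths. The name pad4 is hypothetical; the pipeline inside is the one explained in the list above.

```shell
# Hypothetical helper: pad a base64/base64url string to a multiple of 4.
pad4() {
  echo "$1====" | fold -w 4 | sed '$ d' | tr -d '\n'
}

pad4 'fn5-Pz8_Cg'; echo   # fn5-Pz8_Cg==  (10 characters, two = added)
pad4 'YWI'; echo          # YWI=          (3 characters, one = added)
pad4 'YWJj'; echo         # YWJj          (4 characters, nothing added)
```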

(Side note: if you're wondering why I didn't use tr '-_' '+/', which would be in alphanumeric order, it's because tr treats -_ as an option and fails with "invalid option". You could write tr -- '-_' '+/', but it's easier just to swap the order.)

Updated on June 03, 2022

Comments

  • Admin
    Admin almost 2 years

    Base64 encode can be achieved by

    $ echo Some_data_to_be_converted | base64
    U29tZV9kYXRhX3RvX2JlX2NvbnZlcnRlZAo=

    And Base64 decode can be achieved by

    $ echo U29tZV9kYXRhX3RvX2JlX2NvbnZlcnRlZAo= | base64 -d
    Some_data_to_be_converted

    1. How to achieve Base64URL encode/decode?

    2. Is it just enough to replace "+" with "-" and "/" with "_"?

    3. When should the padding "=" be added or removed?

  • Victor Martins
    Victor Martins almost 4 years
    The -n param for echo was the missing piece for me. Thank you :)
  • wisbucky
    wisbucky over 2 years
    I took it as a challenge to come up with bash or standard unix commands that pad the string back in order to decode it: stackoverflow.com/questions/58957358/…