Converting a hex string to a byte array


Solution 1

This ought to work:

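#include <stdexcept>  // std::invalid_argument

// Returns the numeric value (0..15) of one hex digit; throws on anything else.
int char2int(char input)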
{
  if(input >= '0' && input <= '9')
    return input - '0';
  if(input >= 'A' && input <= 'F')
    return input - 'A' + 10;
  if(input >= 'a' && input <= 'f')
    return input - 'a' + 10;
  throw std::invalid_argument("Invalid input string");
}

// This function assumes src to be a zero-terminated, sanitized string with
// an even number of [0-9a-fA-F] characters, and target to be sufficiently large
// (at least strlen(src) / 2 bytes). An odd trailing character is ignored.
void hex2bin(const char* src, char* target)
{
  while(*src && src[1])
  {
    *(target++) = char2int(*src)*16 + char2int(src[1]);
    src += 2;
  }
}

Depending on your specific platform there's probably also a standard implementation though.
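
The comments on this answer ask what happens with invalid characters or an odd number of digits. A minimal sketch of a checked variant follows; `hexDigitValue` and `hexToBytesChecked` are hypothetical names, and padding an odd-length input with a leading zero is an assumption about the desired behaviour, not something the question specifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Convert one hex digit to its value (0..15); throws on any other character.
inline int hexDigitValue(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  throw std::invalid_argument("Invalid hex digit");
}

// Hypothetical helper (not part of the answer above): converts a hex string
// to bytes, treating an odd-length input as if it had a leading '0'.
inline std::vector<std::uint8_t> hexToBytesChecked(const std::string& hex)
{
  std::vector<std::uint8_t> bytes;
  bytes.reserve((hex.size() + 1) / 2);

  std::size_t i = 0;
  if (hex.size() % 2 != 0) {
    // Odd length: the first digit forms a byte on its own ("AFF" -> 0A FF).
    bytes.push_back(static_cast<std::uint8_t>(hexDigitValue(hex[0])));
    i = 1;
  }
  // The remaining digit count is even, so hex[i + 1] is always in range.
  for (; i < hex.size(); i += 2) {
    const int high = hexDigitValue(hex[i]);
    const int low = hexDigitValue(hex[i + 1]);
    bytes.push_back(static_cast<std::uint8_t>(high * 16 + low));
  }
  return bytes;
}
```

Returning a std::vector removes the need for the caller to size a target buffer, at the cost of an allocation that the pointer-based version avoids.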

Solution 2

This implementation uses the standard strtol function to convert each pair of characters to a byte, and it works for any even-length hex string. It does no validation: a pair that strtol cannot parse at all yields a zero byte.

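#include <cstdlib>  // strtol
#include <string>
#include <vector>

std::vector<char> HexToBytes(const std::string& hex) {
  std::vector<char> bytes;
  bytes.reserve(hex.length() / 2);

  // Convert each pair of hex digits to one byte.
  for (std::string::size_type i = 0; i < hex.length(); i += 2) {
    std::string byteString = hex.substr(i, 2);
    char byte = static_cast<char>(std::strtol(byteString.c_str(), nullptr, 16));
    bytes.push_back(byte);
  }

  return bytes;
}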
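
strtol is lenient: it skips leading whitespace, accepts a sign, and returns 0 when nothing can be parsed, so malformed input passes through silently. A minimal sketch of a stricter variant is shown below; `HexToBytesStrict` is a hypothetical name, and rejecting odd-length input (rather than padding it) is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical strict variant of HexToBytes: throws instead of guessing.
std::vector<char> HexToBytesStrict(const std::string& hex)
{
  if (hex.length() % 2 != 0)
    throw std::invalid_argument("Hex string must have an even length");

  std::vector<char> bytes;
  bytes.reserve(hex.length() / 2);

  for (std::size_t i = 0; i < hex.length(); i += 2) {
    // isxdigit requires a value representable as unsigned char.
    const unsigned char high = static_cast<unsigned char>(hex[i]);
    const unsigned char low = static_cast<unsigned char>(hex[i + 1]);
    if (!std::isxdigit(high) || !std::isxdigit(low))
      throw std::invalid_argument("Hex string contains a non-hex character");

    // Both characters are hex digits, so strtol consumes exactly the pair.
    const std::string byteString = hex.substr(i, 2);
    bytes.push_back(static_cast<char>(std::strtol(byteString.c_str(), nullptr, 16)));
  }
  return bytes;
}
```

Checking with isxdigit before calling strtol also rejects pairs such as "+1" or " 1", which strtol on its own would accept.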

Solution 3

For fun, I wanted to see whether this conversion could be done at compile time. The code has little error checking and was written in VS2015, which does not yet support C++14 constexpr functions; that is why HexCharToInt is a single chained conditional expression. It takes a string literal (a char array), converts each pair of characters into one byte, and expands those bytes into a braced initializer list that initializes the type T given as a template parameter. T can be something like std::array, in which case the function returns an array directly.

#include <array>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <stdexcept>
#include <utility>

/* Quick and dirty conversion from a single hex character to its numeric value */
constexpr std::uint8_t HexCharToInt(char Input)
{
    return
    ((Input >= 'a') && (Input <= 'f'))
    ? (Input - 87)
    : ((Input >= 'A') && (Input <= 'F'))
    ? (Input - 55)
    : ((Input >= '0') && (Input <= '9'))
    ? (Input - 48)
    : throw std::exception{};
}

/* Position the characters into the appropriate nibble */
constexpr std::uint8_t HexChar(char High, char Low)
{
    return (HexCharToInt(High) << 4) | (HexCharToInt(Low));
}

/* Adapter that converts each pair of characters into a single byte and combines the results into a braced initializer list used to initialize T */
template <typename T, std::size_t Length, std::size_t ... Index>
constexpr T HexString(const char (&Input)[Length], const std::index_sequence<Index...>&)
{
    return T{HexChar(Input[(Index * 2)], Input[((Index * 2) + 1)])...};
}

/* Entry function */
template <typename T, std::size_t Length>
constexpr T HexString(const char (&Input)[Length])
{
    return HexString<T>(Input, std::make_index_sequence<(Length / 2)>{});
}

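// Usage (needs <array>). KS::Utility is presumably the namespace the author
// wrapped these functions in; drop the qualifier if you paste them at global scope.
constexpr auto Y = KS::Utility::HexString<std::array<std::uint8_t, 3>>("ABCDEF");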
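
With a compiler that supports C++17 constexpr, the same idea can be written with ordinary statements and a loop. The following is a sketch under that assumption; `hexDigit`, `hexLiteral`, and `kBytes` are hypothetical names, not part of the answer above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Value of one hex digit. Reaching the throw during constant evaluation
// turns an invalid digit into a compile-time error.
constexpr std::uint8_t hexDigit(char c)
{
    if (c >= '0' && c <= '9') return static_cast<std::uint8_t>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<std::uint8_t>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<std::uint8_t>(c - 'A' + 10);
    throw std::invalid_argument("Invalid hex digit");
}

// Length counts the terminating null, so a literal of 2n digits has Length 2n + 1.
template <std::size_t Length>
constexpr std::array<std::uint8_t, (Length - 1) / 2> hexLiteral(const char (&input)[Length])
{
    static_assert(Length % 2 == 1, "hex literal must have an even number of digits");
    std::array<std::uint8_t, (Length - 1) / 2> out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        // High nibble from the first character of the pair, low nibble from the second.
        out[i] = static_cast<std::uint8_t>((hexDigit(input[2 * i]) << 4) | hexDigit(input[2 * i + 1]));
    }
    return out;
}

// The conversion happens entirely at compile time.
constexpr auto kBytes = hexLiteral("ABCDEF");
static_assert(kBytes.size() == 3, "three bytes expected");
static_assert(kBytes[0] == 0xAB && kBytes[1] == 0xCD && kBytes[2] == 0xEF, "unexpected bytes");
```

Here the array size is deduced from the literal, so the caller does not have to spell out the element count as in the std::array<std::uint8_t, 3> usage above.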

Solution 4

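You can use Boost:

#include <algorithm>  // std::copy
#include <string>
#include <boost/algorithm/hex.hpp>

char bytes[60] = {0};
// unhex decodes each pair of hex digits into one byte; here hash becomes "123456789"
std::string hash = boost::algorithm::unhex(std::string("313233343536373839"));
std::copy(hash.begin(), hash.end(), bytes);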

Solution 5

You said "variable length." Just how variable do you mean?

For hex strings whose value fits into an unsigned long, I have always liked the C function strtoul. To make it parse hex, pass 16 as the radix value. Note that the result is a single integer, not a byte array.

Code might look like:

#include <cstdlib>  // strtoul
#include <string>

std::string str = "01a1";
unsigned long val = std::strtoul(str.c_str(), nullptr, 16);  // val == 0x01A1
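
Since the question asks for a byte array, the integer from strtoul still has to be split into bytes. A minimal sketch follows; `ShortHexToBytes` is a hypothetical helper, and it assumes 8-bit bytes, input made up of hex digits only, and most-significant-byte-first output.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: parse a short hex string with strtoul, then emit its
// bytes most-significant first. Assumes the string contains only hex digits.
std::vector<unsigned char> ShortHexToBytes(const std::string& str)
{
  // More digits than this cannot be represented in an unsigned long.
  if (str.length() > 2 * sizeof(unsigned long))
    throw std::length_error("Hex string does not fit into an unsigned long");

  const unsigned long val = std::strtoul(str.c_str(), nullptr, 16);

  // Derive the byte count from the text so that leading zero bytes survive.
  const std::size_t count = (str.length() + 1) / 2;

  std::vector<unsigned char> bytes(count);
  for (std::size_t i = 0; i < count; ++i) {
    const std::size_t shift = 8 * (count - 1 - i);
    bytes[i] = static_cast<unsigned char>((val >> shift) & 0xFF);
  }
  return bytes;
}
```

Because the byte count comes from the string length, "0001" yields two bytes (00 01) rather than one, and an odd-length string such as "AFF" is treated as "0AFF".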

Author: oracal

Updated on March 15, 2022

Comments

  • oracal
    oracal about 2 years

    What is the best way to convert a variable-length hex string, e.g. "01A1", to a byte array containing that data?

    i.e. converting this:

    std::string hex = "01A1";
    

    into this

    char* hexArray;
    int hexLength;
    

    or this

    std::vector<char> hexArray;
    

    so that when I write this to a file and hexdump -C it I get the binary data containing 01A1.

    • dhavenith
      dhavenith almost 11 years
      @alexvii That is not an answer to this question.
    • πάντα ῥεῖ
      πάντα ῥεῖ almost 11 years
      You can set std::streams to hex mode for reading and writing numbers in hex format
    • oracal
      oracal almost 11 years
      @makulik I did try using streams and std::hex but couldn't get anything to work. Could you maybe show me an example? Thanks.
    • fkl
      fkl almost 11 years
      I don't think any ascii deduction is required, simply use the c api to convert into char array, unless i have gotten question wrong. I have pointed out the api in my ans below stackoverflow.com/a/17273020/986760.
    • Zan Lynx
      Zan Lynx almost 10 years
      Based on a comment you made to another answer I think you need to add to your question what should happen when the input is an odd number of characters. Should the missing 0 be added to the beginning of the string or the end?
    • TheoretiCAL
      TheoretiCAL about 6 years
      @oracal See my answer for a stringstream approach
  • oracal
    oracal almost 11 years
    While that does seem to work (can't try it out atm) is there a more standard way?
  • jogojapan
    jogojapan almost 11 years
    Is this a "hint" or an answer? And what do you mean by "try this"? Will it work? And is it different from the existing answers? How?
  • Anand Rathi
    Anand Rathi almost 11 years
    @jogojapan I am happy write the whole code do you really need it ? Can you see the difference in basic approach?
  • jogojapan
    jogojapan almost 11 years
    My problem is that I don't understand what you are trying to tell us. There is a hint, there is a string (followed by another version of that string with 0x prefixed), and then a very short statement about some iteration. The meaning of all this, esp. in the context of the existing answers, isn't clear to me. This will have an impact on upvotes/downvotes you'll get for this.
  • fkl
    fkl almost 11 years
    I am not sure; the original string has the same elements, so why do we need to convert to ASCII to get the numerical equivalent?
  • Niels Keurentjes
    Niels Keurentjes almost 11 years
    @fayyazkl I don't understand what you mean?
  • fkl
    fkl almost 11 years
    @NielsKeurentjes What is wrong with using c_str() for the above? Why do we have to manually convert an ASCII 'A' to hex A and put it in the target char *? What you did is correct. I just can't see why you have to do it manually when there is a standard API available to convert a string to a char array.
  • Niels Keurentjes
    Niels Keurentjes almost 11 years
    @fayyazkl you misunderstood the question - this is about converting the human-readable 4-character string "01A1" into 2 in memory bytes (1 and 161). Hence ASCII conversion is obviously required.
  • Christophe
    Christophe almost 10 years
    @Niels Keurentjes wonderful solution! But what happens if there is an odd number of hex digits, for example AFF?
  • Christophe
    Christophe almost 10 years
    very nice char2int() ! But I fear that the result doesn't meet expectations when with odd number of hex digits. For example, try with 6a062a063. I'd understand 6 a0 62 a0 63, but your code makes 6a 06 2a 06 3 out of it.
  • Niels Keurentjes
    Niels Keurentjes almost 10 years
    @Christophe because the while checks for *src && src[1] it would parse AF and then encounter a trailing zero on src[1], and stop converting. It's similar to atoi behaviour in that respect - it stops on corrupt input.
  • Christophe
    Christophe almost 10 years
    @NielsKeurentjes I'm not sure that atoi() ignores the last odd digit... Don't you think that it looks a little bit like a bug when AFF is handled as AF instead of 0AFF ?
  • Niels Keurentjes
    Niels Keurentjes almost 10 years
    @Christophe I was referring to atoi behaviour of parsing 'as long as it has meaningful input' - if you feed it the string 123abc it returns integer 123 (cplusplus.com/reference/cstdlib/atoi). As such I wrote this function on the same premise of sanitized input, and 'undefined or messy' behaviour otherwise. Adding input validation is rather trivial of course anyway, but not always a wanted overhead.
  • xaizek
    xaizek almost 10 years
    You're right about odd number of hex digits, @Christophe. Thank you! I updated the code to handle such case well (by the way, it's not true for the accepted answer, still better to handle such strings).
  • Niels Keurentjes
    Niels Keurentjes almost 9 years
    It should be noted that I wrote the accepted answer as the most performant full solution to the OP's question :) No questions were asked about exceptional cases, so I assumed (like many stdc functions do) pre-sanitized input.
  • Martin Bonner supports Monica
    Martin Bonner supports Monica almost 7 years
    Fantastic! I wanted a way of initializing an array from a string literal, and this is almost exactly what I need.
  • user482963
    user482963 about 6 years
    well you can always pre-append '0' to for odd sized hex string
  • Erik Aronesty
    Erik Aronesty almost 6 years
    Be sure to BN_free
  • Andriy Plokhotnyuk
    Andriy Plokhotnyuk over 5 years
    does it flag invalid input?
  • Coder1337
    Coder1337 over 5 years
    @NielsKeurentjes I know I am replying to an answer that was posted years ago, but I just want to say thanks for the solution! This is exactly what I needed, I was using another way I found online but it had a few problems and it didn't work exactly like it should have been. So thanks again! :)
  • BenV136
    BenV136 over 3 years
    Love this solution. As a minor observation, note that char2int can be made slightly more efficient, especially if used heavily. Any valid character will be >= '0', so it is more efficient to test the upper bound first, as in: int char2int(char input) { if (input <= '9' && input >= '0') return input - '0'; if (input <= 'F' && input >= 'A') return input - 'A' + 10; if (input <= 'f' && input >= 'a') return input - 'a' + 10; throw std::invalid_argument("Invalid input string"); } Or test for a, then A and then 0, in that order.