How to portably compute a sha1 hash in C++?

Solution 1

I'm not sure whether the version using Boost's UUID library handles leading zeros in hash values correctly (the resulting string should always have the same length, as far as I know), so here is a simplified version of that example which does:

#include <cstdio>
#include <string>
#include <boost/uuid/sha1.hpp>

std::string get_sha1(const std::string& p_arg)
{
    boost::uuids::detail::sha1 sha1;
    sha1.process_bytes(p_arg.data(), p_arg.size());
    unsigned hash[5] = {0};
    sha1.get_digest(hash);

    // Back to string
    char buf[41] = {0};

    for (int i = 0; i < 5; i++)
    {
        std::sprintf(buf + (i << 3), "%08x", hash[i]);
    }

    return std::string(buf);
}
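
The leading-zero behaviour comes down to the format string: a zero-padded, fixed-width conversion turns every 32-bit word into exactly eight hex digits, so the result is always 40 characters long. Below is a minimal, Boost-free sketch of just that formatting step. It assumes the digest is available as five 32-bit words, and words_to_hex is a hypothetical helper name, not part of any library:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not a Boost API): format five 32-bit digest words
// as 40 lowercase hex digits. "%08lx" zero-pads each word to eight digits.
std::string words_to_hex(const std::uint32_t (&words)[5])
{
    char buf[41] = {0};
    for (int i = 0; i < 5; ++i) {
        // Each word occupies 8 characters; snprintf also writes a trailing NUL,
        // hence the size of 9. unsigned long is at least 32 bits wide.
        int written = std::snprintf(buf + i * 8, 9, "%08lx",
                                    static_cast<unsigned long>(words[i]));
        assert(written == 8);
        (void)written; // avoid an unused-variable warning when NDEBUG is set
    }
    return std::string(buf, 40);
}
```

For example, a word with the value 0xa is rendered as 0000000a rather than a, which is what keeps the overall length constant.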

Solution 2

Since version 4.3, the Qt library has included the class QCryptographicHash, which supports various hashing algorithms, including SHA1. Although Qt is arguably less portable than, say, OpenSSL, QCryptographicHash is the obvious way to compute a SHA1 hash in projects that already depend on Qt.

Example program that computes the SHA1 hash of a file:

#include <QCryptographicHash>
#include <QByteArray>
#include <QFile>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>
using namespace std;
int main(int argc, char **argv)
{
  try {
    if (argc < 2)
      throw runtime_error(string("Call: ") + *argv + string(" FILE"));
    const char *filename = argv[1];
    QFile file(filename);
    if (!file.open(QIODevice::ReadOnly | QIODevice::Unbuffered))
      throw runtime_error("Could not open: " + string(filename));
    QCryptographicHash hash(QCryptographicHash::Sha1);
    vector<char> v(128*1024);
    for (;;) {
      qint64 n = file.read(v.data(), v.size());
      if (!n)
        break;
      if (n == -1)
        throw runtime_error("Read error");
      hash.addData(v.data(), n);
    }
    QByteArray h(hash.result().toHex());
    cout << h.data() << '\n';
  } catch (const exception &e) {
    cerr << "Error: " << e.what() << '\n';
    return 1;
  }
  return 0;
}

The Qt classes used here are all part of the Qt Core library. An example CMake build file:

cmake_minimum_required(VERSION 2.8.11)
project(hash_qt CXX)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -std=c++11")
find_package(Qt5Core)
add_executable(hash_qt hash_qt.cc)
target_link_libraries(hash_qt Qt5::Core)

Solution 3

Boost provides a simple API for computing the SHA1 hash of strings:

#include <iostream>
#include <string>

#include <boost/compute/detail/sha1.hpp>

int main(int argc, char **argv)
{
  if (argc < 2) {
      std::cerr << "Call: " << *argv << " STR\n";
      return 1;
  }

  boost::compute::detail::sha1 sha1 { argv[1] };
  std::string s { sha1 };

  std::cout << s << '\n';

  return 0;
}

That API is private to the Boost Compute library, though, because it is part of a detail namespace, which means that it comes with no stability guarantees.


Boost also provides a SHA1 hashing class as part of the Boost Uuid library, whose API is better suited for hashing arbitrary binary input, such as files. Although it is part of the detail namespace, meaning that it is more or less library-private, it has been there for many years and is stable.

A small example that computes the SHA1 hash of a file and prints it to stdout:

Prelude:

#include <boost/uuid/detail/sha1.hpp>
#include <boost/predef/other/endian.h>
#include <boost/endian/conversion.hpp>
#include <boost/algorithm/hex.hpp>
#include <boost/range/iterator_range_core.hpp>
#include <iostream>
#include <iterator>
#include <vector>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
using namespace std;

The main function:

int main(int argc, char **argv)
{
  if (argc < 2) { cerr << "Call: " << *argv << " FILE\n"; return 1; }
  const char *filename = argv[1];
  int fd = open(filename, O_RDONLY);
  if (fd == -1) { cerr << "open: " << strerror(errno) << '\n'; return 1; }
  vector<char> v(128*1024);
  boost::uuids::detail::sha1 sha1;
  for (;;) {
    ssize_t n = read(fd, v.data(), v.size());
    if (n == -1) {
      if (errno == EINTR) continue;
      cerr << "read error: " << strerror(errno) << '\n';
      return 1; 
    }
    if (!n) break;
    sha1.process_bytes(v.data(), n);
  } 
  boost::uuids::detail::sha1::digest_type hash;
  sha1.get_digest(hash);
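  // Boost.Predef macros are always defined, so test the value with #if.
  // The digest words are host-order integers; swap them on little-endian
  // hosts so that the byte-wise hex dump below yields the canonical digest.
#if BOOST_ENDIAN_LITTLE_BYTE
  for (unsigned i = 0; i < sizeof hash / sizeof hash[0]; ++i)
    boost::endian::endian_reverse_inplace(hash[i]);
#endif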
  boost::algorithm::hex(boost::make_iterator_range(
        reinterpret_cast<const char*>(hash),
        reinterpret_cast<const char*>(hash) + sizeof hash),
        std::ostream_iterator<char>(cout));
  cout << '\n';
  int r = close(fd);
  if (r == -1) {
    cerr << "close error: " << strerror(errno) << '\n';
    return 1;
  }
  return 0;
}


The parts of Boost used here don't create a dependency on any Boost shared library. Since Boost is portable and available for many architectures, computing SHA1 hashes with Boost is portable as well.
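
The byte-order question in this solution arises because the digest is held as 32-bit integers but printed byte by byte. One way to avoid any preprocessor check is to extract the bytes with shifts, which behaves identically on little- and big-endian hosts. The following is only a sketch under the assumption that the digest is available as 32-bit words; words_to_bytes and bytes_to_hex are hypothetical helper names, not Boost APIs, and the output is lowercase, whereas boost::algorithm::hex emits uppercase digits:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: serialize 32-bit words most-significant byte first.
// Shifts operate on values, not memory, so host byte order does not matter.
std::vector<unsigned char> words_to_bytes(const std::uint32_t* words,
                                          std::size_t count)
{
    assert(words != nullptr || count == 0);
    std::vector<unsigned char> out;
    out.reserve(count * 4);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(static_cast<unsigned char>((words[i] >> 24) & 0xffu));
        out.push_back(static_cast<unsigned char>((words[i] >> 16) & 0xffu));
        out.push_back(static_cast<unsigned char>((words[i] >> 8) & 0xffu));
        out.push_back(static_cast<unsigned char>(words[i] & 0xffu));
    }
    return out;
}

// Hypothetical helper: render bytes as lowercase hex, two digits per byte.
std::string bytes_to_hex(const std::vector<unsigned char>& bytes)
{
    static const char digits[] = "0123456789abcdef";
    std::string s;
    s.reserve(bytes.size() * 2);
    for (unsigned char b : bytes) {
        s.push_back(digits[(b >> 4) & 0x0f]);
        s.push_back(digits[b & 0x0f]);
    }
    return s;
}
```

Fed the five words of the well-known SHA1 digest of "abc" (a9993e36, 4706816a, ba3e2571, 7850c26c, 9cd0d89d), the two helpers produce a9993e364706816aba3e25717850c26c9cd0d89d on any host.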

Author: maxschlepzig

Updated on June 14, 2022

Comments

  • maxschlepzig (almost 2 years ago)
    The objective is to compute the SHA1 hash of a buffer or multiple buffers as part of a C++ program.

  • maxschlepzig (over 7 years ago)
    You mean that you aren't sure whether the Boost hex algorithm prints leading zeros? This minimal example shows that it does. By the way, your sprintf approach doesn't yield the same result on little-endian architectures. That means that if you use sprintf like this, you must not byte swap. The other version swaps bytes because the hex algorithm iterates byte-wise over the 4-byte integers.

  • Flo (over 7 years ago)
    Yes, you're right, I removed the bswap32() call, thanks. It now yields the same result as Python's hashlib.sha1(str).hexdigest().

  • maxschlepzig (over 2 years ago)
    It turns out that Boost has gained a higher-level API for SHA1 hashing of strings, which allows for more concise code: stackoverflow.com/a/28489154/427158