The New Way To Generate Hashes In Drupal 7

Posted: May 13, 2010

When we write Drupal modules, generating hashes is a common task. They are usually generated as md5 hashes. Because of new regulations that affect a large group of Drupal users, it's time to migrate away from md5 to the SHA-2 family of hashes. Drupal 7 core has already migrated away from md5, and it's time for contrib modules to follow.

Why Change to SHA-2

The software powering United States government websites is going to be required to use SHA-2 according to FIPS 180-3 and FIPS 198-1. Now, these requirements don't mean that every hash has to be a SHA-2 hash. But any case where a hash other than SHA-2 is used needs to be explained, and that is most likely going to come up each time Drupal is used. Can you imagine being a vendor building a Drupal website for the US government and having to explain, in writing, each use of a hash that is not SHA-2? And doing that each time you build a website for them? Yikes, that would be annoying.

Some will say they don't care about supporting the US government or the vendors who build sites for it. Yet when a government site like the White House is built on Drupal, it is seen as a big deal and a win.

Modules that are used on sites like the White House gain some popularity. Look at the usage of the Context module after its use on the White House site was announced.

All around, it seems like a better idea to migrate to SHA-2 hashes and be done with it. As a practice, building modules that can be used by a wide array of users is good and healthy.

The 4 Ways To Generate Hashes

Following how Drupal 7 implements SHA-2 hashes, there are 4 ways we can generate hashes, depending on what we need.

  1. The simplest is to use the built-in PHP function to generate a SHA-2 hash. It would look something like:
    hash('sha256', $data);
  2. The hex output above is long, and the shorter raw binary digest is not URL safe. Drupal 7 provides drupal_hash_base64(), which generates a SHA-256 based hash that is compact and URL safe.
  3. There are occasions where a module may want to generate an HMAC. In those cases the built-in PHP function hash_hmac will work. For example:
    hash_hmac('sha256', $data, $key);
  4. The raw binary output of hash_hmac for SHA-2 is likewise not URL safe. For those cases Drupal 7 provides the helper function drupal_hmac_base64().
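
The four approaches above can be sketched side by side. The two Drupal helpers only exist inside a Drupal 7 bootstrap, so the example_* functions below are hypothetical stand-ins that follow the behavior described in this post (raw binary digest, base64 encoded, with the URL-unsafe characters replaced). Treat them as an approximation rather than a copy of core. $data and $key are placeholder values.

```php
<?php
// Placeholder inputs, for illustration only.
$data = 'some data to hash';
$key  = 'a secret key';

// Hypothetical stand-in for drupal_hash_base64().
function example_hash_base64($data) {
  // TRUE asks hash() for the raw 32-byte digest instead of hex text.
  $hash = base64_encode(hash('sha256', $data, TRUE));
  // Replace the URL-unsafe base64 characters and drop the '=' padding.
  return strtr($hash, array('+' => '-', '/' => '_', '=' => ''));
}

// Hypothetical stand-in for drupal_hmac_base64().
function example_hmac_base64($data, $key) {
  $hmac = base64_encode(hash_hmac('sha256', $data, $key, TRUE));
  return strtr($hmac, array('+' => '-', '/' => '_', '=' => ''));
}

// 1. Plain SHA-256 as hex text.
$hex = hash('sha256', $data);

// 2. URL-safe SHA-256.
$url_hash = example_hash_base64($data);

// 3. Plain HMAC-SHA-256 as hex text.
$hmac = hash_hmac('sha256', $data, $key);

// 4. URL-safe HMAC-SHA-256.
$url_hmac = example_hmac_base64($data, $key);

// Compare the lengths of the hex and URL-safe forms.
printf("hex hash: %d chars, url-safe hash: %d chars\n", strlen($hex), strlen($url_hash));
printf("hex hmac: %d chars, url-safe hmac: %d chars\n", strlen($hmac), strlen($url_hmac));
```

Inside a Drupal 7 module you would call drupal_hash_base64($data) and drupal_hmac_base64($data, $key) directly instead of the stand-ins.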

These examples use SHA-256. Other variants, like SHA-512, are acceptable. SHA-256 ends up being faster, but some places in Drupal use stronger hashes. For example, Drupal 7's default password hashing uses SHA-512.
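
Switching variants is only a matter of changing the algorithm name passed to hash(). As a small sketch, with $data again a placeholder value, the loop below prints how long the hex output is for three members of the SHA-2 family. That length is worth knowing before sizing a database column or putting a hash in a URL. It only covers plain digests; password hashing involves more than a single hash() call and is not shown.

```php
<?php
// Placeholder input, for illustration only.
$data = 'some data to hash';

// Hex length of each variant, keyed by algorithm name.
$lengths = array();

foreach (array('sha256', 'sha384', 'sha512') as $algo) {
  $hex = hash($algo, $data);
  $lengths[$algo] = strlen($hex);
  // Each hex character encodes 4 bits of the digest.
  printf("%s: %d hex characters (%d bits)\n", $algo, $lengths[$algo], $lengths[$algo] * 4);
}
```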

Reader Comments

I think you meant Drupal 7 provides drupal_hash_base64 not drual_hash_base64.

oops. Thanks for pointing out the spelling blunder. Fixed.

Very timely! I had just sent a question about md5() hashing to someone in response to his code review of the Commerce Checkout module. My rationale was that it's not actually obfuscating data, just creating a simple hash for a serialized array... But if even that would require special explanation, it's not worth it. hash() it is!

So instead of using standard URL encoding techniques, Drupal uses a non-standard base64 encoding variant just so that people don't have to learn how to escape things properly. Awesome.

There is a reason for the base64 setup. hash() in drupal_hash_base64() is told to return raw binary information. This is shorter than a hex character value but cannot be represented as text. Base64 encoding is used to create text representation that can be passed in a url. This creates a smaller value.

The smaller value is the specific and intentional reason. With the trailing = padding removed, this produces a 43 character value (44 with the padding) rather than the 64 hex characters hash('sha256', ...) returns by default.

This is not about escaping techniques. It's about the size of the text string being passed around in the url.

Yes exactly, the reason for this encoding is it's one third shorter than the hex equivalent.

Also this encoding style is not new at all:

http://en.wikipedia.org/wiki/Base64#URL_applications
http://docs.python.org/library/base64.html#base64.urlsafe_b64decode