Hash Functions

Hashing is a one-way mapping of data. A hash function takes a variable-length input and produces a fixed-length output called a message digest, or simply a hash. Unlike encryption, whose output can be decrypted with the right key, a hash cannot be transformed back into the original message.

The diagram below shows sample hash values for some words and phrases.

Graphic that shows sample hash values for some words and phrases. The left column shows variable-length inputs: the words “Students,” “Students are graded” and “Students are graduated.” Arrows lead to the middle column, which represents the cryptographic hash function that processes plain text and converts it to a hash value. Arrows also lead to the right column, called hash value or digest. Here we see that “Students” has been converted to the 24-character hash 3LCR 8CBO 6DH3 EW4D ABK8 JLE2. Below that, “Students are graded” has also been converted to a 24-character hash, AB3F CCD4 67DA FD3F JKL7 8DML.
Sample Hash Values
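
The fixed-length property shown in the diagram can be reproduced with a short sketch. It assumes Python's standard hashlib module and SHA-256; the helper name sha256_hex is illustrative, and the digests it prints are real SHA-256 values, not the sample values in the diagram.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

for phrase in ["Students", "Students are graded", "Students are graduated"]:
    digest = sha256_hex(phrase)
    # Every digest is 64 hex characters (256 bits), whatever the input length.
    print(f"{len(phrase):2d} chars -> {len(digest)} hex chars: {digest}")
```

There is no inverse function in the sketch: hashlib offers no call that turns a digest back into the input, which is what the one-way property means in practice.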

Common Usage

Hash functions are commonly used to validate file integrity, to authenticate issuing parties in digital certificates, and to verify passwords against the hashes stored in application databases.

When a user attempts to log on to a system that requires a password, the system typically takes the password, converts it to a hash using a specific hashing algorithm, and checks whether the result matches the hash stored in the system's user database. If the two hash values match, the user is allowed to log in; if not, access is denied. Hash functions thus help protect confidentiality and integrity by allowing operating systems and applications to avoid saving passwords in plaintext.
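
The login check described above might look like the following minimal sketch. The user_db dictionary, the hash_password and login helpers, and the sample credentials are all hypothetical, and plain SHA-256 is used only to mirror the description; real systems generally add a per-user salt and a deliberately slow password-hashing algorithm.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Convert a password to a hex digest (unsalted SHA-256, for illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical user database: usernames mapped to stored hashes, never plaintext.
user_db = {"alice": hash_password("correct horse")}

def login(username: str, password: str) -> bool:
    """Hash the supplied password and compare it with the stored hash."""
    stored = user_db.get(username)
    if stored is None:
        return False  # unknown user: access denied
    # compare_digest takes the same time whether or not the hashes match.
    return hmac.compare_digest(stored, hash_password(password))

print(login("alice", "correct horse"))  # hashes match: access allowed
print(login("alice", "wrong guess"))    # hashes differ: access denied
```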

Hash functions produce a "fingerprint" of the data, sometimes loosely called a signature, which can be used to verify the integrity of any data element. More specifically, if even one bit of the input is modified, a completely different message digest is produced.
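
A small experiment makes the one-bit claim concrete. The sketch below assumes SHA-256 from Python's hashlib; the bit_difference helper is illustrative. It flips a single bit of the input and counts how many of the 256 digest bits change as a result.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"Students are graded"
# Flip the lowest bit of the first byte: "S" (0x53) becomes "R" (0x52).
modified = bytes([original[0] ^ 0x01]) + original[1:]

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(modified).digest()

print(bit_difference(original, modified), "input bit changed")
print(bit_difference(d1, d2), "of 256 digest bits changed")
```

For a well-designed hash function, roughly half of the digest bits are expected to change, so the two digests look unrelated.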

However, because a hash function maps inputs of any length to a fixed-length output, two different inputs can produce the same hash value, an eventuality referred to as a collision. Collisions are rare, but they become more likely as the output space gets smaller. It is therefore better to use hash functions that produce longer fixed-length output. For example, the chance of a collision is smaller for a 192-bit output than for a 128-bit output. The longer length protects against a class of attack known as the birthday attack, which can find a collision in roughly 2^(n/2) attempts for an n-bit hash rather than the 2^n attempts that the output size alone might suggest.
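
To see why short outputs collide quickly, the sketch below deliberately weakens SHA-256 by keeping only the first few bytes of each digest, then hashes the strings "0", "1", "2", and so on until two of them collide. The truncation and the helper names are assumptions made for illustration; they are not a real hash standard.

```python
import hashlib

def truncated_hash(data: bytes, n_bytes: int) -> bytes:
    """Return only the first n_bytes of the SHA-256 digest (a deliberately weak hash)."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int):
    """Hash "0", "1", "2", ... until two inputs share a truncated digest."""
    seen = {}  # truncated digest -> input that produced it
    i = 0
    while True:
        msg = str(i).encode("ascii")
        h = truncated_hash(msg, n_bytes)
        if h in seen:
            # Return both colliding inputs and the number of hashes computed.
            return seen[h], msg, i + 1
        seen[h] = msg
        i += 1

for n_bytes in (1, 2, 3):
    first, second, attempts = find_collision(n_bytes)
    print(f"{8 * n_bytes}-bit output: {first!r} and {second!r} collide after {attempts} hashes")
```

The number of hashes needed grows with the output length, in line with the birthday estimate of about 2^(n/2). At 128 bits or more, the same search is far beyond what this loop could ever complete.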

Examples of Hash Functions

The most widely used family of hash functions today is the Secure Hash Algorithm (SHA), first published by the National Institute of Standards and Technology (NIST) in 1993.

In 2002, NIST produced a revised version of the Federal Information Processing Standard (FIPS 180-2), which defined three new versions of SHA with hash value lengths of 256, 384, and 512 bits, known as SHA-256, SHA-384, and SHA-512, respectively. Further revisions are expected (Keswani & Khadilkar, n.d.).


Well-designed hash functions are not particularly vulnerable to attack. The most direct approach is brute force: trying one input after another until one produces the target hash. The greater the length of the hash value, the more time is required to break the hash, which often makes brute force an ineffective attack method.
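
A brute-force attack can be sketched in a few lines. The example below is an illustration under simplifying assumptions: the target is known to be a lowercase string of at most three letters, SHA-256 is the hash in use, and the brute_force helper is hypothetical. It searches the space of candidate inputs, which is how brute force is usually applied to short passwords.

```python
import hashlib
import itertools
import string

def brute_force(target_hex: str, alphabet: str, max_len: int):
    """Try every string over alphabet up to max_len characters.

    Returns (match, guesses) on success or (None, guesses) on failure.
    """
    guesses = 0
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = "".join(combo)
            guesses += 1
            if hashlib.sha256(candidate.encode("utf-8")).hexdigest() == target_hex:
                return candidate, guesses
    return None, guesses

target = hashlib.sha256(b"cab").hexdigest()
found, guesses = brute_force(target, string.ascii_lowercase, 3)
print(f"recovered {found!r} after {guesses} guesses")
```

Each extra character multiplies the search space by the alphabet size (26 here), so the same loop becomes impractical for long inputs drawn from a large character set.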

A more effective way to attack hash functions is to use rainbow tables, which are precomputed tables that map hash values back to candidate inputs and so offer a shortcut to breaking the hash.

However, creating rainbow tables for each hash function (Message-Digest Algorithm 5 [MD5], SHA-1, SHA-256) takes a considerable amount of time and space. Rainbow tables that cover both uppercase and lowercase alphanumeric characters, including special characters, can range from a few hundred gigabytes (GB) to a few terabytes (TB).
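
The time-for-space trade-off behind precomputation can be illustrated with a simplified sketch. Note that this is a plain lookup table, not a true rainbow table: real rainbow tables store chains of hashes linked by reduction functions so that far fewer entries need to be kept. The build_lookup_table helper, the three-letter lowercase search space, and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
import itertools
import string

def build_lookup_table(alphabet: str, max_len: int) -> dict:
    """Precompute digest -> plaintext for every string over alphabet up to max_len."""
    table = {}
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            word = "".join(combo)
            table[hashlib.sha256(word.encode("utf-8")).hexdigest()] = word
    return table

# Built once, at a cost in time and memory: 26 + 26**2 + 26**3 entries.
table = build_lookup_table(string.ascii_lowercase, 3)

# Afterwards, each hash is reversed with a single dictionary lookup.
stolen_hash = hashlib.sha256(b"dog").hexdigest()
print(len(table), "entries;", "recovered:", table.get(stolen_hash))
```

Adding uppercase letters, digits, and special characters, or allowing longer inputs, multiplies the number of entries, which is why tables for realistic character sets grow so large.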


Keswani, A., & Khadilkar, V. (n.d.). The SHA-1 algorithm. Lamar University Computer Science Department, Beaumont, TX. cs.lamar.edu/faculty/osborne/5340_01/summer_06/.../SHA/Project_Paper.doc