A Cryptographic Hash Function (CHF) is an algorithm that converts variable-sized data into a fixed-sized output. While hash functions have only been around since the late 1970s, cryptography itself is as old as Julius Caesar (i.e., the first century B.C.). What started as a simple, yet revolutionary, idea of hiding military messages from enemies has since developed into complex and intricate tools such as the CHF. As a result, it is necessary to understand the basic idea of cryptography and other associated terms, such as encryption and hashing, to fully understand a CHF.
What is Cryptography?
Cryptography can be defined as the study of the ideas, methods, techniques, and strategies used to encode a message, information, or data to hide its original and intended meaning. In more primitive times, kingdoms needed secure communication channels, as they were paramount for survival, especially during wars. Consequently, kings, military commanders, and the like would encrypt sensitive information to keep their enemies from obtaining any private and/or essential information. Those messages were encrypted in such a way that they could not be decoded without a cipher key. This simple type of encryption technique is an example of symmetric key encryption.
Symmetric Key Encryption utilizes a single key that is generally referred to as a cipher key for both encoding and decoding a message. The following message is an example of symmetric-key encryption:
Zpv gpdvt po uif usjwjbm boe mptf tjhiu pg xibu jt nptu jnqpsubou.
Obviously, the above text does not make sense at first glance. However, after applying the cipher key, the actual message is revealed. In this instance, the message is uncovered by shifting every character to the previous character in the alphabet (i.e., replace b with a, c with b, d with c, etc.). As a result, the original message is as follows:
You focus on the trivial and lose sight of what is most important.
This is the most basic example of symmetric key encryption where characters are only shifted one position. Thus, to encrypt the message, every character is shifted one position forward; however, to decrypt the message, every character is shifted one position backward.
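The shift cipher described above can be sketched in a few lines of Python. This is only an illustration of the idea; the function name `shift_text` and the constant `KEY` are hypothetical names chosen for this example, and a cipher this simple offers no real security.

```python
import string


def shift_text(text: str, key: int) -> str:
    """Shift every ASCII letter by `key` positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        else:
            result.append(ch)  # spaces and punctuation are left untouched
    return "".join(result)


KEY = 1  # the shared cipher key: shift by one position

plaintext = "You focus on the trivial and lose sight of what is most important."
ciphertext = shift_text(plaintext, KEY)    # encrypt: shift forward
recovered = shift_text(ciphertext, -KEY)   # decrypt: shift backward with the same key

print(ciphertext)
print(recovered)
```

The same key is used in both directions, which is what makes the scheme symmetric: anyone who learns the key can both read and forge messages.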
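Nevertheless, as code-breaking techniques evolved, simple symmetric schemes like the one above became trivial to break, and even far more elaborate ones proved vulnerable. A good example is the Enigma device used by the German military in World War II, a symmetric cipher whose messages were ultimately read by Allied codebreakers. Symmetric encryption also requires both parties to share the same secret key in advance, which is hard to do safely over an insecure channel. Therefore, those seeking to safely exchange messages required an additional, more sophisticated form of encryption.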
Unlike Symmetric Encryption, Asymmetric Encryption does not rely on a single key. Rather, it uses a pair of mathematically related keys: a public key for encrypting the message and a private key for decrypting it. This concept is loosely similar to one popularized by Hollywood. Many movies depict a bank lockbox or locker secured by multiple keys, one held by the owner and one held by the bank (or some other individual). Consequently, uncovering one key by itself does not jeopardize the security of the locker’s contents. The analogy is imperfect, but the core idea carries over: the public key can be shared freely, because knowing it does not allow anyone to decrypt a message; only the holder of the private key can do that. This removes the need to exchange a secret key in advance. Consequently, many modern technologies, such as Blockchain, rely on asymmetric key cryptography to ensure the security and authenticity of communications and/or transactions.
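To make the two-key idea concrete, here is a toy sketch of textbook RSA, one well-known asymmetric scheme (the article does not name a specific algorithm, so RSA is an assumption made for illustration). The primes are deliberately tiny and there is no padding, so this must never be used for real security; the names `public_key`, `private_key`, `encrypt`, and `decrypt` are hypothetical.

```python
# Toy textbook RSA with tiny primes: for illustration only.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)    # Euler's totient of n
e = 17                     # public exponent, chosen coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)        # can be shared with anyone
private_key = (d, n)       # must be kept secret


def encrypt(message: int, key) -> int:
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key) -> int:
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65                                   # a message encoded as a number < n
ciphertext = encrypt(message, public_key)      # anyone can encrypt
recovered = decrypt(ciphertext, private_key)   # only the key holder can decrypt

print(ciphertext, recovered)
```

Publishing `public_key` lets anyone encrypt, while decryption requires `private_key`. With realistic key sizes, deriving the private key from the public one is believed to be computationally infeasible.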
What are Cryptographic Hash Functions?
Cryptographic hash functions build on the foregoing background on cryptography and encryption. Although hashing is similar to encryption in some respects, the two are fundamentally different. Encryption is a two-way function: anyone holding the right key can decrypt the ciphertext. Hashing is a one-way function (i.e., once data has been hashed, the original input cannot be recovered from the digest).
Moreover, a hash function can be defined as a function that is used to map data of an arbitrary size to generate an output of a fixed size (usually called the Hash Digest). However, if this hash function satisfies certain well-established standards of security, integrity, and other conventions of similar scope, it can be called a Cryptographic Hash Function.
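As a small illustration of the "arbitrary size in, fixed size out" behavior, the sketch below hashes inputs of very different lengths with SHA-256 from Python's standard `hashlib` module. SHA-256 is chosen here only as a convenient example, and the variable names are hypothetical.

```python
import hashlib

# Inputs ranging from empty to one million bytes.
inputs = [b"", b"a", b"hello world", b"x" * 1_000_000]

digests = []
for data in inputs:
    digest = hashlib.sha256(data).hexdigest()
    digests.append(digest)
    # Every digest is 64 hex characters (256 bits), whatever the input size.
    print(f"{len(data):>9} bytes -> {len(digest)} hex chars: {digest[:16]}...")
```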
At a high level, the following characteristics are some of the features that distinguish a CHF from a simple hash function:
- One-way: Unlike encryption, Cryptographic Hash Functions are one-way. Once data has been hashed, the digest cannot be "decrypted" back into the input, even if an individual has the exact hashing algorithm that was used. Simply put, CHFs are designed to be irreversible.
- Collision-resistant: It is computationally infeasible to find two different inputs that produce the same output digest. Because inputs can be arbitrarily large while digests have a fixed size, collisions must exist in theory; a secure CHF makes them practically impossible to find. Relatedly, it does not matter if two inputs are differentiated by one character or thousands of characters; the output digests will look entirely different. This feature of a CHF, where even a small change results in an entirely different hash, is termed the Avalanche Effect.
- Pre-image-resistant: Even if an individual has the hashed digest, it is computationally infeasible to find an input that produces that hash value. For example, an individual who knows only a digest cannot work backward to the input; the only general approach is to guess candidate inputs and hash each one.
- Second-pre-image-resistant: Even if an individual has an input, the hash algorithm, and the resulting digest, it is computationally infeasible to find a different input that produces exactly the same digest.
- Deterministic: CHFs are deterministic in the sense that one will always obtain exactly the same output for the same, single input.
- Fast: CHFs are notably fast and efficient because they largely rely on bitwise operations that are quickly computable.
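Two of the characteristics listed above, determinism and the Avalanche Effect, are easy to observe directly. The following is a minimal sketch using SHA-256 from the standard library; the helper names `sha256_int` and `differing_bits` are hypothetical and exist only for this example.

```python
import hashlib


def sha256_int(data: bytes) -> int:
    """Return the SHA-256 digest of `data` as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 256 digest bits differ between two inputs."""
    return bin(sha256_int(a) ^ sha256_int(b)).count("1")


# Deterministic: hashing the same input twice gives the same digest.
same = hashlib.sha256(b"hello").hexdigest() == hashlib.sha256(b"hello").hexdigest()

# Avalanche Effect: changing one character flips roughly half of the bits.
changed = differing_bits(b"hello", b"hellp")

print("same input, same digest:", same)
print("bits changed out of 256:", changed)
```

For a well-designed hash function, the number of changed bits should hover around 128 for almost any pair of distinct inputs.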
The following are some of the most widely known hash functions:
- MD5: MD5 generates a 128-bit output digest. MD5 was designed by Ronald Rivest in 1991. This algorithm, however, is not suitable for most modern uses, because practical collision attacks against it have been known since 2004.
- SHA-1: The SHA-1 (“Secure Hash Algorithm 1”) CHF generates a 160-bit output digest, usually written as 40 hexadecimal characters, for an input of any length. It was designed by the NSA, published in 1995, and widely used until 2017, when researchers demonstrated a practical collision (i.e., two different files with the same SHA-1 digest). SHA-1 is also prone to length extension attacks (i.e., an attack whereby an attacker who knows the hash and length of a message can compute the hash of that message with attacker-controlled data appended), a weakness it shares with other hash functions of similar construction, such as MD5 and SHA-256. SHA-1 is no longer considered secure for CHF purposes.
- SHA-2: The SHA-2 family has six different hash functions, including: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256. SHA-2 CHF was designed by the NSA and first published in 2001.
- SHA-3: SHA-3 can generate configurable output sizes much like SHA-2. SHA-3 was published by the U.S. National Institute of Standards and Technology on August 5, 2015. SHA-3 is a subset of the broader Keccak family of cryptographic primitives.
- BLAKE2: BLAKE2 can generate configurable output sizes like SHA-3. BLAKE2 was developed by Jean-Philippe Aumasson, Samuel Neves, Zooko Wilcox-O’Hearn, and Christian Winnerlein and published on December 21, 2012. Created to replace MD5 and SHA-1, BLAKE2 is highly efficient and, according to its designers, faster in software on modern processors than MD5, SHA-1, SHA-2, and SHA-3.
- BLAKE3: BLAKE3 is a single algorithm and an improvement of BLAKE2. BLAKE3 was developed by Jack O’Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O’Hearn and published on January 9, 2020.
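Most of the algorithms listed above are available in Python's standard `hashlib` module, so their digest sizes can be compared directly. This sketch assumes a standard Python build; on restricted (for example, FIPS-mode) systems the call for `md5` may be refused. BLAKE3 is not part of the standard library and is omitted here.

```python
import hashlib

# Names as accepted by hashlib.new(); all are in hashlib.algorithms_guaranteed.
ALGORITHMS = [
    "md5", "sha1",
    "sha224", "sha256", "sha384", "sha512",   # SHA-2 family
    "sha3_256", "sha3_512",                   # SHA-3 family
    "blake2b", "blake2s",                     # BLAKE2 variants
]

message = b"The quick brown fox jumps over the lazy dog"

digest_bits = {}
for name in ALGORITHMS:
    h = hashlib.new(name, message)
    digest_bits[name] = h.digest_size * 8
    print(f"{name:<9} {digest_bits[name]:>4} bits  {h.hexdigest()[:24]}...")

# BLAKE2 output size is configurable: request a 16-byte (128-bit) digest.
short_digest = hashlib.blake2b(message, digest_size=16).digest()
print("blake2b with digest_size=16:", len(short_digest) * 8, "bits")
```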
Hash functions may be adjusted to protect against potential attacks:
- Salting: Salting is the process of adding random bits of data (the salt) to each plaintext input message before hashing it. This ensures that two identical messages/passwords create two different outputs, which makes it extremely difficult for attackers to uncover duplicates or to rely on precomputed tables of common hashes.
- Keyed Hash Functions: Keyed hash functions are algorithms that combine a cryptographic hash function with a secret cryptographic key to generate a message authentication code. Only someone who holds the key can produce or verify the code. The most common construction is the hash-based message authentication code, or HMAC.
- Adaptive Hash Functions: Adaptive hash functions are hash functions that are designed to iterate on the original input (i.e., the output generated by the original input is treated as an input and processed by the function again for a given number of times). This type of hash function is adaptive in the sense that the developer may determine how many iterations occur, and can raise that number over time to keep brute-force guessing expensive.
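The three adjustments above can be sketched with Python's standard `hashlib`, `hmac`, and `os` modules. This is an illustration under stated assumptions, not a recommended implementation: the password, key, and messages are made-up values, and `iterated_hash` is a hypothetical helper that mirrors the description of iteration literally. Production systems normally use a vetted scheme such as PBKDF2, bcrypt, scrypt, or Argon2 instead.

```python
import hashlib
import hmac
import os

password = b"correct horse battery staple"  # made-up example input

# Salting: the same password hashed with two random salts gives two digests.
salt_a, salt_b = os.urandom(16), os.urandom(16)
digest_a = hashlib.sha256(salt_a + password).hexdigest()
digest_b = hashlib.sha256(salt_b + password).hexdigest()

# Keyed hash (HMAC): the tag depends on both the message and a secret key.
key = os.urandom(32)  # hypothetical shared secret
message = b"transfer 100 to account 42"
tag = hmac.new(key, message, hashlib.sha256).digest()

# The receiver recomputes the tag and compares in constant time.
is_authentic = hmac.compare_digest(
    tag, hmac.new(key, message, hashlib.sha256).digest()
)
forgery_accepted = hmac.compare_digest(
    tag, hmac.new(key, b"transfer 900 to account 42", hashlib.sha256).digest()
)


# Adaptive hashing: feed the output back in for a configurable number of rounds.
def iterated_hash(data: bytes, salt: bytes, rounds: int) -> bytes:
    digest = hashlib.sha256(salt + data).digest()
    for _ in range(rounds - 1):
        digest = hashlib.sha256(digest).digest()
    return digest


slow_digest = iterated_hash(password, salt_a, rounds=10_000)

print(digest_a != digest_b, is_authentic, forgery_accepted, slow_digest.hex()[:16])
```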
Hash functions are widely used to ensure the integrity of messages and documents that are shared over the internet. A sender generates a hash digest for a document and sends it to the receiver along with the document. The receiver can then generate the hash value of that document using the same hashing algorithm and check whether it matches the digest received. If both hash values are the same, and the digest itself was obtained from a trusted source, then the document has not been tampered with.
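The sender/receiver check described above might look like the following sketch. It assumes SHA-256 as the agreed algorithm, and the function names `sha256_hex` and `is_unmodified` as well as the sample document are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(document: bytes) -> str:
    """Digest that the sender publishes alongside the document."""
    return hashlib.sha256(document).hexdigest()


def is_unmodified(document: bytes, expected_digest: str) -> bool:
    """Receiver side: recompute the digest and compare it to the published one."""
    return hmac.compare_digest(sha256_hex(document), expected_digest)


# Sender
document = b"Agreement: party A pays party B 100 units."
published_digest = sha256_hex(document)

# Receiver
print(is_unmodified(document, published_digest))   # intact copy
tampered = document.replace(b"100", b"900")
print(is_unmodified(tampered, published_digest))   # altered copy
```

A plain digest only detects changes if the attacker cannot also replace the digest; when both travel over the same untrusted channel, a keyed hash or a digital signature is the usual remedy.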
Similarly, most websites that follow modern security practices do not directly store user passwords or other sensitive user information in their databases. Instead, websites store the equivalent hash values. Therefore, even if an attacker hacks or compromises the website’s database, the attacker obtains only digests rather than the passwords themselves, and recovering the originals is difficult, particularly when the hashes are salted and computed with an adaptive hash function.
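A minimal sketch of this storage pattern, using PBKDF2 from Python's standard library, is shown below. PBKDF2 is an assumption chosen because it ships with `hashlib`; the article does not prescribe a specific scheme. The iteration count, the in-memory `user_db` dictionary, and the function names are all hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor; tune for real deployments


def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest) for storage; a fresh random salt is used by default."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


# Hypothetical in-memory "users table": only the salt and digest are stored.
user_db = {}
user_db["alice"] = hash_password("s3cret-passphrase")

salt, stored = user_db["alice"]
print(verify_password("s3cret-passphrase", salt, stored))  # correct password
print(verify_password("wrong-guess", salt, stored))        # incorrect password
```

Because the salt is random per user, two users with the same password end up with different stored digests.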