SHA-256

Cryptography is the backbone of blockchain technology; without it, it would be impossible to link each block of data together to form a secure, immutable chain. One of the key cryptographic tools that blockchains use to this end is the cryptographic hash function, and SHA-256 is perhaps the best-known algorithm for generating these hashes.

The Case for Cryptography

Imagine you had to design a digital system in which users could send digital tokens to each other in exchange for goods. Each user receives an account and is free to send and receive tokens to and from whomever they wish. But there’s a problem: How do we ensure that each user is who they say they are? In a centralized system, this is easy; a server can store each user’s account and require access credentials (e.g., a username and password). But in a decentralized system, such as a blockchain, no such server exists. Transactions in decentralized networks cannot be authenticated by simply storing user credentials.

At first glance, there is a seemingly simple way to solve this problem: have the system assign each person a unique ID code that they include in each transaction to prove their identity. However, this solution falls apart in a scenario known as a man-in-the-middle (MITM) attack:

Imagine two users, Alice and Bob. Alice wants to send Bob 1 Bitcoin for a painting and asks Bob for the address to which she should send payment. In a perfect world, Bob would send a message to Alice containing his address, along with his ID code to prove his identity. Now, imagine a third user, Charlie; Charlie sits between Alice and Bob, listening to their communication. When Bob attempts to send Alice his address, Charlie intercepts the message and replaces Bob’s address with his own while keeping Bob’s ID code. Now Alice receives the message containing Charlie’s address, but believes it is from Bob. She then unknowingly sends 1 Bitcoin to Charlie instead of Bob. Since cryptocurrency transactions are final and cannot be reversed, Alice will be unable to get her stolen funds back.

Cryptographic hash functions, such as SHA-256, help solve this problem by making a “digital signature” possible. By design, cryptographic hash functions only work in one direction; that is, you can generate an output, or hash, from some data, but you cannot feasibly retrieve the original data from the hash. In a blockchain, every user holds a unique private key and a matching public key. To sign a message, the user hashes it with a function like SHA-256 and then combines that hash with the private key using a signature algorithm; anyone can check the result against the public key, but it is virtually impossible to “forge” a signature without the private key. So, the only way to generate your digital signature is by having access to your private key. Accordingly, protecting your private key from theft and exposure is extremely important.
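As a rough illustration of the hashing step described above, the following sketch uses Python’s standard hashlib module to compute SHA-256 hashes. The helper name sha256_hex is made up for this example. It shows that inputs of very different sizes produce outputs of the same fixed length, which is one reason the original data cannot simply be read back out of the hash.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of `data` as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    # A 3-byte input and a 30,000-byte input both yield 64 hex characters
    # (256 bits), so the hash cannot contain all of the original data.
    short_hash = sha256_hex(b"abc")
    long_hash = sha256_hex(b"abc" * 10_000)
    print(short_hash, len(short_hash))
    print(long_hash, len(long_hash))
```

Note that this sketch only hashes data; it does not perform the signing step, which requires a separate signature algorithm and a key pair.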

Additionally, cryptographic hashes are deterministic, meaning that identical data always produces identical hashes. A useful cryptographic hash function like SHA-256 also makes collisions, in which different pieces of data produce identical hashes, practically impossible to find. If Alice and Bob in the above scenario used SHA-256 hashes, Charlie could, in theory, still alter Bob’s message to Alice. However, the key difference is that Alice would know that the message was altered, because the hash of Charlie’s altered message would differ from the hash of Bob’s original message. By comparing the hash of the message she received with the hash she expected, Alice can determine whether the message was tampered with in transit, provided the expected hash itself cannot be altered by Charlie (for example, because Bob signed it).
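The comparison Alice performs can be sketched in a few lines of Python. This is a minimal example, not a real payment protocol: the function names and the address strings are hypothetical, and it assumes the expected hash reaches Alice in a way Charlie cannot alter.

```python
import hashlib


def message_hash(message: str) -> str:
    """Hash a text message with SHA-256 and return the hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def is_untampered(received_message: str, expected_hash: str) -> bool:
    """Return True if the received message hashes to the expected value."""
    return message_hash(received_message) == expected_hash


if __name__ == "__main__":
    # Hypothetical messages; real addresses look nothing like these.
    bobs_message = "Pay to: bob-address-123"
    charlies_message = "Pay to: charlie-address-456"

    expected = message_hash(bobs_message)
    print(is_untampered(bobs_message, expected))      # same data, same hash
    print(is_untampered(charlies_message, expected))  # altered data, different hash
```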

Building a Blockchain

By design, a blockchain is immutable. This means that the data contained within the blockchain cannot be changed without massively disrupting the decentralized peer-to-peer network it is built upon. This property follows from the way blocks are linked together.

Firstly, how is the order of a blockchain determined? After all, a blockchain is much more complex than a simple list. The ingenious way that the blockchain creators chose to link blocks together is by storing the hash of the previous block in the current block. So, in a way, all of the blocks in a blockchain are linked backward.

This method of linking blocks is how the data in a blockchain is secured. For example, if the data in Block A were altered, this would, in turn, alter the hash of that block. Consequently, Block B (which stored the old hash of Block A), and all blocks that follow, would become unlinked from the rest of the blockchain because there is no block storing Block A’s new hash. The network would reject this change unless the party making it controlled a majority of the network’s hash power and could rebuild the altered block and every block after it faster than everyone else. Such an attempt is known as a 51% attack and is extremely difficult to achieve. As a blockchain network grows, it becomes even more difficult due to the amount of hash power needed to successfully execute a 51% attack.
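The backward linking and the effect of tampering can be sketched with a toy chain in Python. This is a simplified model under stated assumptions: blocks are plain dictionaries serialized with JSON, and the field names (index, data, prev_hash) and function names are invented for this example rather than taken from any real blockchain.

```python
import hashlib
import json
from typing import List, Optional


def block_hash(block: dict) -> str:
    """Hash a block's full contents, including the previous block's hash."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(payloads: List[str]) -> List[dict]:
    """Create blocks in order, each storing the hash of the one before it."""
    chain = []
    prev_hash = "0" * 64  # placeholder value for the first block
    for index, data in enumerate(payloads):
        block = {"index": index, "data": data, "prev_hash": prev_hash}
        chain.append(block)
        prev_hash = block_hash(block)
    return chain


def first_broken_link(chain: List[dict]) -> Optional[int]:
    """Return the index of the first block whose stored hash no longer
    matches its predecessor, or None if every link is intact."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


if __name__ == "__main__":
    chain = build_chain(["Block A data", "Block B data", "Block C data"])
    print(first_broken_link(chain))      # all links intact
    chain[0]["data"] = "altered data"    # tamper with Block A
    print(first_broken_link(chain))      # Block B no longer points at Block A
```

Altering the first block changes its hash, so the check fails at the second block; repairing the chain would mean recomputing every later block as well.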

Mining

The other primary use of SHA-256 and other hash functions in blockchains is for cryptocurrency mining. Mining is the process by which new transactions are added to the blockchain and, in Bitcoin in particular, how new coins are created. Mining works by dedicating computer processing power to verify that new transactions are legitimate. A legitimate transaction is one in which the Bitcoin being sent has not already been spent by the same user; spending the same Bitcoin more than once is known as “double-spending.” Ensuring that an individual is not “double-spending” requires checking the transaction against the blockchain’s record of earlier transactions. To add a block of verified transactions, many computers repeatedly hash the block’s data together with a changing value, known as the nonce, in an attempt to find a hash that meets the network’s requirement; whoever finds such a hash first gets rewarded with Bitcoin. This process is known as “Proof-of-Work.” In Bitcoin, a valid hash is one that is less than or equal to a target value set by the network.
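The guessing process can be illustrated with a toy Proof-of-Work loop in Python. This is a simplified sketch, not Bitcoin’s actual procedure: Bitcoin applies SHA-256 twice to a fixed-format block header, whereas this example hashes arbitrary bytes once. The function name mine, the 8-byte nonce encoding, and the max_nonce limit are assumptions made for illustration.

```python
import hashlib
from typing import Optional, Tuple


def mine(block_data: bytes, target: int,
         max_nonce: int = 2_000_000) -> Optional[Tuple[int, str]]:
    """Try nonces in order until the hash, read as a 256-bit integer,
    is less than or equal to `target`. Returns (nonce, hex hash),
    or None if no nonce below max_nonce works."""
    for nonce in range(max_nonce):
        candidate = block_data + nonce.to_bytes(8, "big")
        digest = hashlib.sha256(candidate).digest()
        if int.from_bytes(digest, "big") <= target:
            return nonce, digest.hex()
    return None


if __name__ == "__main__":
    # A lower target is harder to meet: 1 << 240 leaves roughly a
    # 1-in-65,536 chance per guess, so expect tens of thousands of tries.
    result = mine(b"example block data", target=1 << 240)
    if result is not None:
        nonce, found_hash = result
        print(nonce, found_hash)
```

Finding a valid nonce takes many guesses, but checking one takes a single hash, which is what lets the rest of the network verify a miner’s work cheaply.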

Due to the nature of the SHA-256 algorithm, the number of hashes generated per second, or hash power, is directly tied to processing power. In the early days of Bitcoin, anybody could mine using only their CPU. But as time went on, miners started using more specialized hardware to gain a leg up on their competition. This led to GPU, FPGA, and finally, ASIC mining. Now, Bitcoin mining is reserved for those with access to cheap electricity and plenty of capital to invest in expensive computer hardware. This drawback of SHA-256 has led other blockchain systems to adopt different hash functions that are more resistant to the centralization of mining power.