Cryptography is the backbone of blockchain technology; without it, there would be no way to link blocks of data together into a secure, immutable chain. One of the key cryptographic concepts that makes a blockchain possible is the cryptographic hash, and SHA-256 is perhaps the best-known algorithm for generating such hashes.
The Case for Cryptography
Imagine you had to design a digital system in which users can send tokens to each other in exchange for goods. Each user receives an account and is free to send tokens to, and receive tokens from, other users. One notable problem arises: how do you ensure that each user is who they say they are? In a centralized system, this is relatively easy; a server can store each user’s account and restrict access to it unless an individual presents credentials (e.g., a username and password). But in a decentralized system like a blockchain, no such central server exists.
At first glance, a trivial method to solve this is to assign each person a unique ID code that they can include in each transaction to prove their identity. However, the problem with this solution lies in a scenario known as a MITM (man-in-the-middle) attack.
Imagine two users, Alice and Bob. Alice wants to send Bob 1 Bitcoin for a painting and asks Bob for his address to send payment to. In a perfect world, Bob would send a message to Alice containing his address, including his ID code, to prove his identity. Now, imagine a third user, Charlie; Charlie sits between Alice and Bob, listening to their communication. When Bob attempts to send Alice his address, Charlie intercepts the message and changes Bob’s address to his own address but keeps Bob’s ID code. Now Alice receives the message containing Charlie’s address, but believes it is from Bob. She then unknowingly sends 1 Bitcoin to Charlie instead of Bob and is unable to get it back.
Cryptographic hash functions like SHA-256 help solve this problem by underpinning the “digital signature.” By design, cryptographic hash functions only work in one direction; that is, you can generate an output, or hash, from some data, but you cannot feasibly recover the original data from the hash. In a blockchain, every user holds a unique private key, and a digital signature is produced by applying a signing algorithm to that private key and a hash of the message (Bitcoin, for example, uses ECDSA signatures over SHA-256-based hashes). Anyone can check the signature against the matching public key, but the only way to generate it is to have access to the corresponding private key, so it is virtually impossible to “forge” a digital signature.
Additionally, cryptographic hashes are deterministic. This means that identical data will always create identical hashes. A useful cryptographic hash function like SHA-256 also makes collisions, in which different data produce an identical hash, practically impossible to find. If Alice and Bob in the above scenario had used SHA-256, Charlie could, in theory, still alter Bob’s message to Alice. The key difference is that Alice would know that the message was altered, because the hash of the altered message would differ from the hash of Bob’s original. Alice would compare the hash of the message she received with the hash she expected in order to determine whether the message had been tampered with in transit.
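The determinism and tamper detection described above can be illustrated with a minimal sketch using Python's standard hashlib module. The messages, the address strings, and the helper names sha256_hex and is_untampered are hypothetical, chosen only for illustration.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a UTF-8 message as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def is_untampered(received_message: str, expected_hash: str) -> bool:
    """Check a received message against the hash the recipient expected."""
    return sha256_hex(received_message) == expected_hash


# Hypothetical messages; the addresses are placeholders, not real addresses.
original = "Send payment to address: bob-address-123"
tampered = "Send payment to address: charlie-address-456"

expected = sha256_hex(original)

# Deterministic: hashing the same data again yields the same digest.
print(sha256_hex(original) == expected)

# The altered message hashes to a different digest, so the check fails.
print(is_untampered(original, expected))
print(is_untampered(tampered, expected))
```

Note that this check only helps if Alice learns the expected hash through a channel Charlie cannot also modify; in practice that guarantee comes from the digital signature rather than from the bare hash.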
Building a Blockchain
By design, a blockchain is immutable. This means that the data contained within the blockchain is resistant to change, which guards against disruption of the decentralized peer-to-peer network it is constructed upon. This is due to the manner in which blocks are linked in a blockchain.
How is the order of a blockchain determined? After all, a blockchain is more complex than a simple list. Blocks are linked together by storing the hash of the previous block in the current block. As a result, every block in a blockchain is mathematically linked back to the block before it.
This method of linking blocks is how the data in blockchains is secured. For example, if the data in Block A were altered, this would, in turn, alter the hash of that block. Consequently, Block B (which stores the old hash of Block A), and all blocks that follow, would become unlinked from the rest of the blockchain because no block stores Block A’s new hash. This change to the blockchain would, in theory, be rejected by the network unless a majority of participants (or whatever the governing consensus protocol requires) accepted it. Where consensus rests on a majority of hash power, an attacker who gains that majority can force such a change; this is known as a 51% attack. As a blockchain network grows, such an attack generally becomes more difficult because of the amount of hash power needed to execute it.
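The linking scheme described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular blockchain serializes its blocks: the Block class, its fields, the JSON encoding, and the all-zero placeholder hash for the first block are assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    index: int
    data: str
    prev_hash: str  # hash of the previous block, stored in this block

    def hash(self) -> str:
        """Hash the block's contents, including the stored previous hash."""
        payload = json.dumps(
            {"index": self.index, "data": self.data, "prev_hash": self.prev_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(items: List[str]) -> List[Block]:
    """Create blocks in order, each storing the hash of its predecessor."""
    chain: List[Block] = []
    prev_hash = "0" * 64  # assumed placeholder for the first block
    for index, data in enumerate(items):
        block = Block(index, data, prev_hash)
        chain.append(block)
        prev_hash = block.hash()
    return chain


def first_broken_link(chain: List[Block]) -> Optional[int]:
    """Return the index of the first block whose stored prev_hash no longer
    matches the recomputed hash of the block before it, or None if intact."""
    for i in range(1, len(chain)):
        if chain[i].prev_hash != chain[i - 1].hash():
            return i
    return None


chain = build_chain(["Block A data", "Block B data", "Block C data"])
print(first_broken_link(chain))  # intact chain: no broken link

chain[0].data = "Block A data (altered)"  # tamper with Block A
print(first_broken_link(chain))  # Block B (index 1) no longer links back
```

Because each block's hash covers its stored prev_hash, repairing the link at Block B would change Block B's own hash and break the link at Block C, and so on down the chain.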
The other primary use of SHA-256 and other hash functions in blockchains is for mining. Mining is the process by which new transactions are validated and added to the blockchain and, in the case of Bitcoin, how new Bitcoins are created. Mining works by dedicating computer processing power to verifying that new transactions are legitimate. A legitimate transaction is one in which the Bitcoin being sent has not already been spent by the same user; spending the same Bitcoin twice is known as “double-spending.” Checking for it requires examining the blockchain’s transaction history for a conflicting earlier spend. Blocks are then added through a competition: many computers repeatedly hash the candidate block’s data together with a changing value, known as the nonce, in an attempt to find a nonce that yields an acceptable hash; whoever finds one first is rewarded with Bitcoin. This is known as Proof-of-Work. With respect to Bitcoin, an acceptable hash is one that is less than or equal to a target value set by the network.
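The nonce search can be sketched as a brute-force loop. This is a toy illustration under stated assumptions: the block data is a plain string, the difficulty is expressed as a number of leading zero bits, and a single SHA-256 pass is used, whereas Bitcoin hashes a binary block header with SHA-256 applied twice. The names pow_hash, target_for, and mine are hypothetical.

```python
import hashlib


def pow_hash(block_data: str, nonce: int) -> int:
    """Hash the block data together with a nonce; return the digest as an integer."""
    digest = hashlib.sha256(f"{block_data}|{nonce}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def target_for(difficulty_bits: int) -> int:
    """Largest acceptable hash value: the top `difficulty_bits` bits must be zero."""
    return (1 << (256 - difficulty_bits)) - 1


def mine(block_data: str, difficulty_bits: int, max_nonce: int = 10_000_000) -> int:
    """Try nonces in order until the hash is less than or equal to the target."""
    target = target_for(difficulty_bits)
    for nonce in range(max_nonce):
        if pow_hash(block_data, nonce) <= target:
            return nonce
    raise RuntimeError("no valid nonce found within max_nonce attempts")


# A low difficulty keeps the demo fast: about 2**12 attempts on average.
nonce = mine("example block data", difficulty_bits=12)
print(nonce, format(pow_hash("example block data", nonce), "064x"))
```

Finding the nonce takes many attempts, but anyone can verify it with a single hash, which is what lets the rest of the network cheaply check a miner's work. Each additional difficulty bit roughly doubles the expected number of attempts.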