Blockchain is, without a doubt, one of the most frequently used buzzwords of late. The hype is huge, powered by the crazy world of cryptocurrencies. That craziness is probably the reason why a lot of software developers treat blockchain as something abstract and uninteresting. Let’s look at blockchain through the eyes of a software developer.
A lot of myths have arisen, and many developers have a distorted image of blockchain. I won’t go over each myth; those can be found easily on the web. I’ll just point out the most important one: blockchain is not Bitcoin or any other cryptocurrency.
Blockchain and Bitcoin are commonly confused. Blockchain is the underlying technology of Bitcoin, but it is neither Bitcoin nor a cryptocurrency. Cryptocurrencies are just features built on top of a blockchain. Because of that (and for a few other reasons), blockchain is often referred to as a distributed ledger to differentiate the technology from Bitcoin.
So what is a blockchain? Through the eyes of a software developer it’s a data structure, or a database, or both. Put simply, it’s a set of records linked to each other and stored in a distributed repository. It’s hard to find a definition that satisfies everybody, as there are many variations of blockchain and new ones are constantly being invented. To understand what blockchain is, let’s take a closer look from a software developer’s perspective.
Blockchain as a data structure
As the name suggests, a blockchain is a chain of blocks. So what is a block, then? A block is a chunk of data with accompanying meta information describing it. It can store pretty much anything, from financial operations to binary code that can be executed. The meta information validates the data stored in the block, so we can be assured the data was not altered. Let’s write the simplest possible script that generates a block.
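The original script is not reproduced here. A minimal Ruby sketch of such a block might look like the one below, assuming SHA-256 from Ruby's standard `digest` library as the hash function. The method name `build_block` and the field names are hypothetical, chosen only for illustration.

```ruby
require 'digest'
require 'time'

# Builds a block: a payload plus meta information describing it.
# The field names (:timestamp, :data, :hash) are assumptions made for this sketch.
def build_block(data, timestamp = Time.now.utc.iso8601)
  {
    timestamp: timestamp,
    data: data,
    # The digest covers the timestamp and the data, so changing either changes the hash.
    hash: Digest::SHA256.hexdigest("#{timestamp}|#{data}")
  }
end

block = build_block('Hello, blockchain!')
puts block
```

Anyone holding the block can recompute the digest from the timestamp and the data and compare it with the stored hash; a mismatch means the contents were altered.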
We’ve got our data and a digest ensuring its integrity. As data you can put literally anything, from plain text or JSON to binary data like images, audio or video files, to executable code. Anything you want.
So far there’s nothing really exciting about the blockchain. Let’s proceed to its next element, the chain.
In a blockchain each block is linked to the previous block. The only exception is the first block, a special block called the genesis block. Let’s modify our script so that the hash of the previous block can be passed as a parameter.
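A hedged sketch of that modification is shown below. It assumes the previous block's hash is simply mixed into the hashed input, and it uses a hypothetical all-zero placeholder (`GENESIS_PREVIOUS_HASH`) for the genesis block, which has no predecessor.

```ruby
require 'digest'

# Hypothetical placeholder used as the "previous hash" of the genesis block.
GENESIS_PREVIOUS_HASH = '0' * 64

# Builds a block whose hash depends on the hash of the previous block.
def build_block(data, previous_hash = GENESIS_PREVIOUS_HASH)
  {
    previous_hash: previous_hash,
    data: data,
    # The previous hash is part of the hashed input, which is what links the blocks.
    hash: Digest::SHA256.hexdigest("#{previous_hash}|#{data}")
  }
end

genesis = build_block('genesis')
second = build_block('second block', genesis[:hash])
puts genesis
puts second
```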
We’ve created a genesis block and then added a new block to the chain, linked to the genesis block. But what exactly have we gained by that?
The point of linking is to make altering the chain harder. If more blocks are added on top of a block I’ve added, then in order to change my block an attacker must recalculate the digests for my block and for all blocks added on top of it, since my block’s hash is an input to the calculation of every later block. If the attacker doesn’t recalculate all the necessary blocks, it can be quickly detected that the chain was tampered with.
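To illustrate the detection step, here is a small sketch under the same assumptions as before (SHA-256, blocks as plain hashes). The helper names `block_hash`, `build_block` and `valid_chain?` are hypothetical. It shows that changing one block and recomputing only that block's hash breaks the link to the block after it.

```ruby
require 'digest'

def block_hash(previous_hash, data)
  Digest::SHA256.hexdigest("#{previous_hash}|#{data}")
end

def build_block(data, previous_hash)
  { previous_hash: previous_hash, data: data, hash: block_hash(previous_hash, data) }
end

# Returns true when every block's stored hash matches its contents
# and every block points at the hash of the block before it.
def valid_chain?(chain)
  chain.each_with_index.all? do |block, index|
    hash_ok = block[:hash] == block_hash(block[:previous_hash], block[:data])
    link_ok = index.zero? || block[:previous_hash] == chain[index - 1][:hash]
    hash_ok && link_ok
  end
end

chain = [build_block('genesis', '0' * 64)]
3.times { |i| chain << build_block("block #{i + 1}", chain.last[:hash]) }
puts valid_chain?(chain) # => true

# Tamper with a block in the middle and recompute only its own hash.
chain[1][:data] = 'forged'
chain[1][:hash] = block_hash(chain[1][:previous_hash], 'forged')
puts valid_chain?(chain) # => false, the next block still points at the old hash
```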
So if 1000 blocks were added on top of my block, how hard would it be for an attacker to change my block? Well, let’s add another block and see:
I’ve added a new block on top of my previous block. It took 0.235s to calculate it. Hold on a minute. Recalculating 1000 blocks would take 235 seconds! With a simple Ruby script! I’m sure a motivated hacker can do way better than that. Where is that legendary blockchain security everybody is talking about?
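The exact timing depends on the machine and on how it is measured. As a rough check, the sketch below times the recomputation of 1001 linked SHA-256 hashes in-process with Ruby's standard `Benchmark` module. The helper names `block_hash` and `rehash_from` and the block layout are hypothetical. On typical hardware this should finish in a small fraction of a second, which only reinforces the point: plain hashing is no obstacle to an attacker.

```ruby
require 'digest'
require 'benchmark'

def block_hash(previous_hash, data)
  Digest::SHA256.hexdigest("#{previous_hash}|#{data}")
end

# Recomputes the links and hashes of every block from start_index onward,
# which is exactly the work an attacker has to do after changing a block.
def rehash_from(chain, start_index)
  previous_hash = start_index.zero? ? '0' * 64 : chain[start_index - 1][:hash]
  chain.drop(start_index).each do |block|
    block[:previous_hash] = previous_hash
    block[:hash] = block_hash(previous_hash, block[:data])
    previous_hash = block[:hash]
  end
  chain
end

# One block of ours plus 1000 blocks on top of it.
chain = Array.new(1001) { |i| { data: "block #{i}" } }
rehash_from(chain, 0)

chain[0][:data] = 'forged'
seconds = Benchmark.realtime { rehash_from(chain, 0) }
puts format('Rehashed %d blocks in %.4f s', chain.size, seconds)
```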
Proof of work
OK, so we have a problem. We link each block with the previous one, but we aren’t getting anything out of that. Calculating a hash is so fast that any computer can easily rewrite the whole chain. How do we make it more secure, then?
The problem is that we’re ensuring the integrity of our chain with a simple hash that can be computed very fast. Fast computation is necessary, as we need to be able to verify integrity in an instant. We can’t afford to perform costly computations in order to verify data; that would be counterproductive. If only we could make it hard to generate a new block but effortless to verify its correctness.
Something like that already exists and is called Hashcash. It is a proof-of-work algorithm developed to limit email spam and denial-of-service attacks. The idea is very simple: each email has to include a special header whose hash starts with a certain number of zeros. Since you can’t predict the output of a hash function, you need to keep modifying the header until its hash matches the requirement. Such a hash is hard to generate but easy to verify, and it basically proves that computing power was spent generating it. The same idea can be applied to generating a block.
Let’s modify our script so that it generates hashes starting with a certain number of zeros.
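A minimal sketch of such a script follows. It assumes SHA-256, counts zeros as leading hexadecimal digits of the hash, and tries nonces sequentially starting from 0 rather than picking them at random. The names `mine_block` and `valid_block?` are hypothetical.

```ruby
require 'digest'

# Tries successive nonces until the hash starts with `difficulty` zeros (hex digits).
def mine_block(data, previous_hash, difficulty)
  prefix = '0' * difficulty
  nonce = 0
  loop do
    hash = Digest::SHA256.hexdigest("#{previous_hash}|#{data}|#{nonce}")
    if hash.start_with?(prefix)
      return { previous_hash: previous_hash, data: data, nonce: nonce, hash: hash }
    end
    nonce += 1
  end
end

# Verification is a single hash computation, however long the mining took.
def valid_block?(block, difficulty)
  expected = Digest::SHA256.hexdigest("#{block[:previous_hash]}|#{block[:data]}|#{block[:nonce]}")
  block[:hash] == expected && expected.start_with?('0' * difficulty)
end

block = mine_block('my data', '0' * 64, 4)
puts block
puts valid_block?(block, 4)
```

The asymmetry is the whole point: `mine_block` may need many thousands of attempts, while `valid_block?` always needs exactly one hash.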
In this script we keep regenerating the block as long as its hash doesn’t start with the given number of zeros. We do that by appending a nonce to the hashed data and trying to guess the value for which the hash has the specified number of leading zeros. The more leading zeros we require, the more difficult it is to generate a valid block. Let’s have a look:
A hash with 4 zeros was generated in 0.6 seconds, one with 5 zeros in 11 seconds, and one with 6 in 15. By setting the number of required zeros we can adjust how frequently new blocks are added to the chain. For example, Bitcoin adjusts the difficulty so that a new block is added on average every 10 minutes.
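Single timings like these depend heavily on luck. Assuming the zeros are counted as hexadecimal digits of the hash, each attempt succeeds with probability 1/16 per required zero, so the average number of attempts grows by a factor of 16 with every extra zero. The hypothetical helper below just makes that arithmetic explicit:

```ruby
# Average number of nonces to try before a hash starts with `difficulty`
# zero hex digits, assuming every hash value is equally likely.
def expected_attempts(difficulty)
  16**difficulty
end

(4..6).each do |difficulty|
  puts "#{difficulty} zeros: about #{expected_attempts(difficulty)} attempts on average"
end
# 4 zeros: about 65536 attempts on average
# 5 zeros: about 1048576 attempts on average
# 6 zeros: about 16777216 attempts on average
```

A lucky run can find a 6-zero hash much sooner than the average suggests, which would explain timings that do not scale by a clean factor of 16.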
In the real world a block needs to have a hash starting with way more than 4 or 5 zeros. At the time of writing, a Bitcoin block hash starts with 18 zeros. You need substantial computing power to generate such a block within a reasonable time. That is actually the reason why blockchains are so hard to modify. If a hacker wants to modify a block, he needs to recalculate all the blocks on top of it (and convince other nodes that his version of the chain is the current one). For a single person that would take a lot of time and resources, and new blocks will surely be added to the chain while he is recalculating his version.
This is where the concept of a miner comes in. Basically, miners are volunteers offering their computing power to find a nonce that gives a block the desired hash. There may be different conditions under which a block is valid; a hash starting with zeros is just one (popular) example, but it could be anything. Miners receive a reward for their effort. In the cryptocurrency world it’s usually a fee attached to a transaction, but it can be anything. Miners can be paid with real money as well.
Collectively, the network of miners provides enough computing power that new blocks can be mined in a reasonable time. By adjusting the difficulty it’s possible to control how often new blocks are added to the chain.
Another aspect of blockchain security is distribution. By distribution we mean that there is no single central authority. Copies of the blockchain are distributed across all nodes within the network. That introduces a few problems.
When adding a new block to the chain, nodes need to agree that a particular block is valid and will be added as the next block. This is called the consensus problem, and there are a few strategies for solving it. Usually there is some threshold: if, for example, more than 50% of nodes agree that a block is valid, it is added to the chain and the new version of the chain is distributed across the network.
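As a toy illustration of the threshold idea only (real consensus algorithms are considerably more involved), the sketch below counts the votes of a handful of nodes. The method name `accepted?`, the vote format and the node names are all hypothetical.

```ruby
# votes maps a node name to true (block judged valid) or false.
# The block is accepted only when strictly more than `threshold` of the nodes approve.
def accepted?(votes, threshold = 0.5)
  return false if votes.empty?

  approvals = votes.values.count(true)
  approvals.to_f / votes.size > threshold
end

votes = { 'node-a' => true, 'node-b' => true, 'node-c' => false, 'node-d' => true }
puts accepted?(votes) # => true, 3 of 4 nodes agree
```

Note that an exact 50/50 split is rejected here, since the rule asks for more than half of the nodes.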
Consensus algorithms are a broad topic and I’ll look at them in future posts.
In this post I’ve only scratched the surface of the blockchain topic. The main takeaway should be that, for a software developer, blockchain is like a data structure or a repository. It is not really about cryptocurrencies, as most people think it is. It is a way of storing data or records in a distributed network, with a few clever tricks that make it more secure than ordinary ways of storing data. It can be taken even further, as the stored data may be something executable, allowing software developers to build decentralized, distributed applications.
Understanding the very basics of blockchain should shed more light on what it is and how the technology can be utilized (and show that it has nothing to do with speculating on BTC).