Hello everyone! Today I'm going to talk about blockchain: what it is, plus a simple code example of how to implement the simplest blockchain to ever exist.
This is a new topic to me, so I thought I'd share my simple understanding. Let's start!
Let's say a company that makes regular financial transactions keeps a traditional accounting ledger to track every transaction it makes. The ledger can be physical or digital, it doesn't matter. Every time the company makes a transaction, it records it in the ledger, and once an entry goes in it cannot be changed or altered in any way. If it were altered, there would be a clear audit trail for that change.
With that example in mind, the transactions added to the ledger are said to be immutable. Immutable means a record cannot be modified, and if it somehow were, the change would leave a trace behind.
This ensures the integrity and accuracy of the financial records and gives a clear audit of all transactions.
Now you might ask: what does all of that have to do with blockchain? Well, a blockchain is essentially a digital ledger for storing immutable information. The data is stored in blocks, and the blocks are connected to form a chain of immutable, linked data entries.
Every block has some data attached to it. The data is hashed (for example using SHA-256) and the hash is stored alongside the data. The hash of the previous block is stored too, and it acts as a link to that block.
Let's go through a simple code example of what I'm trying to explain:
import datetime
import hashlib

class Block:
    def __init__(self, data, prev_hash) -> None:
        self.data = data
        self.prev_hash = prev_hash
        self.timestamp = datetime.datetime.now()
        self.hash = self.calc_hash()

    def calc_hash(self):
        # Hash the data, the previous block's hash and the timestamp together
        sha = hashlib.sha256()
        hash_str = self.data.encode('utf-8') + self.prev_hash.encode('utf-8') + str(self.timestamp).encode('utf-8')
        sha.update(hash_str)
        return sha.hexdigest()
This is a Block class that takes in the data and the hash of the block before it.
It then calculates its own hash by applying SHA-256 to the data, the previous hash and the timestamp, and returns the hex digest for us.
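To get a feel for why a hash works as a fingerprint of the block contents, here is a minimal sketch using only Python's hashlib. The helper name sha256_hex and the sample strings are my own assumptions for illustration, not part of the classes in this post.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hypothetical helper: SHA-256 hex digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = sha256_hex("Pay Alice 10")
same_again = sha256_hex("Pay Alice 10")
tampered = sha256_hex("Pay Alice 11")

print(original == same_again)  # same input, same hash
print(original == tampered)    # one changed character, different hash
print(len(original))           # a SHA-256 hex digest is 64 characters long
```

The same input always gives the same digest, while even a tiny change gives a different one. That is what lets a stored hash reveal that a block's data was modified.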
class BlockChain:
    def __init__(self) -> None:
        self.chain = [Block("Genesis Block", "0")]

    def addBlock(self, data):
        prev_hash = self.chain[-1].hash
        new_block = Block(data, prev_hash)
        self.chain.append(new_block)
This is the BlockChain class. It simply holds a list of blocks that starts out with the very first block in our chain, which is usually called the genesis block.
We have an addBlock function that takes in data as a parameter, looks up the previous block's hash, creates the new block and adds it to the chain.
If we run our example we'll get the following:
block_chain = BlockChain()
block_chain.addBlock("Second Block")
block_chain.addBlock("Third Block")
for block in block_chain.chain:
    print("data:" + block.data)
    print("hash:" + block.hash)
    print("prev_hash:" + block.prev_hash)
    print("timestamp:" + str(block.timestamp))
    print("\n")
# OUTPUT
data:Genesis Block
hash:f3b8d5c6587606a76a01e5c6f73d64cbe060d007a60836640fd96bc77fdfd39e
prev_hash:0
timestamp:2023-06-05 23:28:11.403768
data:Second Block
hash:5c43264a82078ede4b477551a331174c31f5a0542636890beb43cd48a705d522
prev_hash:f3b8d5c6587606a76a01e5c6f73d64cbe060d007a60836640fd96bc77fdfd39e
timestamp:2023-06-05 23:28:11.403787
data:Third Block
hash:7aca056ab9c71d85323248b49f3fc46e76a4cdcd6f6684d3a57826dc4e8e338e
prev_hash:5c43264a82078ede4b477551a331174c31f5a0542636890beb43cd48a705d522
timestamp:2023-06-05 23:28:11.403792
As we can see, each block stores its own hash and the previous block's hash too.
Altering blocks and preventing tampering
Now, if I were to alter any of these blocks, the whole chain would become invalid. The hash of the altered block would change, which means the block after it would have a prev_hash pointing to something that doesn't exist anymore.
But what if we recompute the hash of the altered block and the hashes of all the blocks that come after it? The blockchain as a whole would look valid again, which it shouldn't. This is a security issue.
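Both cases can be tried out with a small sketch. It repeats a stripped-down Block class so it runs on its own, and adds a validation helper, is_chain_valid, which is a name I made up for illustration. It assumes a chain is valid when every stored hash matches the block's contents and every prev_hash matches the previous block's hash.

```python
import datetime
import hashlib


class Block:
    def __init__(self, data, prev_hash):
        self.data = data
        self.prev_hash = prev_hash
        self.timestamp = datetime.datetime.now()
        self.hash = self.calc_hash()

    def calc_hash(self):
        payload = self.data + self.prev_hash + str(self.timestamp)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_chain_valid(chain):
    """Hypothetical helper: check every stored hash and every link."""
    for i, block in enumerate(chain):
        if block.hash != block.calc_hash():
            return False  # contents no longer match the stored hash
        if i > 0 and block.prev_hash != chain[i - 1].hash:
            return False  # link to the previous block is broken
    return True


chain = [Block("Genesis Block", "0")]
for data in ("Second Block", "Third Block"):
    chain.append(Block(data, chain[-1].hash))

valid_before = is_chain_valid(chain)

# Tamper with the second block without touching its stored hash
chain[1].data = "Tampered Block"
valid_after_tamper = is_chain_valid(chain)

# The attacker now recomputes every hash from the tampered block onward
for i in range(1, len(chain)):
    chain[i].prev_hash = chain[i - 1].hash
    chain[i].hash = chain[i].calc_hash()
valid_after_rehash = is_chain_valid(chain)

print(valid_before, valid_after_tamper, valid_after_rehash)  # True False True
```

The last True is the problem: without any extra cost attached to producing a hash, rewriting history is as cheap as rehashing a few blocks.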
To mitigate this, blockchains use something called Proof of Work.
Proof of Work is a way of validating new blocks before they get added. You set a rule for how the generated hash must start (for example, that it MUST start with a certain word or number), and finding a hash that satisfies the rule usually takes a lot of computing power, depending on the difficulty you set. You can make guessing, or in proper terms 'mining', harder by making the required prefix longer. Mining is the process of searching for a hash that satisfies the rule, usually in exchange for a reward. Because an attacker would have to redo that work for the altered block and for every block after it, recomputing the whole chain becomes expensive.
One last thing before showing a code example: hashing the same block contents always produces the same hash, so retrying would never give a different result. To get around this a nonce is added, which is short for 'number used only once'. This number gets hashed along with the other data and is incremented after every attempt. Let's update our code accordingly.
class Block:
    def __init__(self, data, prev_hash) -> None:
        self.data = data
        self.prev_hash = prev_hash
        self.timestamp = datetime.datetime.now()
        self.nonce = 0
        self.difficulty = 3
        self.hash = self.calc_hash()

    def calc_hash(self):
        while True:
            # Use a fresh hash object for every attempt. Reusing one object would
            # mix in the bytes of all earlier attempts, so the final hash could not
            # be reproduced from the block fields and the nonce alone.
            sha = hashlib.sha256()
            nonce_str = str(self.nonce)
            hash_str = self.data.encode('utf-8') + self.prev_hash.encode('utf-8') + str(self.timestamp).encode('utf-8') + nonce_str.encode('utf-8')
            sha.update(hash_str)
            block_hash = sha.hexdigest()
            if block_hash[:self.difficulty] == "0" * self.difficulty:
                return block_hash
            self.nonce += 1
We update our Block class here with the nonce and difficulty. We then call calc_hash, which has an infinite loop that keeps rehashing until we reach the specified difficulty. In this case, we require the first difficulty characters of the hash string to all be 0. Increasing the difficulty makes mining much harder.
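To see how quickly the work grows, here is a rough, standalone sketch. The mine function and the fixed "Second Block" input are assumptions for illustration, and it hashes only the data plus the nonce to keep things short. Each hex character can take 16 values, so on average a prefix of d zeros should need about 16 to the power of d attempts.

```python
import hashlib


def mine(data: str, difficulty: int):
    """Hypothetical helper: try nonces until the hash starts with `difficulty` zeros."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256((data + str(nonce)).encode("utf-8")).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1


results = {}
for difficulty in range(1, 5):
    nonce, digest = mine("Second Block", difficulty)
    results[difficulty] = (nonce, digest)
    # nonce starts at 0, so the number of attempts is nonce + 1
    print(f"difficulty={difficulty} average~{16 ** difficulty} attempts={nonce + 1} hash={digest}")
```

The actual attempt counts depend on the input and will not match the averages exactly, but they should grow roughly 16 times with every extra zero.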
Now we run the code below again.
block_chain = BlockChain()
block_chain.addBlock("Second Block")
block_chain.addBlock("Third Block")
for block in block_chain.chain:
    print("data:" + block.data)
    print("hash:" + block.hash)
    print("prev_hash:" + block.prev_hash)
    print("timestamp:" + str(block.timestamp))
    print("\n")
data:Genesis Block
hash:000e58859fcb5a579c5cf0a1ffa0e5e67877516342797a3b2dbc038464700ebd
prev_hash:0
timestamp:2023-06-06 00:10:36.948146
data:Second Block
hash:00048e8c35c8a80125cce8acc4e7ca5abbeb37ab4da8bdfb30318081485cfd80
prev_hash:000e58859fcb5a579c5cf0a1ffa0e5e67877516342797a3b2dbc038464700ebd
timestamp:2023-06-06 00:10:36.953128
data:Third Block
hash:000f8b59fd504bce2f483115a258fe3c67147f3972b007f85db27194eec3ff24
prev_hash:00048e8c35c8a80125cce8acc4e7ca5abbeb37ab4da8bdfb30318081485cfd80
timestamp:2023-06-06 00:10:36.954252
As we can see, every hash starts with three zeroes, which is what we wanted. This controls the addition of blocks to our blockchain.
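Finding a valid nonce is slow, but checking one is a single hash. Here is a minimal sketch of how such a check could look. The names block_digest and verify_block are hypothetical, the timestamp is just a sample string, and it assumes each mining attempt hashes only the block fields plus the current nonce.

```python
import hashlib


def block_digest(data, prev_hash, timestamp, nonce):
    """Hypothetical helper: hash the block fields together with the nonce."""
    payload = data + prev_hash + timestamp + str(nonce)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify_block(data, prev_hash, timestamp, nonce, claimed_hash, difficulty=3):
    """Recompute the hash once and check it against the claim and the difficulty rule."""
    recomputed = block_digest(data, prev_hash, timestamp, nonce)
    return recomputed == claimed_hash and claimed_hash.startswith("0" * difficulty)


# Mine a block the slow way (sample values, chosen only for this sketch)
timestamp = "2023-06-06 00:10:36.948146"
nonce = 0
while not block_digest("Genesis Block", "0", timestamp, nonce).startswith("000"):
    nonce += 1
mined_hash = block_digest("Genesis Block", "0", timestamp, nonce)

ok = verify_block("Genesis Block", "0", timestamp, nonce, mined_hash)
forged = verify_block("Forged Block", "0", timestamp, nonce, mined_hash)
print(ok, forged)  # True False
```

The forged block fails because its data no longer hashes to the claimed value, so whoever changed it would have to mine a new nonce.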
Lastly, many blockchains nowadays use a different block validation concept called Proof of Stake, which I'll explain in a future post (when I understand it).
Summary
Blockchain is a good fit for immutable data that needs to be stored and must not be changed in any way, shape or form.
Blockchain is usually decentralized. This means several nodes must reach consensus when adding a new block, verifying its integrity by rehashing the data and checking it against the defined criteria.
It's extremely difficult to tamper with block data inside the blockchain, which ensures data integrity.
That's it for this one! Hope you learned something new today and, as always, till the next one!