• Three bottom technologies
    • Encrypting data relationships
    • Data cannot be tampered with
    • Peer-to-peer networks keep data online
  • Core Technology Concepts
    • block
    • Mining and consensus mechanisms
    • Merkle Tree
  • What’s good for blockchain and what’s not?
  • Blockchain application
    • Trading model
    • Identity authentication system
    • Intelligent contract
  • conclusion

I have been studying blockchain recently. I had heard of it before, but only now do I actually understand something about it. My main feeling during the learning process was: “Come on, just explain it clearly!” Most of the material out there is disorganized, or too thin to cover even a single point. As for the people shouting into a microphone that missing out on blockchain means a lifetime of regret, they are liars.

I believe many investment institutions are eyeing blockchain. This year, even AI is not as hot as blockchain. So for any blockchain project, everyone should look carefully at whether it is reliable, whether it is an empty promise, or whether it is genuinely feasible.

Three bottom technologies

When talking about blockchain, it is best to set aside the prices of the various coins. As you know, prices are not something any individual can control. Projects built on blockchain technology, however, can actually be sustainable. Understanding the technology is therefore more practical than speculating on coins.

The word “blockchain” does not quite tell the whole story of this technology. If I had to give it a fully descriptive name, I think it would be “peer-to-peer encrypted tamper-resistant database”.

It is not a database product (such as MySQL or MongoDB), nor a type of database (such as SQL or NoSQL). It is a database architecture that sits one layer above database technology itself and is concerned with how to guarantee the reliability of data and how to keep the database service from going offline. You therefore cannot compare it with any particular named database, and a specific blockchain implementation can even use other databases to help store and retrieve its data.

Encrypting data relationships

In the databases we commonly use, relational or non-relational, there may or may not be a relationship between different records. In a blockchain, however, every piece of data is linked to other data. Even if two records have nothing to do with each other in the real business logic, both always exist on the chain and cannot leave it, so there is always an unbroken path from one data point to another.

A “block” is the final form in which data relationships are represented in a blockchain: every record, whatever information it holds, is eventually placed in (or retrieved from) a block. The blocks themselves are related as a “linked list”. Anyone who programs knows what a linked list is: each piece of data carries an index key that points to the previous piece of data. Any two pieces of data on the blockchain can therefore always be connected through these index keys, and no data can escape this logic.

But the word “blockchain” doesn’t explain the difference between such a data structure and a regular database structure, because linked list data structures like the one described above can be built with regular databases if you want to.

The real value is that blockchain applies cryptography, using existing techniques, to protect these index relationships layer upon layer, so that in the stored data the index keys are not plainly visible but have to be computed. For example, when a block stores a batch of transactions, it stores them in the style of a Merkle tree: each parent node is the double hash of its two child nodes, and the Merkle algorithm ensures that the transaction information cannot be tampered with undetected.

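We do not need to know exactly what the cryptography does here. We only need to understand that the blockchain is full of cryptography (mostly hashing and digital signatures rather than encryption in the narrow sense), which is one of its defining features.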

Data cannot be tampered with

Everyone says the data on a blockchain is immutable. In fact, the data can be changed, but only in your own copy, and once you change a block, that block and every block after it become invalid. Blockchain networks have synchronization logic: the whole network always keeps all nodes on the longest chain, so after you make a change, the network synchronizes and your change is overwritten. This is one aspect of tamper resistance.

More interestingly, blockchain uses cryptography to ensure that access to data is rigorously verified. This verification is almost impossible to forge, which makes the data hard to tamper with. Cryptography alone does not mean immutability; immutability is achieved by combining cryptography with economic principles. It sounds a bit metaphysical for a purely technical thing to be held together by theory, but that is how it works. This is the legend of mining.

Mining is the effort a miner makes to create a block: once a miner “finds the mine”, the miner has the right to create a new block. How do you find it? By running a hash function repeatedly over candidate inputs, counting up from 0, until you find a hash value that meets the difficulty requirement. Finding that value means you have mined the block. The rule that determines who gets the right to keep the accounts is called the “consensus mechanism”. There are many consensus mechanisms, and which one is best for a given blockchain depends on the practical purpose of that blockchain combined with economic principles.

After mining, the miner packages transactions that were broadcast to the network into the block (using Bitcoin as the example). Is a transaction legal? Did whoever initiated it forge it? To confirm that a transaction is legitimate, you have to trace the source of its funds back to earlier blocks. How do you verify that the source is authentic? The earlier block stores the Merkle root hash covering the transaction the funds came from. You can determine whether that transaction is genuine by finding the block that contains it and performing the Merkle check again. The Merkle root hash is obtained by repeatedly hashing all the transactions in the block, so if the transaction is fake you will not arrive at the stored Merkle root hash. Cryptography, in turn, supports the reliability of the data.

In addition, blockchain uses many other cryptographic rules and algorithms. Together they make the whole blockchain follow a set of rules under which tampering with data is extremely costly, so that participants have no incentive to tamper with data, or are even afraid to. Again, this is where the metaphysics comes in.

Peer-to-peer networks keep data online

Suppose the blockchain had no P2P network: it had only cryptography and the chain structure, and it ran on a single server in the centralized model we use today. That would already be quite interesting. But the inventors wanted to go further. Cryptography makes the data tamper-resistant, but I could simply drop an atomic bomb on your machine room. The data would not be tampered with; it would just be gone.

To keep an atomic bomb from destroying the machine room, the inventors built a peer-to-peer network (clients communicate directly with each other rather than through a particular server) into the blockchain. Put simply, every computer in the peer-to-peer network keeps the same data structure (the complete set of blocks, which is the “chain”). The computers are connected to one another and synchronize over the network: when a miner creates a new block, everyone else synchronizes that block to keep their copy of the data structure up to date. So no matter which node on the network is blown up, the other nodes are still alive, and newcomers can synchronize the data from those nodes to their own computers. If you want blockchain data to disappear, you have to blow up the Earth.

This design based on a peer-to-peer network is called “decentralization”. As long as a single node remains on the network, the blockchain data will not disappear.

Even more frightening for politicians, the stored data is completely open: users on any node can look at it at will. Once a node user has synchronized the data, it is theirs to use however they like. Just imagine Taobao one day saying it wants to put its data on a blockchain… It hardly bears thinking about…

Core Technology Concepts

What has been stated above is only the foundation that makes a blockchain a blockchain. The point of this section is, now that you have a blockchain in front of you, to analyze the specific technical points and structures it uses.

block

Blocks have already been mentioned. So what exactly is a block? The block is the main data storage structure of a blockchain. A block has two parts: the block header and the block body. The block header is the highlight of the block.



A diagram of a block structure in a blockchain

A block is a special data structure. Its header contains some fixed information: version (the version of the client software, which changes each time the client is upgraded); block height (the position of this block in the chain); block hash (the hash value of this block, which is what mining produces); previous block hash (the most important field of all, because it is what forms the linked-list structure); timestamp (the time the block was created); difficulty and nonce (both related to mining, covered in more detail later); and Merkle root (the Merkle root hash of the block body, covered briefly later in this article and in more detail in another article). If you build your own blockchain, you can add other information to the block header besides these fields.

The block body is where the specific content is stored. In Bitcoin’s blockchain, the block body holds the transactions made over a period of time. In other blockchains it may hold information other than transactions, but in every case the block body contains the business data that the blockchain exists to record.

In some blockchain implementations, a block can also have a block tail, which holds information added after the block header and block body have been created, such as the length and capacity of the block.

That is a block. The previousHash field in the block header stores the hash value of the previous block, so from this block you can find the previous block, from that one the block before it, and so on until you trace back to the first block in the chain. That is a blockchain.



A schematic of how blocks form chains

As shown above, each block always points to the block before it. Once a block has been generated and another block points to it, it cannot be modified, because any modification means all the subsequent hashes have to be recalculated. This follows from how hash algorithms work: a given hash can only be reproduced from the original content, so different content produces a different hash. If the hash of a block in the middle of the chain changes, the next block no longer points to it, and the chain breaks. If you bring a broken chain to the network, either it will not be recognized and other nodes will not treat you as a legitimate node, or you will have to synchronize again, copying the longest chain from the network to your local machine and overwriting your original chain.
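The linking and the breakage described above can be sketched in a few lines of Python. This is a toy model, not Bitcoin's real block format: the helper names (`block_hash`, `make_block`, `is_valid_chain`), the JSON serialization, and the `bodyHash` field (a stand-in for the Merkle root) are all assumptions made for illustration.

```python
import hashlib
import json

def block_hash(header: dict) -> str:
    """Hash a header dict deterministically (toy serialization via sorted JSON)."""
    encoded = json.dumps(header, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def body_hash(body: list) -> str:
    """Stand-in for the Merkle root: a single hash over the whole body."""
    return hashlib.sha256(json.dumps(body).encode()).hexdigest()

def make_block(previous_hash: str, body: list) -> dict:
    """Create a block whose header commits to the previous block and to its body."""
    header = {"previousHash": previous_hash, "bodyHash": body_hash(body)}
    return {"header": header, "body": body, "blockHash": block_hash(header)}

def is_valid_chain(chain: list) -> bool:
    """Recompute every hash and check that each block points to the one before it."""
    for i, block in enumerate(chain):
        if block["header"]["bodyHash"] != body_hash(block["body"]):
            return False  # body was changed after the header was built
        if block["blockHash"] != block_hash(block["header"]):
            return False  # header was changed after the block was hashed
        if i > 0 and block["header"]["previousHash"] != chain[i - 1]["blockHash"]:
            return False  # link to the previous block is broken
    return True

genesis = make_block("0" * 64, ["genesis"])
second = make_block(genesis["blockHash"], ["alice pays bob 1"])
third = make_block(second["blockHash"], ["bob pays carol 1"])
chain = [genesis, second, third]
print(is_valid_chain(chain))  # True

second["body"][0] = "alice pays bob 100"  # tamper with a block in the middle
print(is_valid_chain(chain))  # False
```

To make the tampered chain valid again, the attacker would have to rebuild the header of the changed block and then every block after it, which is exactly the recalculation the text refers to.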

However, you may have two questions. 1. If blockHash is not a hash of the content, how do you ensure that the information inside the block is not modified? What if I leave blockHash alone and change only the content? 2. What if two blocks point to the same block but have different block bodies?

For the first question, the answer comes from combining mining with the Merkle tree, as explained later. The second situation is actually very common. Someone is certain to mine the block eventually; the question is which miner gets there first. Usually, when a miner finds the block, it broadcasts this to the whole network, and the other miners who have not found it stop. However, because of network delays and other conditions, several miners may find a block at almost the same time, each creating a new block and broadcasting it to the network. This condition is called a “fork”.

When a fork occurs, there are two possible outcomes, and both arise naturally without human intervention. The miner of the next block decides which branch’s last block to use as the previous block. If one branch becomes clearly longer than the other within a few blocks, the longer branch is kept and the shorter one is discarded, and the miners who worked on the shorter branch have worked for nothing. When network nodes synchronize, they select the longest chain, and later miners also build on the longest chain when creating new blocks.
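The “keep the longest chain” rule can be sketched as follows. This is a simplification under stated assumptions: the function name `choose_chain` is hypothetical, chains are plain lists of block labels, and candidate chains are assumed to be already validated. Real Bitcoin nodes compare accumulated proof of work rather than the raw number of blocks.

```python
def choose_chain(local_chain: list, candidate_chains: list) -> list:
    """Return the longest chain among the local one and those offered by peers."""
    best = local_chain
    for candidate in candidate_chains:
        # Only a strictly longer chain replaces ours; on a tie we keep what we have.
        if len(candidate) > len(best):
            best = candidate
    return best

# Two branches that share the blocks "g" and "b1" and then diverge.
branch_a = ["g", "b1", "a2"]
branch_b = ["g", "b1", "b2", "b3"]

print(choose_chain(branch_a, [branch_b])[-1])  # b3
```

A node that was on `branch_a` switches to `branch_b`, and the work spent on block `a2` is wasted, as described above.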

But there is also the case where the miners of the short chain are stubborn, or the two chains cannot settle a winner in a short time, or each chain even ends up with many blocks behind it. A group of people cannot have two ledgers for their transfers; if the ledgers differ, that is a problem. So when this happens, the miners decide to split: one blockchain becomes two, with one of them copying all the previous blocks to become an independent chain. From then on, the two chains are unrelated. The earlier blocks are identical, but the later ones are not. This is known as a “hard fork”, and it is how Bitcoin Cash was born. The new chain inherits the earlier blocks, but its later blocks are determined entirely by the miners who mine that chain. The benefit of a hard fork is that for existing users, one asset suddenly becomes two 😂

As for the block body, it is designed according to the business requirements of the blockchain application. Bitcoin, for example, is designed around a transaction model: all the transaction records in the block bodies put together form one long ledger in which the movement of every penny is written down clearly. If the Red Cross had used a blockchain for donations, there would not have been such a storm. Other blockchain applications do not necessarily use a transaction model, for example a blockchain for recording medical information or one for recording users’ locations… So the discussion of blockchain technology itself stops here, and what follows moves on to technology specific to Bitcoin.

Mining and consensus mechanisms

Mining has been mentioned many times. Put simply, mining is a group of miners competing for the right to create a new block. In the world of cryptocurrencies, the miner who wins that right places a transaction at the top of the block. The money in this transaction comes from nowhere, hence the name “mining reward”, and the amount is quite large, which is why miners fight for the right to keep the accounts. But in non-coin blockchain applications, how can miners be motivated to mine without this reward? This remains a murky question in the blockchain field, and there is no answer so far.

So what exactly is mining as a technical algorithm?

The mining process specifies how a hash must be obtained, and that rule is called the consensus mechanism. All miners run a certain algorithm according to the consensus mechanism to see who first obtains a qualifying result, and the result can easily be checked by anyone (verification is part of the consensus mechanism too). Different blockchains have different consensus mechanisms. The best known today are PoW and PoS, along with other mechanisms derived from them.

To make the mining process clear, we will use only the PoW that Bitcoin follows as the demonstration.

SHA256(SHA256(version + prevHash + merkleRoot + time + currentDifficulty + nonce )) < TARGET

The mining machine evaluates the formula above. As soon as the formula is satisfied (the result is true), the block is mined. Now let us explain the formula.

  1. The miner performs a double SHA256 operation. All the parameters are fields of the block header, but because the block has not been generated yet, they are held temporarily; if the miner wins the right to keep the accounts, they are recorded in the block.
  2. version is the version of the client software the mining machine is running. A version upgrade may affect some parameters, such as expanding the block size from 1 MB to 2 MB, but it does not change the mining algorithm.
  3. prevHash is the hash value of the previous block.
  4. merkleRoot is the Merkle root hash of the transactions temporarily held in the miner’s memory. Merkle is discussed below.
  5. time is the current timestamp.
  6. currentDifficulty is the current difficulty, calculated by the formula currentDifficulty = diff_1_target / TARGET. The diff_1_target in this formula can be treated as a constant in the Bitcoin client, written in compact form as 0x1D00FFFF. It could in principle change, but in practice it does not, so we treat it as a constant. TARGET is discussed in item 8.
  7. nonce is a non-negative integer, and it is the value the miner is looking for. When the mining machine starts to run the double SHA256 algorithm, nonce is 0. If the formula is not satisfied after the first run, nonce is incremented by 1 and the algorithm runs again, and this repeats until the formula is satisfied, at which point the block is mined. So the nonce can be different every time you mine; it is effectively random and comes down to luck. Either way, the nonce value in a block tells you roughly how many operations the miner performed and how hard the block was to find.
  8. TARGET is the value the hash is compared against. It is a specific number. The inventor of Bitcoin wanted a block to be produced every 10 minutes, so TARGET was initially chosen so that currentDifficulty would yield about one block every 10 minutes. In practice it is impossible to guarantee exactly 10 minutes per block: if the computing power drops, block times get longer, and the difficulty has to be adjusted to bring the block time back to around 10 minutes. So TARGET is adjusted once every 2016 blocks (about 2 weeks). If the last 2016 blocks took longer than 2 weeks to generate, TARGET is increased accordingly to lower currentDifficulty and make the next 2016 blocks easier. Otherwise, the difficulty is increased. The adjustment algorithm is not covered in detail in this article. So TARGET stays constant within each 2016-block period but changes from period to period, and its goal is to produce a block about every 10 minutes.
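Although the full adjustment algorithm is out of scope here, the relationship between diff_1_target, TARGET and currentDifficulty can be made concrete with a rough sketch. The function names below are assumptions chosen for illustration. The sketch also simplifies the real client, which measures the timespan slightly differently and rounds the result back into the compact form.

```python
DIFF_1_BITS = 0x1D00FFFF  # compact encoding of diff_1_target

def bits_to_target(bits: int) -> int:
    """Expand the compact form: mantissa * 256 ** (exponent - 3).

    Assumes exponent >= 3, which holds for the value used here.
    """
    exponent = bits >> 24
    mantissa = bits & 0xFFFFFF
    return mantissa * 256 ** (exponent - 3)

DIFF_1_TARGET = bits_to_target(DIFF_1_BITS)

def difficulty(target: int) -> float:
    """currentDifficulty = diff_1_target / TARGET."""
    return DIFF_1_TARGET / target

EXPECTED_SECONDS = 2016 * 10 * 60  # 2016 blocks at 10 minutes each = 2 weeks

def retarget(old_target: int, actual_seconds: int) -> int:
    """Scale TARGET by how long the last 2016 blocks actually took."""
    # Each adjustment is limited to a factor of 4 in either direction.
    clamped = max(EXPECTED_SECONDS // 4, min(actual_seconds, EXPECTED_SECONDS * 4))
    new_target = old_target * clamped // EXPECTED_SECONDS
    # TARGET never rises above diff_1_target (difficulty never drops below 1).
    return min(new_target, DIFF_1_TARGET)

print(difficulty(DIFF_1_TARGET))  # 1.0

# The last 2016 blocks took one week instead of two: TARGET halves, difficulty doubles.
faster = retarget(DIFF_1_TARGET, EXPECTED_SECONDS // 2)
print(difficulty(faster))  # 2.0
```

A smaller TARGET means fewer hash values qualify, so more attempts are needed on average, which is what “higher difficulty” means in practice.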

That is Bitcoin mining: miners watch their machines every day as they evaluate this formula over and over, looking for that nonce, day after day, year after year.
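The search for the nonce can be sketched as a simple loop. This is a minimal illustration, not the real client: Bitcoin serializes the header as 80 binary bytes, whereas this sketch simply concatenates the fields as text, and the `mine` function, its parameter names, and the deliberately easy target are assumptions.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """SHA256(SHA256(data)), as in the formula above."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(version, prev_hash, merkle_root, timestamp, current_difficulty, target,
         max_nonce=2**32):
    """Try nonce = 0, 1, 2, ... until the double hash is below target."""
    for nonce in range(max_nonce):
        header = (f"{version}{prev_hash}{merkle_root}{timestamp}"
                  f"{current_difficulty}{nonce}").encode()
        digest = int.from_bytes(double_sha256(header), "big")
        if digest < target:
            return nonce, digest
    return None  # nonce space exhausted (the "nonce overflow" case)

# A deliberately easy target: about 1 in 2**12 hashes qualifies.
EASY_TARGET = 1 << (256 - 12)

result = mine(1, "00" * 32, "ab" * 32, 1700000000, 1, EASY_TARGET)
nonce, digest = result
print(nonce, hex(digest))
```

Finding the nonce takes thousands of attempts even at this toy difficulty, while checking it takes a single double hash, which is why the result is easy for everyone else to verify.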

When the mining machine finds the nonce, it wins the right to keep the accounts and can take the transactions held in memory and package them into the block body. Because the Merkle root is calculated from the transaction records, a miner cannot tamper with them while packaging. If the miner falsified a transaction, the transactions packaged into the block would no longer produce the Merkle root that was used in mining, and therefore would not produce the hash that was mined.

Of course, real Bitcoin mining is more involved, with issues such as nonce overflow and mining pools…

Merkle Tree

I have mentioned Merkle many times. What is it? A Merkle tree is a data structure, and in Bitcoin it is a binary tree in which each parent node has two children. I previously wrote an article titled “How blockchain uses Merkle Tree to Verify the authenticity of Transactions”, in which I described the principles and problems of the Merkle tree in detail. Here I only give a popular overview, not an in-depth treatment.

Where does the Merkle root come from? It is obtained by running the Merkle algorithm over the records in the block body. In the case of Bitcoin, a block contains n transactions. We first hash each transaction and then group the hashes in pairs, giving n/2 groups. If the number is odd, the last hash is duplicated to complete the final pair. For each group, a double hash is computed:

parentHash = sha256(sha256( hash1 + hash2 ))

That is, the two hashes in a group are concatenated and hashed to produce a new hash, which becomes the parent node of the two. Once you have the parents of all the groups, you apply the same logic to the parents to get their parents, and so on, until you end up with a single root node, which is the Merkle root.
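The pairing and hashing steps can be sketched as follows. This is a minimal illustration, assuming transactions are given as raw bytes; the function names are chosen for this example, and Bitcoin's byte-order conventions are ignored.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """sha256(sha256(data)), matching the parentHash formula above."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(transactions: list) -> bytes:
    """Reduce a list of transactions (bytes) to a single root hash."""
    if not transactions:
        raise ValueError("need at least one transaction")
    level = [double_sha256(tx) for tx in transactions]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: duplicate the last hash
        # Each parent is the double hash of its two concatenated children.
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

root = merkle_root([b"tx1", b"tx2", b"tx3"])
tampered = merkle_root([b"tx1", b"tx2", b"tx3 (edited)"])
print(root.hex())
print(root != tampered)  # True
```

With three transactions, the third hash is paired with a copy of itself, and two rounds of pairing produce the root.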

The Merkle algorithm looks roughly like this, but it is not limited to blockchain. It can be used in any field that needs verification, and it does not have to combine two nodes into one; it can combine any number into one. In essence, Merkle is an algorithm that reduces many records to a single root hash.

As you can imagine, a small change to any transaction that goes into the algorithm produces a different Merkle root, so the block only needs to store the Merkle root. This is the power of cryptography.
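Storing only the root is enough because a single transaction can be checked against it using just the sibling hashes along its path, without the rest of the block. The sketch below illustrates that idea; the `merkle_proof` and `verify` helpers and the proof format are assumptions made for this example, not Bitcoin's wire format.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_proof(leaves: list, index: int):
    """Return (root, proof); proof lists (sibling_hash, sibling_is_left) bottom-up."""
    level = [double_sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: duplicate the last hash
        sibling = index ^ 1  # the other member of this node's pair
        proof.append((level[sibling], sibling < index))
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2  # position of the parent in the next level
    return level[0], proof

def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    """Rebuild the path from the leaf to the root and compare with the stored root."""
    current = double_sha256(leaf)
    for sibling, sibling_is_left in proof:
        pair = sibling + current if sibling_is_left else current + sibling
        current = double_sha256(pair)
    return current == root

txs = [b"tx1", b"tx2", b"tx3", b"tx4", b"tx5"]
root, proof = merkle_proof(txs, 2)  # prove that tx3 is in the block

print(verify(b"tx3", proof, root))         # True
print(verify(b"tx3 forged", proof, root))  # False
```

For five transactions the proof contains only three hashes, and in general its length grows with the logarithm of the number of transactions.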

What’s good for blockchain and what’s not?

Blockchain data has two main characteristics: 1. it is open and transparent, and any node has full rights to view the data; 2. it is difficult to forge or tamper with. Blockchain is therefore well suited to two types of scenario: 1. evidence; 2. supervision. If information on the blockchain is legally recognized, an infringing party will have no answer when evidence on the blockchain is presented. And if tax revenue were moved entirely onto a blockchain, it would be clear where every citizen’s tax money ends up being spent, which may be exactly what some people fear.

But blockchain has two major drawbacks. 1. Mining and the risk of forks mean that after a piece of data is put on the blockchain, it takes a long time before it becomes reliably tamper-resistant. 2. Data is split up and stored across blocks, which makes queries troublesome and greatly reduces efficiency. Blockchain is therefore not suitable for scenarios that require immediacy, whether immediacy of information exchange (such as chat) or immediacy of queries (such as search engines).

Blockchain is not a panacea. Some services are obviously more efficient and cheaper in the centralized model, yet they want to adopt blockchain just to ride the trend. All one can do is wait and see how that turns out. Another concern is whether information on the blockchain, being open, transparent and impossible to delete, will do great damage to personal privacy. Just think: someone who found private photos on a computer posts them to the blockchain… The harm to the people involved… would not fade even after death…

Blockchain application

With the arrival of the hype, blockchain applications are springing up one after another. But at present there are no more than three mature models: 1. Bitcoin; 2. Ethereum smart contracts; 3. BitShares. As for the rest, I will say no more and leave the judgment to the reader.

What is the architecture of a blockchain application? What other technologies need to be built on top of blockchain itself?



Blockchain Application Architecture Diagram (Shao Qifeng et al. “Blockchain Technology: Architecture and Progress”)

Blockchain applications such as Bitcoin and Ethereum are tightly coupled to the blockchain itself. In other words, the blockchain cannot simply be plugged into an application as a relatively independent database module, which is slightly different from the B/S architecture popular today. A blockchain application takes the blockchain apart and fuses the pieces with other components at the application layer to implement its functions.

Let’s take Bitcoin as an example to show what parts a blockchain application has beyond the blockchain itself. Bitcoin is a blockchain-based accounting system. In addition to the blockchain, it includes: 1. a transaction model; 2. an identity authentication system (similar to PKI); 3. smart contracts.

Trading model

The transaction model defines how transaction records are stored in the block body. Every amount of money in Bitcoin can be accounted for precisely because of this model. A real-world bank account only tells you how much the account holds, how much it has spent, how much it has received, and how much it owes. It does not tell you which particular income a particular payment came from. Bitcoin must record exactly that. A transaction has two parts, “inputs” and “outputs”. For example, if you want to transfer 10 BTC, your account must have one or more inputs whose total is at least 10 BTC, and the outputs state who the 10 BTC goes to. But the inputs might add up to 10.5 BTC, just as when you pay for a 70 yuan item with a 100 yuan note and need change. So the outputs sometimes include a transfer back to yourself, which is the “change”.



Schematic diagram of input and output of Bitcoin transactions

In fact, an output of one transaction becomes an input of another, new transaction.
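The input, output and change logic can be sketched as follows. This is a simplified model under stated assumptions: the `Output` class and `build_transaction` function are hypothetical names, owners are plain strings rather than addresses, and transaction fees and signatures are ignored. Amounts are kept in satoshi (1 BTC = 100,000,000 satoshi) to avoid floating-point rounding.

```python
from dataclasses import dataclass

SATOSHI_PER_BTC = 100_000_000

@dataclass(frozen=True)
class Output:
    owner: str
    amount: int  # in satoshi

def build_transaction(unspent: list, sender: str, receiver: str, amount: int):
    """Pick the sender's unspent outputs as inputs and return (inputs, outputs)."""
    inputs, total = [], 0
    for out in unspent:
        if out.owner == sender:
            inputs.append(out)
            total += out.amount
            if total >= amount:
                break  # enough inputs collected
    if total < amount:
        raise ValueError("insufficient funds")
    outputs = [Output(receiver, amount)]
    if total > amount:
        outputs.append(Output(sender, total - amount))  # the "change"
    return inputs, outputs

# Alice holds two earlier outputs worth 6 BTC and 4.5 BTC and wants to pay Bob 10 BTC.
unspent = [Output("alice", 6 * SATOSHI_PER_BTC),
           Output("alice", 45 * SATOSHI_PER_BTC // 10)]
inputs, outputs = build_transaction(unspent, "alice", "bob", 10 * SATOSHI_PER_BTC)

for out in outputs:
    print(out.owner, out.amount / SATOSHI_PER_BTC)
# bob 10.0
# alice 0.5
```

Both of Alice's earlier outputs are consumed as inputs (10.5 BTC in total), and the two new outputs can in turn serve as inputs of later transactions by Bob and Alice.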

Within the block body, these transaction records, along with their inputs and outputs, are faithfully recorded. A Merkle calculation is also performed over them, and the resulting Merkle root is stored in the block header.

Identity authentication system

Since these are transactions, the identities of the parties must be involved. There is an account at each end of a Bitcoin transaction. Bitcoin does not care who is behind an account, but it uses cryptography to ensure that a transaction really was initiated by a given account: the person who initiates the transaction signs the transaction information.

You’ve probably heard of asymmetric cryptography and of public and private keys. The key to a Bitcoin account is the private key. If the private key is lost, you cannot prove that you own the account and cannot sign a transfer out of it. You cannot spend the money in the account, so the money is lost.

So what’s the process of signing? How do you prove that I sent this transaction? How do you prove that the money was transferred to me?

Key, address and wallet

Key: usually the private key with which a user proves ownership of bitcoin assets and protects them. Sometimes the term is used loosely for both the private key and the public key. Here the narrow definition, the private key, applies.

Address: the receiving address for bitcoin. In most cases it is an encoding of a public key (sometimes of a script rather than just a public key).

Wallet: Bitcoin client software that acts as a container for private keys, usually implemented as a structured file or a simple database. Bitcoin wallets contain both private and public key data, although in theory the public key data need not be stored.

In everyday use, a user’s public key and Bitcoin address are often treated as the same thing, but they are not. A Bitcoin address is a string much shorter than a public key, mainly for ease of entry, and it is derived from the public key by hashing. The public key itself is what is used in the asymmetric-cryptography operations.



Bitcoin public key to bitcoin address
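The last step of that conversion, Base58Check encoding, can be sketched with the standard library alone. The earlier step (hashing the public key with SHA-256 and then RIPEMD-160 to get a 20-byte value) is taken as given here, because RIPEMD-160 is not guaranteed to be available in every Python build. The 20-byte `pubkey_hash` below is a made-up placeholder, not a real key hash, and the function name is an assumption for this example.

```python
import hashlib

# Base58 alphabet used by Bitcoin: no 0, O, I or l, to avoid look-alike characters.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check_encode(version: bytes, payload: bytes) -> str:
    """Encode version + payload + 4-byte checksum in Base58."""
    data = version + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is written as the character "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

pubkey_hash = bytes(range(20))  # placeholder for RIPEMD160(SHA256(public key))
address = base58check_encode(b"\x00", pubkey_hash)  # 0x00 = ordinary mainnet address
print(address)
```

The checksum lets a wallet reject most mistyped addresses before any money is sent, which fits the “ease of entry” purpose mentioned above.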

Both the public key and the address are public in the Bitcoin network. Only the private key is kept by the user and must never be given to anyone. When a transaction is initiated, the sender signs it with their own private key and names the receiver’s address (derived from the receiver’s public key) in the output. Everyone else on the network can then use the sender’s public key to verify that the sender really initiated the transaction. The receiver later proves that the money was sent to them by producing a signature with their own private key when they spend it. Strictly speaking, nothing is encrypted or decrypted in this process; the Bitcoin client (wallet) handles the signing and verification.

Intelligent contract

Bitcoin itself already contains a prototype of smart contracts, but its scripting language is weak and cannot express complex contract logic. Ethereum built on this foundation and extended the smart-contract part of the chain, greatly enhancing the programmability of smart contracts.

As mentioned in the discussion of inputs and outputs above, an output is actually the input of another, new transaction. A Bitcoin output does not merely tell the system how much money to send to which address. The output is actually a Bitcoin script that locks the money. To use that money as the input of your own transaction, you have to supply data that makes the script run successfully, typically your public key and a signature made with your private key. Only after the script has run successfully can the money be used as an input. Combined with what was said earlier, only the holder of the matching private key can produce a valid signature, so only the user who corresponds to the Bitcoin address in the output can unlock the script and spend the money.

In this process the “script” is the key. Besides the simple transfer logic above, a script can also implement slightly more complex programs through conditions, for example so that the money can be unlocked only when certain conditions are met. On this basis, Bitcoin’s scripting system can implement functions such as multi-signature and escrow contracts, which are a precursor to smart contracts.
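The idea of a script that only unlocks under a condition can be illustrated with a toy stack machine. This is a sketch under stated assumptions: it supports only four opcodes, it runs the unlocking data and the locking script as one concatenated list, and it leaves out signature checking (OP_CHECKSIG), which would need an elliptic-curve library. The condition used here is a simple hash lock: whoever knows the secret whose SHA-256 hash is in the script can spend the output.

```python
import hashlib

def run_script(script: list) -> bool:
    """Evaluate a list of opcodes (str) and data pushes (bytes) on a stack."""
    stack = []
    for item in script:
        if isinstance(item, bytes):
            stack.append(item)  # data push
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_SHA256":
            stack.append(hashlib.sha256(stack.pop()).digest())
        elif item == "OP_EQUAL":
            stack.append(b"\x01" if stack.pop() == stack.pop() else b"")
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False  # verification failed, script aborts
        else:
            raise ValueError(f"unsupported opcode: {item}")
    # The spend is allowed only if a non-empty value is left on top of the stack.
    return bool(stack) and stack[-1] != b""

secret = b"open sesame"
# Locking script placed in the output: "hash the supplied value and compare".
locking_script = ["OP_SHA256", hashlib.sha256(secret).digest(), "OP_EQUAL"]

print(run_script([secret] + locking_script))    # True
print(run_script([b"wrong"] + locking_script))  # False
```

An ordinary payment to an address follows the same pattern with a different condition: the spender pushes a signature and a public key, and the locking script checks that the public key hashes to the address and that the signature is valid.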

conclusion

I have only just started studying blockchain, so there must be many things I do not fully understand and some things I have misunderstood. To friends who want to get into this field: if you want to understand the technology behind blockchain (without necessarily mastering every technical detail), read some mature and reliable material rather than taking hearsay at face value. I came across a review article on the topic published in a computer science journal, “Blockchain Technology: Architecture and Progress”, which readers may want to read. Once you get the hang of it, you will realize that blockchain has many limitations. It has its good points, and it also has its unnecessary parts.

The original is on my blog: http://www.tangshuang.net/4133.html. If I later find mistakes, I will update the post there, so please keep an eye on my blog.