Author: Guo Yu
ZkPoD principle interpretation series (I) : this article is for the readers who have a certain basis in cryptography, or are interested in cryptography. Although there are a lot of mathematical formulas in the paper, they are relatively simple and easy to understand.
Introduction: What is zkPoD?
ZkPoD realizes decentralized “zero-knowledge conditional payment” and supports zero-trust fair transaction of GB data. For the concept of “zero-knowledge conditional payment” see this overview article “zkPoD: Blockchain, Zero-Knowledge Proof and Formal Verification for Fair Transactions without Intermediaries and Zero Trust”. ZkPoD is a new solution to achieve ZKCP goals. At present, zkPoD supports up to GB data, low TPS common chain and high TPS alliance chain. Support for both binary and tabular data with internally rich types and structures. Compared with the traditional “trusted third party”, zkPoD uses blockchain to act as a “Trustless third party” to achieve “zero-trust fair transaction”.
ZkPoD is also a basic protocol to realize two-way flow of data and value
ZkPoD open source code and more documentation please see: github.com/sec-bit/zkP…
Proof-of Delivery (PoD) protocol
PoD is the core protocol of zkPoD system. PoD protocol realizes the atomic exchange of “data” and “Token” by borrowing blockchain smart contract, and guarantees fairness between buyer and seller. PoD does not adopt a single zkSNARK scheme to implement atomic exchange like ZKCP[1], but instead utilizes classic cryptography schemes such as Pedersen Commitment and Schnorr Protocol. This makes the PoD more efficient and easy to expand. The PoD protocol will also use formal proof to build a solid “trust foundation.”
This article introduces a minimalist PoD protocol, POD-Tiny. This protocol simplifies many details and is not practical, but it can help readers quickly understand the principles and challenges of PoD.
Let’s say I’m the seller and you need to buy a data file from me.
- Step 1: I encrypt the “data” and send it to you
- Step 2: You hand over your Token to a blockchain “smart contract”
- Step 3: I exchange the “key” for the “Token” in the “smart contract”, and then you can read the “key” from the smart contract for “data” decryption
Isn’t that easy? You are smart enough to be highly skeptical that there is something wrong with the process.
Key issues in “fair dealing”
- Key issue (1) : The encrypted data you receive is exactly what you want
- Key question (2) : What if you run away without paying after receiving encrypted data
- Key problem (3) : The key I show to the smart contract must be a real key, or I can’t get the Token
- Key issue (4) : After I show the real key, I must be able to get the Token
Let’s take a closer look at how POD-Tiny solves these problems.
Characteristics of locked data: Authenticator
For key question (1), we need an anchor point for what you want the data to be. For simplicity, let’s assume we’ve agreed on a unique label, or signature, for a data file. Then the data you buy needs to correspond to that label.
In general, people like to Hash features of a string, such as computations
h = MD5("hello,zkPoD!")
Copy the code
Let’s look at the string “Hello,zkPoD!” There are 12 bytes in total, or 96 bits. We can then convert these 12 bytes to an integer in a finite field (we are assuming that the size of the finite field is close to 256 bits). So we can encode this string as an integer, let’s just use a symbol for that integer, let’s say m.
We generate the “promise” of this data by the following operation.
A = m*G
A Commitment, also known as a Commitment, corresponds to the data and hides the value of the data. The A here is called “Authentication information” Authenticator in the zkPoD system. And G here is a generator of a cyclic group of elliptic curves.
“Authenticator” can be made public to everyone, and we don’t have to worry about disclosing raw data information. That’s because it’s hard to figure out m from A. This inverse operation is a logarithmic operation over a finite field. If the finite field is large, this logarithm operation is very, very difficult. This is often referred to as the “discrete logarithm problem” hypothesis. All theoretical details aside, let’s just know that Authenticator can be trusted to anyone without fear of m being reverse-cracked.
Why should “authentication information” take the form of a “promise” instead of the well-known Hash operation? This is because the promise is additive homomorphism. The so-called homomorphism property can be understood as follows: a certain operation of plaintext data can be homomorphically mapped to the ciphertext operation. Suppose we have three data plaintext, m1, m2, and m3, where M1 = m2 + m3.
Their authenticators are calculated as follows:
A1 = m1*G, A2 = m2*G, A3 = m3*G,
We can compute A1, right? = A2 + A3
To verify m1? M2 + m3. As you can see, even if a melon eater knows A1, he can’t figure out M1 backwards. But he still knows that M1 is equal to the sum of the other two numbers, even though he has no idea what those three numbers are.
Note: The addition here is modular addition, a + b is short for a + b mod P. For legibility, subsequent addition, subtraction, multiplication and division conventions are operations over finite fields.
The rest is simple. In PoD protocol, we can choose a random number K as a one-time key to encrypt M and calculate E(m) = K + M. E(m) is encrypted data. I can also send you K equals K times G, so you have three things in your hand, A, K, and E of m. You can then use the following formula to “homomorphically” verify that the encrypted E(m) is indeed the ciphertext of data M:
E(m)*G ? = K + A
And, from the formula above, you know one more key: the key is a number associated with K. Even though you don’t know anything about M or the key K. This information is also the key to solving the key problem (3)**.
To sum up:
Homomorphism allows the buyer to verify that the data meets certain conditions in the case of encryption
Recall the key question (2) :
- Key question (2) : What if you run away without paying after receiving encrypted data
The core method to solve this problem is “zero knowledge proof”. If the buyer can’t analyze any extra information from the encrypted data, the seller’s interests will not be harmed and the key problem will be solved (2). In short, if the encrypted data is zero knowledge, it is not afraid of buyers to take the encrypted data to run away without money. The so-called “zero knowledge”, we can be colloquially understood: buyers get encrypted data, like a bunch of random numbers, without any information. How do you get to zero knowledge? Pod-tiny adopts the idea of the classic Schnorr protocol.
Interstitial Science: Schnorr Protocol and zero Knowledge
The Schnorr protocol is a very classic example of a textbook, and I’m going to walk you through it very quickly. One of the uses of Schnorr protocol is for identity authentication. It is a two-party security protocol in which Alice, the “prover”, proves to Bob, the “verifier”, that she has a private key corresponding to a public key.
First, Alice generates a pair of “public and private keys”, (PK, SK). Bob then holds Alice’s public key PK. When Alice wants to prove her identity to Bob, they will do so through a “three-step interactive protocol” : prove that Alice has the private key SK. If Bob accepts the proof, then Bob thinks that the person who proves the private key is Alice. Here is a brief description of the protocol:
Mmbiz. Qpic. Cn/mmbiz_png/p…
sk = a, pk = a*G
Public input: PK = a*G
Alice’s private input: sk = a
Step 1: Alice selects a random number r and sends the “promise” r = r*G to Bob
Step 2: Bob returns a random number C as the challenge number
Step 3: Alice calculates z = r + C * a, and then sends z to Bob, who verifies it by the following formula:
Verification formula: Z *G? = R + c*PK = r*G + c*a*G
This Schnorr protocol has three properties:
- Completeness Completeness
- Reliability Special Soundness
- Special Honest Verifier zero-knowledge
The third property is “zero knowledge”, which guarantees that Bob never gets any information about the private key during protocol interaction.
Note: Strictly speaking, Schnorr protocol is not “full-ZK”, only “HVZK” is guaranteed, which is a relatively weak zero-knowledge property. But forget about that for a moment. The Schnorr protocol can be upgraded to “full-ZK” with a few tricks.
Pod-tiny: A simple PoD protocol
If you already remember the details of the Schnorr protocol in general, let me show you a protocol called POD-Tiny.
Protocol description: Suppose Alice has A data plaintext M, and Bob has an Authenticator (A) for this data. There is also A “Trustless third party”, we will call her Julia. Remember: she’s a smart contract.
Protocol:
Pre-show props: M, A,G, a random number generator rand()
Role:
- Alice: Have data
m
, one-time keyk <-rand()
- Bob: have
A
- Julia: N/A
Steps:
Step 1: Alice generates a random number, r <-rand(), and then sends Bob two numbers K= K *G and r =r*G
Step 2: Bob generates a random number C <-rand() and sends it to Alice
Step 3: Alice calculates two numbers E(m) = k + m, z = r + C * k, and sends them to Bob. Of the two numbers, E(m) is the ciphertext encrypted with the one-time key K, and Z is the ciphertext of the key.
Note: What? Ciphertext of the key? Yes, here Alice encrypts K with the random number R generated in the first step and the challenge number C provided by Bob in the second step to get Z.
Step 4: Bob verifies the ciphertext E(m) of the received data (Formula (1)) and the ciphertext of the key (Formula (2)) :
- Verification Formula (1) :
E(m) * G ? = A + K
- Verification Formula (2) :
z*G ? = R + c*K
Note: In step 4, Bob needs to figure out two things: first, can the ciphertext data (E(m)) passed to him correspond to the data anchor point (A); Then the ciphertext data (E(m)) is encrypted by an unknown key X, and the ciphertext of this “unknown key” should be equal to the “key ciphertext” (z) sent by Alice in step 3. In this case, at some point in the future, if Bob gets the “key ciphertext key” (r), he can successfully get the plaintext data (m) by doing two decryption operations. The two decryption actions are as follows: First, Bob decrypts the ciphertext Z with the “key ciphertext key” R and the challenge number C to obtain the data key K, and then decrypts the data ciphertext E(m) with the data key to obtain the plaintext M.
Note again: The above protocol steps 1 through 4 are actually very similar to the Schnorr protocol. It’s just replacing a with a one-time key K. However, another difference is that K = K *G is equivalent to the public key in the original Schnorr protocol, which is not initially sent to Bob, but is sent to Bob along with R in the first step of the protocol. However, taken as a whole, the four-step protocol is an extension of the Schnorr protocol. Of course, this is not the end of the story. Next, blockchain comes into play. Bob, Alice and Julia start interacting.
Step 5: If Bob verifies formula (1) and formula (2) in step 4, it means that the encrypted data sent by Alice is correct. Bob should send Julia a “delivery-receipt”. In this step, Bob needs to save the “Token” to Julia.
Julia: This receipt is to tell Bob that he received the encrypted data, but the key hasn’t arrived yet. Julia needs to receive and verify the key for him. So what are the credentials for validation? This receipt is the R that Alice sent to Bob in the first step of the protocol.
Step 6: Alice shows Julia “key ciphertext key”, that is, R. Julia check the key formula below. If the verification is successful, Julia can transfer Bob’s Token to Alice.
- Verification Formula (3) :
R ? = r*G
Let’s see how the key problem (3) is solved.
- Key problem (3) : The key Alice shows to the smart contract must be a real key; otherwise, the Token cannot be obtained
In the first step of the protocol, Alice sends Bob a “promise” of a “key of a key”. Then in step 5 of the agreement, Bob hands over R to Julia; In step 6, Alice makes good on her promise to reveal the corresponding R. If Alice presents an incorrect value, Julia will immediately find that the statement (3) is not true.
One more:
- Key issue (4) : After Alice presents the real key, she must be able to get the Token
In step 6 of the agreement, Julia tests formula (3). In the case that Alice shows the correct r, if the equation is not true, then there are only two cases :(1) Julia deliberately makes trouble, and (2) the value of r is incorrect. For the former case, it is necessary to ensure that Julia’s contract code does not have vulnerabilities and functions normally, which needs to be solved by “formal verification” in addition. In the latter case, Alice is required to check Julia’s internal state in step 6, and whether the R submitted by Bob in step 5 is a correct value. Please note here: Julia is a public smart contract, any data she holds is publicly visible, and any of her internal states and calculations are publicly visible.
Protocol security and fairness analysis
If we don’t consider multiple transactions, POD-Tiny is a “fair” trading protocol. Let’s take a look at why this agreement is fair.
Let’s first consider what methods Alice uses to cheat:
- A1. Fake data
m'
Encrypt it and pass it to Bob - A2. Key used to encrypt data
k
“, but encrypts the key using a different onek'
And,k <> k'
- A3. Show Julia a fake ciphertext key
r'
Analysis: If Alice uses cheating strategy A1, Bob can find it when checking formula (1); If Alice uses cheating strategy A2, Bob can find it by calculating formula (2). If Alice uses cheating strategy A3, Julia can find it by formula (3).
Let’s consider how Bob cheats:
- B1. Bob is getting encrypted data
E(m)
After that, you pull out of the protocol and try to break the ciphertext - B2. Bob shows Julia a wrong “delivery receipt” after verifying the encrypted data.
- B3. Bob account does not have enough tokens
If Bob uses cheat B1, then Bob can’t get any information from the encrypted data because the first three steps of the protocol are “Honest Verifier zero-knowledge”. If Bob adopts means B2, Alice can check whether the “data delivery receipt” R in Julia’s hand is the same as the one she sent to Bob in the first step in step 6. Once Bob submits the wrong receipt, Alice can directly withdraw from the agreement and refuse to present the key. Similarly, if Bob uses B3, Alice can check whether Bob’s Token stored in Julia is enough at step 6, if not, she can directly withdraw from the agreement.
Finally, is it possible for Julia to cheat? Julia is a smart contract, and any behavior and internal state of her can be read by anyone. Therefore, information may be leaked through Julia, which is disadvantageous to Alice or Bob. But please note that Julia does not actually contact any information related to the data plaintext M, so she will not disclose the information of M from the chain. Julia only had access to two pieces of information, R and R.
Compress to the simplest protocol
Let’s count the interaction steps of the above protocol. There are five steps in total, namely, Alice interacts with Bob three times, Bob interacts with Julia once, and Alice interacts with Julia once. The security protocol has a tool called fiat-Shamir Heuristic transformations that “compress” the first three interactions in the POD-Tiny protocol directly into one.
Before compression:
Step | Interaction | Info |
---|---|---|
The first step | Alice => Bob | (K, R) |
The second step | Bob => Alice | c |
The third step | Alice => Bob | (E(m), z) |
The fourth step | Bob => Julia | R |
Step 5 | Alice => Julia | r |
After the compression:
Step | Interaction | Info |
---|---|---|
The first step | Alice => Bob | (K, R, E(m), z) Here,c = Hash(A || K || R) |
The second step | Bob => Julia | R |
The third step | Alice => Julia | r |
The main difference we found is that in the compressed POD-Tiny, the number of challenges is no longer generated by Bob, but by Alice. Now, you might wonder, is that unfair to Bob? This is the equivalent of a three-step Schnorr protocol compressed into one step. Here’s the conclusion: the compressed agreement remains zero-knowledge and still fair to both parties. The reason is that the protocol before compression can prove HVZK(Honest Verifier zero-knowledge); The compressed protocol can prove NIZK (non-interactive zero-knowledge). However, the comparison of security before and after compression will be Subtle, which will not be expanded here.
Compressed, the resulting protocol is incredibly compact:
Moving toward practical challenges: security and performance
The simplest protocol, POD-Tiny, is only the first step in a long journey, and there are many problems when it comes to turning theory into code in the complicated real world. These issues intertwine with each other and in turn affect the theoretical design of the protocol.
- How to support data over 1MB in length, even up to GB
- How to effectively reduce the overhead of on-chain validation computation
- How to support Ethereum and be immune to various security issues on Ethereum
- How to support complex homomorphic computation of data
In terms of safety and performance, zkPoD has made quite a few engineering improvements and innovations. If you are interested, please pay attention to our subsequent articles.
Write in the last
What exactly can blockchain do? I’ve seen a lot of pretty “pessimistic” talk over the last year, and I think the PoD agreement should give those skeptics some inspiration. Blockchain plays a key role as a “Trustless third party” in the PoD protocol and makes it incredibly compact, which we didn’t expect.
reference
- [1] Maxwell, G. “Zero Knowledge contingent Payment. 2011.” URl: en.bitcoii.it /wiki/Zero_K… (visited on 05/01/2016) (2016).
- [2] Greg Maxwell. “The first successful Zero – Knowledge Contingent Payment.” bitcoincore.org/en/2016/02/…